linux-kernel.vger.kernel.org archive mirror
* [2.5.68] Scalability issues
@ 2003-05-04 17:39 Felix von Leitner
  2003-05-04 18:16 ` Michael Buesch
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Felix von Leitner @ 2003-05-04 17:39 UTC (permalink / raw)
  To: linux-kernel

I am running several scalability tests on Linux 2.5.68, all related to
network servers.  My test box is my notebook, Pentium 3 @ 900 MHz, 256
MB RAM.  I run a small http server on it that supports the following
models:

  - epoll
  - sigio
  - poll
  - pthread_create for each connection
  - fork for each connection

epoll works, sigio works as well, poll is slow but works.

pthread_create (using glibc 2.3.2 and gcc 3.2.3, by the way) fails after
creating several threads.  I run different scalability benchmarks; one
fetches the same web page over and over again and sees how many fetches
per second it can get, the other does the same but with a rapidly
increasing number of "idle" background connections to the server doing
nothing but stealing resources.

I expected fork and pthread_create to fail sooner or later.  Here is
what happened (dmesg):

  Out of Memory: Killed process 51 (sshd).
  artillery-fork: page allocation failure. order:0, mode:0x20
    [this message comes about 50 times]
  Out of Memory: Killed process 52 (zsh).
  spurious 8259A interrupt: IRQ7.
  Out of Memory: Killed process 49 (sshd).
  alloc_area_pte: page already exists
  alloc_area_pte: page already exists
  alloc_area_pte: page already exists
  alloc_area_pte: page already exists
  VFS: Close: file count is 0
  VFS: Close: file count is 0

These don't look very good, but it gets better:

  Unable to handle kernel NULL pointer dereference at virtual address 00000017
  printing eip:
  c014c95b
  *pde = 00000000
  Oops: 0000 [#1]
  CPU:    0
  EIP:    0060:[<c014c95b>]    Tainted: P
  EFLAGS: 00010286
  eax: d241b000   ebx: 00000003   ecx: c037c450   edx: 00000003
  esi: 00000405   edi: c3194620   ebp: 00000021   esp: c379bf60
  ds: 007b   es: 007b   ss: 0068
  Process artillery-fork (pid: 6966, threadinfo=c379a000 task=c37bad80)
  Stack: c0341e60 c3194620 07ffffff 00000405 c3194620 00000021 c011e21c 00000003
	c3194620 c3194620 00000000 4003c904 c37bad80 c011edc4 c319d780 c319d780
	c379a000 c014d557 00000000 00200001 4003c904 c379a000 c011f0b3 00000000
  Call Trace: [<c011e21c>]  [<c011edc4>]  [<c014d557>]  [<c011f0b3>]  [<c0109279>]
  Code: 8b 43 14 85 c0 0f 84 9a 00 00 00 8b 43 10 31 ed 85 c0 74 45

This should not happen, right?  But wait, there's more!

  <3>alloc_area_pte: page already exists
  alloc_area_pte: page already exists
  alloc_area_pte: page already exists
  alloc_area_pte: page already exists

  Unable to handle kernel paging request at virtual address d209bd38
  printing eip:
  c014e30f
  *pde = 09a1b067
  *pte = 00000000
  Oops: 0000 [#2]
  CPU:    0
  EIP:    0060:[<c014e30f>]    Tainted: P
  EFLAGS: 00010287
  eax: d209a000   ebx: 00000000   ecx: 0000074e   edx: c23e4000
  esi: c21e8548   edi: c23e5f64   ebp: 00000000   esp: c23e5f24
  ds: 007b   es: 007b   ss: 0068
  Process artillery-fork (pid: 6894, threadinfo=c23e4000 task=c23e3980)
  Stack: 00000000 c015fabd c0124bc0 c23e3980 c21e8540 c23e5f60 c23e5f64 00000000
	c015fb9a 00000001 c21e8548 c23e5f60 c23e5f64 c23e4000 c23e4000 00000000
	00000000 bffff7c8 00000000 c21e8540 00000001 c015fd60 00000001 c21e8540
  Call Trace: [<c015fabd>]  [<c0124bc0>]  [<c015fb9a>]  [<c015fd60>]  [<c015df65>]  [<c015f0f0>]  [<c0109279>]
  Code: 8b 1c 88 85 db 74 03 ff 43 14 ff 4a 14 8b 42 08 83 e0 08 75

Mhh.  What about this one, then?

 <6>note: artillery-fork[6894] exited with preempt_count 1

  Unable to handle kernel paging request at virtual address d209a000
  printing eip:
  c011e1fd
  *pde = 09a1b067
  *pte = 00000000
  Oops: 0002 [#3]
  CPU:    0
  EIP:    0060:[<c011e1fd>]    Tainted: P
  EFLAGS: 00010246
  eax: d209a000   ebx: ffffffff   ecx: 00000000   edx: 00000000
  esi: 00000000   edi: c22f8380   ebp: 00000001   esp: c23e5de0
  ds: 007b   es: 007b   ss: 0068
  Process artillery-fork (pid: 6894, threadinfo=c23e4000 task=c23e3980)
  Stack: 00000000 c23e3980 c22f8380 00000000 c23e3980 c23e3980 c011edc4 c2299e20 
	c2299e20 00001aee 00000001 c23e4000 c23e5ef0 c23e3980 0000009b c010a2fc 
	0000000b c033dc76 00000000 00000002 00000000 00000000 c0117e3a c033dc76 
  Call Trace: [<c011edc4>]  [<c010a2fc>]  [<c0117e3a>]  [<c029ca2d>]  [<c029cb34>]  [<c010b7ed>]  [<c011c604>]  [<c0117cf0>]  [<c0109cd5>]
    [<c012007b>]  [<c014e30f>]  [<c015fabd>]  [<c0124bc0>]  [<c015fb9a>]  [<c015fd60>]  [<c015df65>]  [<c015f0f0>]  [<c0109279>] 
  Code: 87 14 b0 85 d2 75 0c 46 d1 eb 75 e7 eb bd 90 8d 74 26 00 89 

Obviously, the kernel gets more and more confused.  Finally, another one of
these:

 <6>note: artillery-fork[6894] exited with preempt_count 1

This was only from the fork tests; the pthread tests wouldn't let me
create enough threads to consume too much memory.  How did the Red Hat
people run those tests with 100000 threads?  I increased the hard
limits for process count and installed the latest glibc and gcc, but I'm
apparently still stuck with the old LinuxThreads.  pthread_create fails
on me after an apparently random number of threads have been created;
it has ranged from 5 to 600 to 1200 so far.  I was not able to even get
close to the 10000 connections I was aiming for!
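
For anyone trying to reproduce this, the limits in play can be read back from inside the test process. This is only a rough sketch in Python; the helper name `show_limits` is made up for illustration, and nothing in this thread establishes that any of these limits is what makes pthread_create fail.

```python
import resource

def show_limits():
    """Return (soft, hard) limits relevant to fork/thread-per-connection tests."""
    names = {
        # Maximum number of processes for this user (on Linux, threads count too).
        "RLIMIT_NPROC": resource.RLIMIT_NPROC,
        # Maximum number of open file descriptors, i.e. connections per process.
        "RLIMIT_NOFILE": resource.RLIMIT_NOFILE,
        # Main-thread stack size; some thread libraries derive their
        # default per-thread stack size from it.
        "RLIMIT_STACK": resource.RLIMIT_STACK,
    }
    return {name: resource.getrlimit(res) for name, res in names.items()}

if __name__ == "__main__":
    for name, (soft, hard) in show_limits().items():
        print(f"{name}: soft={soft} hard={hard}")
```

A value of -1 (`resource.RLIM_INFINITY`) means "unlimited". Raising the hard limit alone is not enough; the soft limit is the one that is enforced.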

So I decided that threads don't work for many concurrent connections and
settled for the other benchmark of mine, that just tries to fetch 10000
times the same page with 10 concurrent connections.  It turns out that
pthread_create will still fail after a while.  All my threads are
created detached, because I considered the "thread zombies" might be a
problem.  ps shows that there are only 4 or so threads running when
pthread_create fails.  How can that be?  The system is otherwise idle.

Felix

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [2.5.68] Scalability issues
  2003-05-04 17:39 [2.5.68] Scalability issues Felix von Leitner
@ 2003-05-04 18:16 ` Michael Buesch
  2003-05-04 19:44 ` Felix von Leitner
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 13+ messages in thread
From: Michael Buesch @ 2003-05-04 18:16 UTC (permalink / raw)
  To: Felix von Leitner; +Cc: linux kernel mailing list

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sunday 04 May 2003 19:39, Felix von Leitner wrote:
>   Unable to handle kernel NULL pointer dereference at virtual address 00000017
>   printing eip:
>   c014c95b
>   *pde = 00000000
>   Oops: 0000 [#1]
>   CPU:    0
>   EIP:    0060:[<c014c95b>]    Tainted: P
>   EFLAGS: 00010286
>   eax: d241b000   ebx: 00000003   ecx: c037c450   edx: 00000003
>   esi: 00000405   edi: c3194620   ebp: 00000021   esp: c379bf60
>   ds: 007b   es: 007b   ss: 0068
>   Process artillery-fork (pid: 6966, threadinfo=c379a000 task=c37bad80)
>   Stack: c0341e60 c3194620 07ffffff 00000405 c3194620 00000021 c011e21c 00000003
>          c3194620 c3194620 00000000 4003c904 c37bad80 c011edc4 c319d780 c319d780
>          c379a000 c014d557 00000000 00200001 4003c904 c379a000 c011f0b3 00000000
>   Call Trace: [<c011e21c>]  [<c011edc4>]  [<c014d557>]  [<c011f0b3>]  [<c0109279>]
>   Code: 8b 43 14 85 c0 0f 84 9a 00 00 00 8b 43 10 31 ed 85 c0 74 45

Could you please run ksymoops on these oopses?
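
What that step boils down to is mapping each raw address to the nearest preceding symbol in System.map. Below is a minimal sketch of the lookup in Python, not ksymoops itself; the function `resolve` and the three-entry table are illustrative, with start addresses back-computed from the symbol+offset pairs in the ksymoops output posted later in this thread.

```python
import bisect

# (start address, name) pairs sorted by address, as System.map lists them.
# Only three entries, back-computed for illustration (e.g. c014c95b - 0x1b).
SYMBOLS = [
    (0xC011E1B0, "put_files_struct"),
    (0xC014C940, "filp_close"),
    (0xC014E2F0, "fget"),
]
STARTS = [addr for addr, _ in SYMBOLS]

def resolve(addr):
    """Map an address to 'symbol+0xoffset' using the nearest preceding symbol."""
    i = bisect.bisect_right(STARTS, addr) - 1
    if i < 0:
        return None  # address lies below the first known symbol
    start, name = SYMBOLS[i]
    return f"{name}+0x{addr - start:x}"

if __name__ == "__main__":
    for eip in (0xC014C95B, 0xC014E30F, 0xC011E1FD):
        print(f"{eip:08x} -> {resolve(eip)}")
```

With a table this small, any address above the last entry is attributed to `fget`; a complete System.map avoids that because every function has a successor that bounds it.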

>   EIP:    0060:[<c014c95b>]    Tainted: P
What tainted the kernel?

- -- 
Regards Michael Büsch
http://www.8ung.at/tuxsoft
 20:14:28 up  3:44,  4 users,  load average: 1.01, 1.05, 1.01
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQE+tVjtoxoigfggmSgRAlPeAJ4im7EzpQjk2ZRHk3TS1rECLFcB3wCeNhUy
A3yhshZJpMURwsLgX7hVzrY=
=5IoZ
-----END PGP SIGNATURE-----



* Re: [2.5.68] Scalability issues
  2003-05-04 17:39 [2.5.68] Scalability issues Felix von Leitner
  2003-05-04 18:16 ` Michael Buesch
@ 2003-05-04 19:44 ` Felix von Leitner
  2003-05-04 20:12   ` David S. Miller
  2003-05-04 20:34   ` Tomas Szepe
  2003-05-06 11:11 ` William Lee Irwin III
  2003-05-06 11:46 ` William Lee Irwin III
  3 siblings, 2 replies; 13+ messages in thread
From: Felix von Leitner @ 2003-05-04 19:44 UTC (permalink / raw)
  To: linux-kernel

Here is the ksymoops output.  The taint came from the nvidia kernel
module, X was not running, so the module did not do anything at the
time.


ksymoops 2.4.1 on i686 2.5.68.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.5.68/ (default)
     -m /usr/src/linux/System.map (default)

Warning: You did not tell me where to find symbol information.  I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc.  ksymoops -h explains the options.

Error (regular_file): read_ksyms stat /proc/ksyms failed
No modules in ksyms, skipping objects
No ksyms, skipping lsmod
e100: selftest OK.
e100: eth0: Intel(R) PRO/100 Network Connection
e100: eth0 NIC Link is Up 100 Mbps Full duplex
Unable to handle kernel NULL pointer dereference at virtual address 00000017
c014c95b
*pde = 00000000
Oops: 0000 [#1]
CPU:    0
EIP:    0060:[<c014c95b>]    Tainted: P  
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010286
eax: d241b000   ebx: 00000003   ecx: c037c450   edx: 00000003
esi: 00000405   edi: c3194620   ebp: 00000021   esp: c379bf60
ds: 007b   es: 007b   ss: 0068
Stack: c0341e60 c3194620 07ffffff 00000405 c3194620 00000021 c011e21c 00000003 
       c3194620 c3194620 00000000 4003c904 c37bad80 c011edc4 c319d780 c319d780 
       c379a000 c014d557 00000000 00200001 4003c904 c379a000 c011f0b3 00000000 
Call Trace: [<c011e21c>]  [<c011edc4>]  [<c014d557>]  [<c011f0b3>]  [<c0109279>] 
Code: 8b 43 14 85 c0 0f 84 9a 00 00 00 8b 43 10 31 ed 85 c0 74 45 

>>EIP; c014c95b <filp_close+1b/d0>   <=====
Trace; c011e21c <put_files_struct+6c/e0>
Trace; c011edc4 <do_exit+144/400>
Trace; c014d557 <sys_write+47/60>
Trace; c011f0b3 <sys_exit+13/20>
Trace; c0109279 <sysenter_past_esp+52/71>
Code;  c014c95b <filp_close+1b/d0>
00000000 <_EIP>:
Code;  c014c95b <filp_close+1b/d0>   <=====
   0:   8b 43 14                  mov    0x14(%ebx),%eax   <=====
Code;  c014c95e <filp_close+1e/d0>
   3:   85 c0                     test   %eax,%eax
Code;  c014c960 <filp_close+20/d0>
   5:   0f 84 9a 00 00 00         je     a5 <_EIP+0xa5>
Code;  c014c966 <filp_close+26/d0>
   b:   8b 43 10                  mov    0x10(%ebx),%eax
Code;  c014c969 <filp_close+29/d0>
   e:   31 ed                     xor    %ebp,%ebp
Code;  c014c96b <filp_close+2b/d0>
  10:   85 c0                     test   %eax,%eax
Code;  c014c96d <filp_close+2d/d0>
  12:   74 45                     je     59 <_EIP+0x59>

Unable to handle kernel paging request at virtual address d209bd38
c014e30f
*pde = 09a1b067
Oops: 0000 [#2]
CPU:    0
EIP:    0060:[<c014e30f>]    Tainted: P  
EFLAGS: 00010287
eax: d209a000   ebx: 00000000   ecx: 0000074e   edx: c23e4000
esi: c21e8548   edi: c23e5f64   ebp: 00000000   esp: c23e5f24
ds: 007b   es: 007b   ss: 0068
Stack: 00000000 c015fabd c0124bc0 c23e3980 c21e8540 c23e5f60 c23e5f64 00000000 
       c015fb9a 00000001 c21e8548 c23e5f60 c23e5f64 c23e4000 c23e4000 00000000 
       00000000 bffff7c8 00000000 c21e8540 00000001 c015fd60 00000001 c21e8540 
Call Trace: [<c015fabd>]  [<c0124bc0>]  [<c015fb9a>]  [<c015fd60>]  [<c015df65>]  [<c015f0f0>]  [<c0109279>] 
Code: 8b 1c 88 85 db 74 03 ff 43 14 ff 4a 14 8b 42 08 83 e0 08 75 

>>EIP; c014e30f <fget+1f/40>   <=====
Trace; c015fabd <do_pollfd+2d/a0>
Trace; c0124bc0 <process_timeout+0/10>
Trace; c015fb9a <do_poll+6a/d0>
Trace; c015fd60 <sys_poll+160/260>
Trace; c015df65 <do_fcntl+d5/1c0>
Trace; c015f0f0 <__pollwait+0/d0>
Trace; c0109279 <sysenter_past_esp+52/71>
Code;  c014e30f <fget+1f/40>
00000000 <_EIP>:
Code;  c014e30f <fget+1f/40>   <=====
   0:   8b 1c 88                  mov    (%eax,%ecx,4),%ebx   <=====
Code;  c014e312 <fget+22/40>
   3:   85 db                     test   %ebx,%ebx
Code;  c014e314 <fget+24/40>
   5:   74 03                     je     a <_EIP+0xa>
Code;  c014e316 <fget+26/40>
   7:   ff 43 14                  incl   0x14(%ebx)
Code;  c014e319 <fget+29/40>
   a:   ff 4a 14                  decl   0x14(%edx)
Code;  c014e31c <fget+2c/40>
   d:   8b 42 08                  mov    0x8(%edx),%eax
Code;  c014e31f <fget+2f/40>
  10:   83 e0 08                  and    $0x8,%eax
Code;  c014e322 <fget+32/40>
  13:   75 00                     jne    15 <_EIP+0x15>

Unable to handle kernel paging request at virtual address d209a000
c011e1fd
*pde = 09a1b067
Oops: 0002 [#3]
CPU:    0
EIP:    0060:[<c011e1fd>]    Tainted: P  
EFLAGS: 00010246
eax: d209a000   ebx: ffffffff   ecx: 00000000   edx: 00000000
esi: 00000000   edi: c22f8380   ebp: 00000001   esp: c23e5de0
ds: 007b   es: 007b   ss: 0068
Stack: 00000000 c23e3980 c22f8380 00000000 c23e3980 c23e3980 c011edc4 c2299e20 
       c2299e20 00001aee 00000001 c23e4000 c23e5ef0 c23e3980 0000009b c010a2fc 
       0000000b c033dc76 00000000 00000002 00000000 00000000 c0117e3a c033dc76 
Call Trace: [<c011edc4>]  [<c010a2fc>]  [<c0117e3a>]  [<c029ca2d>]  [<c029cb34>]  [<c010b7ed>]  [<c011c604>]  [<c0117cf0>]  [<c0109cd5>]  [<c012007b>]  [<c014e30f>]  [<c015fabd>]  [<c0124bc0>]  [<c015fb9a>]  [<c015fd60>]  [<c015df65>]  [<c015f0f0>]  [<c0109279>] 
Code: 87 14 b0 85 d2 75 0c 46 d1 eb 75 e7 eb bd 90 8d 74 26 00 89 

>>EIP; c011e1fd <put_files_struct+4d/e0>   <=====
Trace; c011edc4 <do_exit+144/400>
Trace; c010a2fc <die+ec/f0>
Trace; c0117e3a <do_page_fault+14a/45e>
Trace; c029ca2d <process_backlog+6d/100>
Trace; c029cb34 <net_rx_action+74/120>
Trace; c010b7ed <do_IRQ+fd/120>
Trace; c011c604 <__mmdrop+34/46>
Trace; c0117cf0 <do_page_fault+0/45e>
Trace; c0109cd5 <error_code+2d/38>
Trace; c012007b <sys_settimeofday+1b/f0>
Trace; c014e30f <fget+1f/40>
Trace; c015fabd <do_pollfd+2d/a0>
Trace; c0124bc0 <process_timeout+0/10>
Trace; c015fb9a <do_poll+6a/d0>
Trace; c015fd60 <sys_poll+160/260>
Trace; c015df65 <do_fcntl+d5/1c0>
Trace; c015f0f0 <__pollwait+0/d0>
Trace; c0109279 <sysenter_past_esp+52/71>
Code;  c011e1fd <put_files_struct+4d/e0>
00000000 <_EIP>:
Code;  c011e1fd <put_files_struct+4d/e0>   <=====
   0:   87 14 b0                  xchg   %edx,(%eax,%esi,4)   <=====
Code;  c011e200 <put_files_struct+50/e0>
   3:   85 d2                     test   %edx,%edx
Code;  c011e202 <put_files_struct+52/e0>
   5:   75 0c                     jne    13 <_EIP+0x13>
Code;  c011e204 <put_files_struct+54/e0>
   7:   46                        inc    %esi
Code;  c011e205 <put_files_struct+55/e0>
   8:   d1 eb                     shr    %ebx
Code;  c011e207 <put_files_struct+57/e0>
   a:   75 e7                     jne    fffffff3 <_EIP+0xfffffff3>
Code;  c011e209 <put_files_struct+59/e0>
   c:   eb bd                     jmp    ffffffcb <_EIP+0xffffffcb>
Code;  c011e20b <put_files_struct+5b/e0>
   e:   90                        nop    
Code;  c011e20c <put_files_struct+5c/e0>
   f:   8d 74 26 00               lea    0x0(%esi,1),%esi
Code;  c011e210 <put_files_struct+60/e0>
  13:   89 00                     mov    %eax,(%eax)

Unable to handle kernel paging request at virtual address d2728188
c014e30f
*pde = 08540067
Oops: 0000 [#4]
CPU:    0
EIP:    0060:[<c014e30f>]    Tainted: P  
EFLAGS: 00010283
eax: d2726000   ebx: 00000000   ecx: 00000862   edx: ca208000
esi: ca271c48   edi: ca209f64   ebp: 00000000   esp: ca209f24
ds: 007b   es: 007b   ss: 0068
Stack: 00000000 c015fabd c0124bc0 ca219360 ca271c40 ca209f60 ca209f64 00000000 
       c015fb9a 00000001 ca271c48 ca209f60 ca209f64 ca208000 ca208000 00000000 
       00000000 bffff7c8 00000000 ca271c40 00000001 c015fd60 00000001 ca271c40 
Call Trace: [<c015fabd>]  [<c0124bc0>]  [<c015fb9a>]  [<c015fd60>]  [<c015df65>]  [<c015f0f0>]  [<c0109279>] 
Code: 8b 1c 88 85 db 74 03 ff 43 14 ff 4a 14 8b 42 08 83 e0 08 75 

>>EIP; c014e30f <fget+1f/40>   <=====
Trace; c015fabd <do_pollfd+2d/a0>
Trace; c0124bc0 <process_timeout+0/10>
Trace; c015fb9a <do_poll+6a/d0>
Trace; c015fd60 <sys_poll+160/260>
Trace; c015df65 <do_fcntl+d5/1c0>
Trace; c015f0f0 <__pollwait+0/d0>
Trace; c0109279 <sysenter_past_esp+52/71>
Code;  c014e30f <fget+1f/40>
00000000 <_EIP>:
Code;  c014e30f <fget+1f/40>   <=====
   0:   8b 1c 88                  mov    (%eax,%ecx,4),%ebx   <=====
Code;  c014e312 <fget+22/40>
   3:   85 db                     test   %ebx,%ebx
Code;  c014e314 <fget+24/40>
   5:   74 03                     je     a <_EIP+0xa>
Code;  c014e316 <fget+26/40>
   7:   ff 43 14                  incl   0x14(%ebx)
Code;  c014e319 <fget+29/40>
   a:   ff 4a 14                  decl   0x14(%edx)
Code;  c014e31c <fget+2c/40>
   d:   8b 42 08                  mov    0x8(%edx),%eax
Code;  c014e31f <fget+2f/40>
  10:   83 e0 08                  and    $0x8,%eax
Code;  c014e322 <fget+32/40>
  13:   75 00                     jne    15 <_EIP+0x15>

Unable to handle kernel paging request at virtual address d2726000
c011e1fd
*pde = 08540067
Oops: 0002 [#5]
CPU:    0
EIP:    0060:[<c011e1fd>]    Tainted: P  
EFLAGS: 00010246
eax: d2726000   ebx: ffffffff   ecx: 00000001   edx: 00000000
esi: 00000000   edi: ca229dc0   ebp: 00000001   esp: ca209de0
ds: 007b   es: 007b   ss: 0068
Stack: 00000000 ca219360 ca229dc0 00000000 ca219360 ca219360 c011edc4 c9830320 
       c9830320 00001c02 00000001 ca208000 ca209ef0 ca219360 00000328 c010a2fc 
       0000000b c033dc76 00000000 00000004 00000000 00000000 c0117e3a c033dc76 
Call Trace: [<c011edc4>]  [<c010a2fc>]  [<c0117e3a>]  [<c010b7ed>]  [<c0136561>]  [<c01366a0>]  [<c011c604>]  [<c0117cf0>]  [<c0109cd5>]  [<c012007b>]  [<c014e30f>]  [<c015fabd>]  [<c0124bc0>]  [<c015fb9a>]  [<c015fd60>]  [<c015df65>]  [<c015f0f0>]  [<c0109279>] 
Code: 87 14 b0 85 d2 75 0c 46 d1 eb 75 e7 eb bd 90 8d 74 26 00 89 

>>EIP; c011e1fd <put_files_struct+4d/e0>   <=====
Trace; c011edc4 <do_exit+144/400>
Trace; c010a2fc <die+ec/f0>
Trace; c0117e3a <do_page_fault+14a/45e>
Trace; c010b7ed <do_IRQ+fd/120>
Trace; c0136561 <buffered_rmqueue+b1/150>
Trace; c01366a0 <__alloc_pages+a0/2d0>
Trace; c011c604 <__mmdrop+34/46>
Trace; c0117cf0 <do_page_fault+0/45e>
Trace; c0109cd5 <error_code+2d/38>
Trace; c012007b <sys_settimeofday+1b/f0>
Trace; c014e30f <fget+1f/40>
Trace; c015fabd <do_pollfd+2d/a0>
Trace; c0124bc0 <process_timeout+0/10>
Trace; c015fb9a <do_poll+6a/d0>
Trace; c015fd60 <sys_poll+160/260>
Trace; c015df65 <do_fcntl+d5/1c0>
Trace; c015f0f0 <__pollwait+0/d0>
Trace; c0109279 <sysenter_past_esp+52/71>
Code;  c011e1fd <put_files_struct+4d/e0>
00000000 <_EIP>:
Code;  c011e1fd <put_files_struct+4d/e0>   <=====
   0:   87 14 b0                  xchg   %edx,(%eax,%esi,4)   <=====
Code;  c011e200 <put_files_struct+50/e0>
   3:   85 d2                     test   %edx,%edx
Code;  c011e202 <put_files_struct+52/e0>
   5:   75 0c                     jne    13 <_EIP+0x13>
Code;  c011e204 <put_files_struct+54/e0>
   7:   46                        inc    %esi
Code;  c011e205 <put_files_struct+55/e0>
   8:   d1 eb                     shr    %ebx
Code;  c011e207 <put_files_struct+57/e0>
   a:   75 e7                     jne    fffffff3 <_EIP+0xfffffff3>
Code;  c011e209 <put_files_struct+59/e0>
   c:   eb bd                     jmp    ffffffcb <_EIP+0xffffffcb>
Code;  c011e20b <put_files_struct+5b/e0>
   e:   90                        nop    
Code;  c011e20c <put_files_struct+5c/e0>
   f:   8d 74 26 00               lea    0x0(%esi,1),%esi
Code;  c011e210 <put_files_struct+60/e0>
  13:   89 00                     mov    %eax,(%eax)


1 warning and 1 error issued.  Results may not be reliable.
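
The faulting addresses can be cross-checked against the register dumps: each one is the effective address of the instruction ksymoops marks with <=====. Here is a small sketch of that arithmetic in Python; the helper `effective_address` is made up for illustration, and the register values are copied from the oopses above.

```python
def effective_address(base, index=0, scale=1, disp=0):
    """x86 effective address: base + index*scale + disp, wrapped to 32 bits."""
    return (base + index * scale + disp) & 0xFFFFFFFF

# (label, computed address, address reported in the oops)
CHECKS = [
    # Oops #1, filp_close: mov 0x14(%ebx),%eax with ebx = 3
    ("#1", effective_address(0x00000003, disp=0x14), 0x00000017),
    # Oops #2, fget: mov (%eax,%ecx,4),%ebx with eax = d209a000, ecx = 74e
    ("#2", effective_address(0xD209A000, 0x74E, 4), 0xD209BD38),
    # Oops #3, put_files_struct: xchg %edx,(%eax,%esi,4) with esi = 0
    ("#3", effective_address(0xD209A000, 0x0, 4), 0xD209A000),
    # Oops #4, fget again: eax = d2726000, ecx = 862
    ("#4", effective_address(0xD2726000, 0x862, 4), 0xD2728188),
    # Oops #5, put_files_struct again: eax = d2726000, esi = 0
    ("#5", effective_address(0xD2726000, 0x0, 4), 0xD2726000),
]

if __name__ == "__main__":
    for label, computed, reported in CHECKS:
        status = "matches" if computed == reported else "DIFFERS"
        print(f"oops {label}: computed {computed:08x}, reported {reported:08x} ({status})")
```

In oopses #2 through #5 the bad address always comes from %eax, the array that fget and put_files_struct index into. That this is the process's file descriptor table is an inference from the function names, not something confirmed in this thread.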


* Re: [2.5.68] Scalability issues
  2003-05-04 19:44 ` Felix von Leitner
@ 2003-05-04 20:12   ` David S. Miller
  2003-05-05  7:51     ` Felix von Leitner
  2003-05-04 20:34   ` Tomas Szepe
  1 sibling, 1 reply; 13+ messages in thread
From: David S. Miller @ 2003-05-04 20:12 UTC (permalink / raw)
  To: Felix von Leitner; +Cc: linux-kernel

On Sun, 2003-05-04 at 12:44, Felix von Leitner wrote:
> Here is the ksymoops output.  The taint came from the nvidia kernel
> module, X was not running, so the module did not do anything at the
> time.

Not true, if it got loaded it did something.

Either reproduce without the nvidia module loaded, or take
your report to nvidia.

-- 
David S. Miller <davem@redhat.com>


* Re: [2.5.68] Scalability issues
  2003-05-04 19:44 ` Felix von Leitner
  2003-05-04 20:12   ` David S. Miller
@ 2003-05-04 20:34   ` Tomas Szepe
  1 sibling, 0 replies; 13+ messages in thread
From: Tomas Szepe @ 2003-05-04 20:34 UTC (permalink / raw)
  To: Felix von Leitner; +Cc: linux-kernel

> [felix-kernel@fefe.de]
> 
> Here is the ksymoops output.  The taint came from the nvidia kernel
> module, X was not running, so the module did not do anything at the
> time.

Please reproduce those without ever loading the NVidia module
so as to rule out its random scribbling over kernel memory.

-TS


* Re: [2.5.68] Scalability issues
  2003-05-04 20:12   ` David S. Miller
@ 2003-05-05  7:51     ` Felix von Leitner
  2003-05-05  9:51       ` Carl-Daniel Hailfinger
  2003-05-05 14:05       ` Chris Friesen
  0 siblings, 2 replies; 13+ messages in thread
From: Felix von Leitner @ 2003-05-05  7:51 UTC (permalink / raw)
  To: David S. Miller; +Cc: linux-kernel

Thus spake David S. Miller (davem@redhat.com):
> > Here is the ksymoops output.  The taint came from the nvidia kernel
> > module, X was not running, so the module did not do anything at the
> > time.
> Not true, if it got loaded it did something.

> Either reproduce without the nvidia module loaded, or take
> your report to nvidia.

Thank you for this stunning display of unprofessionalism and zealotry.
People like you keep free software alive.


ksymoops 2.4.1 on i686 2.5.68.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.5.68/ (default)
     -m /usr/src/linux/System.map (default)

Warning: You did not tell me where to find symbol information.  I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc.  ksymoops -h explains the options.

Error (regular_file): read_ksyms stat /proc/ksyms failed
No modules in ksyms, skipping objects
No ksyms, skipping lsmod
e100: selftest OK.
e100: eth0: Intel(R) PRO/100 Network Connection
e100: eth0 NIC Link is Up 100 Mbps Full duplex
Unable to handle kernel paging request at virtual address d2f83908
c014e30f
*pde = 06608067
Oops: 0000 [#1]
CPU:    0
EIP:    0060:[<c014e30f>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010287
eax: d2f81000   ebx: 00000000   ecx: 00000a42   edx: c5920000
esi: c5a2f4e8   edi: c5921f64   ebp: 00000000   esp: c5921f24
ds: 007b   es: 007b   ss: 0068
Stack: 00000000 c015fabd c0124bc0 c5950680 c5a2f4e0 c5921f60 c5921f64 00000000 
       c015fb9a 00000001 c5a2f4e8 c5921f60 c5921f64 c5920000 c5920000 00000000 
       00000000 bffff808 00000000 c5a2f4e0 00000001 c015fd60 00000001 c5a2f4e0 
Call Trace: [<c015fabd>]  [<c0124bc0>]  [<c015fb9a>]  [<c015fd60>]  [<c015df65>]  [<c015f0f0>]  [<c0109279>] 
Code: 8b 1c 88 85 db 74 03 ff 43 14 ff 4a 14 8b 42 08 83 e0 08 75 

>>EIP; c014e30f <fget+1f/40>   <=====
Trace; c015fabd <do_pollfd+2d/a0>
Trace; c0124bc0 <process_timeout+0/10>
Trace; c015fb9a <do_poll+6a/d0>
Trace; c015fd60 <sys_poll+160/260>
Trace; c015df65 <do_fcntl+d5/1c0>
Trace; c015f0f0 <__pollwait+0/d0>
Trace; c0109279 <sysenter_past_esp+52/71>
Code;  c014e30f <fget+1f/40>
00000000 <_EIP>:
Code;  c014e30f <fget+1f/40>   <=====
   0:   8b 1c 88                  mov    (%eax,%ecx,4),%ebx   <=====
Code;  c014e312 <fget+22/40>
   3:   85 db                     test   %ebx,%ebx
Code;  c014e314 <fget+24/40>
   5:   74 03                     je     a <_EIP+0xa>
Code;  c014e316 <fget+26/40>
   7:   ff 43 14                  incl   0x14(%ebx)
Code;  c014e319 <fget+29/40>
   a:   ff 4a 14                  decl   0x14(%edx)
Code;  c014e31c <fget+2c/40>
   d:   8b 42 08                  mov    0x8(%edx),%eax
Code;  c014e31f <fget+2f/40>
  10:   83 e0 08                  and    $0x8,%eax
Code;  c014e322 <fget+32/40>
  13:   75 00                     jne    15 <_EIP+0x15>

Unable to handle kernel paging request at virtual address d2f81000
c011e1fd
*pde = 06608067
Oops: 0002 [#2]
CPU:    0
EIP:    0060:[<c011e1fd>]    Not tainted
EFLAGS: 00010246
eax: d2f81000   ebx: ffffffff   ecx: 00000001   edx: 00000000
esi: 00000000   edi: c597e900   ebp: 00000001   esp: c5921de0
ds: 007b   es: 007b   ss: 0068
Stack: 00000000 c5950680 c597e900 00000000 c5950680 c5950680 c011edc4 c595ae20 
       c595ae20 00000a9a 00000001 c5920000 c5921ef0 c5950680 00000383 c010a2fc 
       0000000b c033dc76 00000000 00000001 00000000 00000000 c0117e3a c033dc76 
Call Trace: [<c011edc4>]  [<c010a2fc>]  [<c0117e3a>]  [<c0136561>]  [<c0136561>]  [<c01366a0>]  [<c0117cf0>]  [<c0109cd5>]  [<c012007b>]  [<c014e30f>]  [<c015fabd>]  [<c0124bc0>]  [<c015fb9a>]  [<c015fd60>]  [<c015df65>]  [<c015f0f0>]  [<c0109279>] 
Code: 87 14 b0 85 d2 75 0c 46 d1 eb 75 e7 eb bd 90 8d 74 26 00 89 

>>EIP; c011e1fd <put_files_struct+4d/e0>   <=====
Trace; c011edc4 <do_exit+144/400>
Trace; c010a2fc <die+ec/f0>
Trace; c0117e3a <do_page_fault+14a/45e>
Trace; c0136561 <buffered_rmqueue+b1/150>
Trace; c0136561 <buffered_rmqueue+b1/150>
Trace; c01366a0 <__alloc_pages+a0/2d0>
Trace; c0117cf0 <do_page_fault+0/45e>
Trace; c0109cd5 <error_code+2d/38>
Trace; c012007b <sys_settimeofday+1b/f0>
Trace; c014e30f <fget+1f/40>
Trace; c015fabd <do_pollfd+2d/a0>
Trace; c0124bc0 <process_timeout+0/10>
Trace; c015fb9a <do_poll+6a/d0>
Trace; c015fd60 <sys_poll+160/260>
Trace; c015df65 <do_fcntl+d5/1c0>
Trace; c015f0f0 <__pollwait+0/d0>
Trace; c0109279 <sysenter_past_esp+52/71>
Code;  c011e1fd <put_files_struct+4d/e0>
00000000 <_EIP>:
Code;  c011e1fd <put_files_struct+4d/e0>   <=====
   0:   87 14 b0                  xchg   %edx,(%eax,%esi,4)   <=====
Code;  c011e200 <put_files_struct+50/e0>
   3:   85 d2                     test   %edx,%edx
Code;  c011e202 <put_files_struct+52/e0>
   5:   75 0c                     jne    13 <_EIP+0x13>
Code;  c011e204 <put_files_struct+54/e0>
   7:   46                        inc    %esi
Code;  c011e205 <put_files_struct+55/e0>
   8:   d1 eb                     shr    %ebx
Code;  c011e207 <put_files_struct+57/e0>
   a:   75 e7                     jne    fffffff3 <_EIP+0xfffffff3>
Code;  c011e209 <put_files_struct+59/e0>
   c:   eb bd                     jmp    ffffffcb <_EIP+0xffffffcb>
Code;  c011e20b <put_files_struct+5b/e0>
   e:   90                        nop    
Code;  c011e20c <put_files_struct+5c/e0>
   f:   8d 74 26 00               lea    0x0(%esi,1),%esi
Code;  c011e210 <put_files_struct+60/e0>
  13:   89 00                     mov    %eax,(%eax)


1 warning and 1 error issued.  Results may not be reliable.
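
The number after "Oops:" is the i386 page-fault error code, so the two untainted oopses above (0000 and 0002) can be read bit by bit. This minimal Python sketch covers only the three low bits; the function name `decode_oops_code` is made up for illustration.

```python
def decode_oops_code(code):
    """Decode the low three bits of an i386 page-fault error code."""
    return {
        # bit 0: clear = page not present, set = protection violation
        "cause": "protection fault" if code & 1 else "page not present",
        # bit 1: clear = read access, set = write access
        "access": "write" if code & 2 else "read",
        # bit 2: clear = fault taken in kernel mode, set = user mode
        "mode": "user" if code & 4 else "kernel",
    }

if __name__ == "__main__":
    for label, code in (("fget", 0x0000), ("put_files_struct", 0x0002)):
        info = decode_oops_code(code)
        print(f"{label}: Oops {code:04x} = {info['mode']}-mode {info['access']}, {info['cause']}")
```

Read this way, the fget oops is a kernel-mode read of an unmapped page, and the put_files_struct oops is a kernel-mode write to the page starting at the same base address, d2f81000.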


* Re: [2.5.68] Scalability issues
  2003-05-05  7:51     ` Felix von Leitner
@ 2003-05-05  9:51       ` Carl-Daniel Hailfinger
  2003-05-05 14:05       ` Chris Friesen
  1 sibling, 0 replies; 13+ messages in thread
From: Carl-Daniel Hailfinger @ 2003-05-05  9:51 UTC (permalink / raw)
  To: Felix von Leitner; +Cc: David S. Miller, linux-kernel

Felix von Leitner wrote:
> Thus spake David S. Miller (davem@redhat.com):
> 
>>>Here is the ksymoops output.  The taint came from the nvidia kernel
>>>module, X was not running, so the module did not do anything at the
>>>time.
>>
>>Not true, if it got loaded it did something.
>>
>>Either reproduce without the nvidia module loaded, or take
>>your report to nvidia.
> 
> Thank you for this stunning display of unprofessionalism and zealotry.

No, we have to thank you for that.

> People like you keep free software alive.

Yes indeed.
me@linux:~> grep -iC4 "davem" MAINTAINERS
 CRYPTO API
 P:     James Morris
 M:     jmorris@intercode.com.au
 P:     David S. Miller
 M:     davem@redhat.com
 W      http://samba.org/~jamesm/crypto/
 L:     linux-kernel@vger.kernel.org
 S:     Maintained

--
 S:     Maintained

 NETWORKING [IPv4/IPv6]
 P:     David S. Miller
 M:     davem@redhat.com
 P:     Alexey Kuznetsov
 M:     kuznet@ms2.inr.ac.ru
 P:     Pekka Savola (ipv6)
 M:     pekkas@netcore.fi
--
 S:     Maintained

 UltraSPARC (sparc64):
 P:     David S. Miller
 M:     davem@redhat.com
 P:     Eddie C. Dost
 M:     ecd@skynet.be
 P:     Jakub Jelinek
 M:     jj@sunsite.ms.mff.cuni.cz
me@linux:~> grep -iC4 "felix" MAINTAINERS CREDITS
me@linux:~> grep -iC4 "fefe" MAINTAINERS CREDITS
me@linux:~> grep -iC4 "leitner" MAINTAINERS CREDITS
me@linux:~>

Over and out.



* Re: [2.5.68] Scalability issues
  2003-05-05  7:51     ` Felix von Leitner
  2003-05-05  9:51       ` Carl-Daniel Hailfinger
@ 2003-05-05 14:05       ` Chris Friesen
  2003-05-06 18:44         ` Bill Davidsen
  1 sibling, 1 reply; 13+ messages in thread
From: Chris Friesen @ 2003-05-05 14:05 UTC (permalink / raw)
  To: Felix von Leitner; +Cc: linux-kernel

Felix von Leitner wrote:
> Thus spake David S. Miller (davem@redhat.com):

>>Either reproduce without the nvidia module loaded, or take
>>your report to nvidia.
>>
> 
> Thank you for this stunning display of unprofessionalism and zealotry.
> People like you keep free software alive.

He may not have put it as politely as you would like, but there really is no way 
to debug a problem in a kernel which has been tainted by binary-only drivers. 
That driver could have done literally anything to the kernel on loading.
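
For reference, the "Tainted: P" marker in the oops header is derived from a bitmask the kernel keeps. The sketch below shows the formatting in Python, assuming the three flag bits as I recall them for kernels of this era (proprietary module, forced module load, unsafe SMP); the function name `decode_taint` and the exact bit layout are assumptions, not taken from this thread.

```python
def decode_taint(mask):
    """Format a taint bitmask the way the oops header shows it (assumed layout)."""
    if mask == 0:
        return "Not tainted"
    # bit 0: a proprietary module was loaded ('P'); otherwise 'G'
    first = "P" if mask & 1 else "G"
    # bit 1: a module was force-loaded; bit 2: unsafe SMP configuration
    forced = "F" if mask & 2 else " "
    smp = "S" if mask & 4 else " "
    return f"Tainted: {first}{forced}{smp}"

if __name__ == "__main__":
    for mask in (0, 1, 3):
        print(repr(decode_taint(mask)))
```

The flag is sticky: once the bit is set by loading the module, it stays set until reboot, whether or not the module is in use.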

Chris




-- 
Chris Friesen                    | MailStop: 043/33/F10
Nortel Networks                  | work: (613) 765-0557
3500 Carling Avenue              | fax:  (613) 765-2986
Nepean, ON K2H 8E9 Canada        | email: cfriesen@nortelnetworks.com



* Re: [2.5.68] Scalability issues
  2003-05-04 17:39 [2.5.68] Scalability issues Felix von Leitner
  2003-05-04 18:16 ` Michael Buesch
  2003-05-04 19:44 ` Felix von Leitner
@ 2003-05-06 11:11 ` William Lee Irwin III
  2003-05-06 11:46 ` William Lee Irwin III
  3 siblings, 0 replies; 13+ messages in thread
From: William Lee Irwin III @ 2003-05-06 11:11 UTC (permalink / raw)
  To: Felix von Leitner; +Cc: linux-kernel

On Sun, May 04, 2003 at 07:39:56PM +0200, Felix von Leitner wrote:
>   Out of Memory: Killed process 51 (sshd).
>   artillery-fork: page allocation failure. order:0, mode:0x20
>     [this message comes about 50 times]
>   Out of Memory: Killed process 52 (zsh).
>   spurious 8259A interrupt: IRQ7.
>   Out of Memory: Killed process 49 (sshd).
>   alloc_area_pte: page already exists
>   alloc_area_pte: page already exists
>   alloc_area_pte: page already exists
>   alloc_area_pte: page already exists
>   VFS: Close: file count is 0
>   VFS: Close: file count is 0

Could I get a list of devices on your system and the drivers you're using?
A .config and a list of out-of-tree modules (ignore nvidia; you already
reproduced without it) would be nice, too.

Thanks.

-- wli


* Re: [2.5.68] Scalability issues
  2003-05-04 17:39 [2.5.68] Scalability issues Felix von Leitner
                   ` (2 preceding siblings ...)
  2003-05-06 11:11 ` William Lee Irwin III
@ 2003-05-06 11:46 ` William Lee Irwin III
  3 siblings, 0 replies; 13+ messages in thread
From: William Lee Irwin III @ 2003-05-06 11:46 UTC (permalink / raw)
  To: Felix von Leitner; +Cc: linux-kernel

On Sun, May 04, 2003 at 07:39:56PM +0200, Felix von Leitner wrote:
> I am running several scalability tests on Linux 2.5.68, all related to
> network servers.  My test box is my notebook, Pentium 3 @ 900 MHz, 256
> MB RAM.  I run a small http server on it that supports the following
> models:

I think I found it already; it's going to take a while to produce a fix
for what turned up during my audit.


-- wli


* Re: [2.5.68] Scalability issues
  2003-05-05 14:05       ` Chris Friesen
@ 2003-05-06 18:44         ` Bill Davidsen
  2003-05-06 20:36           ` Timothy Miller
  2003-05-07  0:22           ` Carl-Daniel Hailfinger
  0 siblings, 2 replies; 13+ messages in thread
From: Bill Davidsen @ 2003-05-06 18:44 UTC (permalink / raw)
  To: Chris Friesen; +Cc: Felix von Leitner, linux-kernel

On Mon, 5 May 2003, Chris Friesen wrote:

> Felix von Leitner wrote:
> > Thus spake David S. Miller (davem@redhat.com):
> 
> >>Either reproduce without the nvidia module loaded, or take
> >>your report to nvidia.
> >>
> > 
> > Thank you for this stunning display of unprofessionalism and zealotry.
> > People like you keep free software alive.
> 
> He may not have put it as politely as you would like, but there really is no way 
> to debug a problem in a kernel which has been tainted by binary-only drivers. 
> That driver could have done literally anything to the kernel on loading.

There's no need to be rude in any case, particularly after the OP reposted
an untainted oops which had been through ksymoops and didn't get any help
anyway. Why be nasty about the format of a question you're not answering
even after it's been asked again in the preferred format?

It's a shame that some people seem to think that lots of hard work
entitles them to be rude and condescending, while really important
contributors like Alan Cox, Ingo and akpm can be polite and helpful, even
when they are correcting someone or disagreeing on an approach to a
problem.

-- 
bill davidsen <davidsen@tmr.com>
  CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.



* Re: [2.5.68] Scalability issues
  2003-05-06 18:44         ` Bill Davidsen
@ 2003-05-06 20:36           ` Timothy Miller
  2003-05-07  0:22           ` Carl-Daniel Hailfinger
  1 sibling, 0 replies; 13+ messages in thread
From: Timothy Miller @ 2003-05-06 20:36 UTC (permalink / raw)
  To: Bill Davidsen, Linux Kernel Mailing List



Bill Davidsen wrote:

> 
> It's a shame that some people seem to think that lots of hard work
> entitles them to be rude and condescending, while really important
> contributors like Alan Cox, Ingo and akpm can be polite and helpful, even
> when they are correcting someone or disagreeing on an approach to a
> problem.
> 

I read an article a while back by Paul Graham (LISP guru, SPAM filterer, 
etc.) about Junior High School social dynamics.  The people at the top 
are confident, while the people in the middle want to climb the 
popularity ladder and do so by pushing others down.



* Re: [2.5.68] Scalability issues
  2003-05-06 18:44         ` Bill Davidsen
  2003-05-06 20:36           ` Timothy Miller
@ 2003-05-07  0:22           ` Carl-Daniel Hailfinger
  1 sibling, 0 replies; 13+ messages in thread
From: Carl-Daniel Hailfinger @ 2003-05-07  0:22 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Chris Friesen, linux-kernel

Bill Davidsen wrote:
> On Mon, 5 May 2003, Chris Friesen wrote:
> 
> 
>>Felix von Leitner wrote:
>>
>>>Thus spake David S. Miller (davem@redhat.com):
>>
>>>>Either reproduce without the nvidia module loaded, or take
>>>>your report to nvidia.
>>>
>>>Thank you for this stunning display of unprofessionalism and zealotry.
>>>People like you keep free software alive.
>>
>>He may not have put it as politely as you would like, but there really is no way 
>>to debug a problem in a kernel which has been tainted by binary-only drivers. 
>>That driver could have done literally anything to the kernel on loading.
> 
> There's no need to be rude in any case, particularly after the OP reposted
> an untainted oops which had been through ksymoops and didn't get any help
> anyway. Why be nasty about the format of a question you're not answering
> even after it's been asked again in the preferred format?

Because the OP violated the lkml FAQ section 1.18:
All problems discovered whilst such a module is loaded must be reported
to the vendor of that module, /not/ the Linux kernel hackers and the
linux-kernel mailing list. [...] "oops" reports marked as tainted are of
no use to the kernel developers and will be ignored.
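
For readers unfamiliar with the taint mechanism the FAQ refers to: on
kernels that provide it, the taint state is exposed as an integer in
/proc/sys/kernel/tainted, and bit 0 of that value corresponds to the "P"
in the "Tainted: P" line of an oops (a proprietary module has been
loaded). The sketch below is an illustration only and was not part of
the original message; the function names are hypothetical, and only
bit 0 is interpreted.

```python
from pathlib import Path

# Bit 0 of the taint mask is set once a proprietary module has been loaded;
# an oops reports it as "Tainted: P".
TAINT_PROPRIETARY_MODULE = 1 << 0


def has_proprietary_taint(mask: int) -> bool:
    """Return True if the proprietary-module bit is set in a taint mask."""
    return bool(mask & TAINT_PROPRIETARY_MODULE)


def read_taint_mask(path: str = "/proc/sys/kernel/tainted") -> int:
    """Read the running kernel's taint mask (Linux only)."""
    return int(Path(path).read_text().strip())


if __name__ == "__main__":
    mask = read_taint_mask()
    if has_proprietary_taint(mask):
        print(f"taint mask {mask}: proprietary module loaded")
    else:
        print(f"taint mask {mask}: no proprietary-module taint")
```

Checking this before posting an oops tells you whether the report falls
under the FAQ rule quoted above.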

Davem just restated this fact with the same admittedly strong wording.
Felix von Leitner accused him of unprofessionalism and zealotry. That is
what I would call an offence.

> It's a shame that some people seem to think that lots of hard work
> entitles them to be rude and condescending, while really important
> contributors like Alan Cox, Ingo and akpm can be polite and helpful, even
> when they are correcting someone or disagreeing on an approach to a
> problem.

It's even more shocking if a user insults a kernel developer and expects
this developer (or one of his peers) to actually take care of the
problem. wli chose to investigate the report anyway, something not to be
taken for granted.

Hey, if I insulted Al Viro I'd never expect him to help me (respond,
point out mistakes etc.) anymore. Besides that, Al saved me from diving
into floppy.c, for which I'm still thankful.

Chris: Just a heads up - you may get private hate mail from Peter
"Firefly" Lund like I did because you pointed out a mistake of the OP.
So be warned about it.


Carl-Daniel



Thread overview: 13+ messages
2003-05-04 17:39 [2.5.68] Scalability issues Felix von Leitner
2003-05-04 18:16 ` Michael Buesch
2003-05-04 19:44 ` Felix von Leitner
2003-05-04 20:12   ` David S. Miller
2003-05-05  7:51     ` Felix von Leitner
2003-05-05  9:51       ` Carl-Daniel Hailfinger
2003-05-05 14:05       ` Chris Friesen
2003-05-06 18:44         ` Bill Davidsen
2003-05-06 20:36           ` Timothy Miller
2003-05-07  0:22           ` Carl-Daniel Hailfinger
2003-05-04 20:34   ` Tomas Szepe
2003-05-06 11:11 ` William Lee Irwin III
2003-05-06 11:46 ` William Lee Irwin III
