linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* related oops in 2.4.17 and 2.4.19
@ 2002-09-21 16:20 Nicholas Henke
  2002-09-21 16:42 ` [BProc] " Wilton Wong
  0 siblings, 1 reply; 3+ messages in thread
From: Nicholas Henke @ 2002-09-21 16:20 UTC (permalink / raw)
  To: BProc-users; +Cc: linux-kernel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1037 bytes --]

I am running both a 2.4.17 and 2.4.19 vanilla patched with bproc. I have 2
oops traces from klogd, that appear to be related. I am not sure if this
is a bproc or kernel issue. The kernels have been running on Dell 1550
PIII Dual machines with 2GB ram and 2GB swap and IBM x330s with the same
setups. I have seen about 20 machines toss this same oops in the last few
days, all when under heavy load and memory pressure. After oopsing, the
machine will remain resonsive and can run processes, baring ps, top,
shutdown and reboot -- I am sure there are more that won't run :) I would
greatly appreciate any help -- I have almost 200 machines running these
kernels. The oopsing has not just started recently -- I have seen it all
along, but have never been able to get the decoded info from klogd, and it
is just now becoming a problem with so many machines oopsing.

Attached is one file with 2 oops reports from klogd.
Nic

-- 
Nicholas Henke
Linux cluster system programmer
University of Pennsylvania
henken@seas.upenn.edu - 215.573.8149

[-- Attachment #2: oops traces --]
[-- Type: TEXT/PLAIN, Size: 4454 bytes --]

-------- 2.4.17 --------------

Sep 19 18:49:10 node25 kernel: kernel BUG at page_alloc.c:84!
Sep 19 18:49:10 node25 kernel: invalid operand: 0000
Sep 19 18:49:10 node25 kernel: CPU:    1
Sep 19 18:49:10 node25 kernel: EIP:    0010:[__free_pages_ok+169/832]    Not tainted
Sep 19 18:49:10 node25 kernel: EIP:    0010:[<c013c0d9>]    Not tainted
Sep 19 18:49:10 node25 kernel: EFLAGS: 00010286
Sep 19 18:49:10 node25 kernel: eax: 0000001f   ebx: c1b3ffa0   ecx: c0288424   edx: 000036aa
Sep 19 18:49:10 node25 kernel: esi: c1b3ffa0   edi: 00000000   ebp: 00000000   esp: c5d8dee0
Sep 19 18:49:10 node25 kernel: ds: 0018   es: 0018   ss: 0018
Sep 19 18:49:10 node25 kernel: Process ps (pid: 1326, stackpage=c5d8d000)
Sep 19 18:49:10 node25 kernel: Stack: c025c74d 00000054 ea5a4a41 e947f35c e947f000 ea5a4000 bfff0018 0000090f
Sep 19 18:49:10 node25 kernel:        c1b3ffa0 e947f90f e947f90f c0125607 c5d8c000 e947f000 f21a270c c1b3ffa0
Sep 19 18:49:10 node25 kernel:        e0e9b980 f21a270c e9092000 e947f000 e947f000 c01691fa e9092000 bffff6e5
Sep 19 18:49:10 node25 kernel: Call Trace: [access_process_vm+439/560] [proc_pid_environ+186/208] [proc_info_read+99/288] [filp_open+77/96] [sys_read+150/208]
Sep 19 18:49:10 node25 kernel: Call Trace: [<c0125607>] [<c01691fa>] [<c0169653>] [<c01442ed>] [<c0144e76>]
Sep 19 18:49:10 node25 kernel:    [sys_open+203/304] [system_call+51/56]
Sep 19 18:49:10 node25 kernel:    [<c01446db>] [<c01078ab>]
Sep 19 18:49:10 node25 kernel:
Sep 19 18:49:10 node25 kernel: Code: 0f 0b 5e 5f 8b 43 18 a9 80 00 00 00 74 10 6a 56 68 4d c7 25
Sep 20 04:02:14 node25 syslogd 1.4.1: restart.
Sep 20 14:00:00 node25 sshd(pam_unix)[1655]: session opened for user bindu by (uid=0)
Sep 20 14:27:10 node25 sshd(pam_unix)[1655]: session closed for user bindu
Sep 20 15:22:51 node25 sshd(pam_unix)[1683]: session opened for user root by (uid=0)
Sep 20 15:23:37 node25 sshd(pam_unix)[1731]: session opened for user henken by (uid=0)
Sep 20 15:23:37 node25 sshd(pam_unix)[1731]: session closed for user henken
Sep 20 15:23:44 node25 sshd(pam_unix)[1739]: session opened for user henken by (uid=0)
Sep 20 15:23:44 node25 sshd(pam_unix)[1739]: session closed for user henken

----------- 2.4.19 ------

Sep 21 10:15:34 node25.io.liniac.upenn.edu kernel: Warning - running *really* short on DMA buffers
Sep 21 10:39:10 node25.io.liniac.upenn.edu last message repeated 156 times
Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000010
Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel:  printing eip:
Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: f89cee23
Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: *pde = 00000000
Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: Oops: 0000
Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: CPU:    1
Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: EIP:    0010:[<f89cee23>]    Not tainted
Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: EFLAGS: 00010202
Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: eax: 00000000   ebx: 00000000   ecx: 00000002   edx: f70c0000
Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: esi: f70c0000   edi: 00000000   ebp: ffffffff   esp: daabbe8c
Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: ds: 0018   es: 0018   ss: 0018
Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: Process ps (pid: 2266, stackpage=daabb000)
Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: Stack: c015f525 f70c0000 000002cc 000002cc 00000000 ffffffff 00000004 0000002c
Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel:        00000000 00000088 00000000 00000002 00000000 00000000 00000000 00000013
Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel:        00000000 00000000 00000000 004a190c 00000000 00000000 ffffffff 00000000
Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: Call Trace:    [proc_pid_stat+629/752] [do_exit+719/736] [proc_info_read+99/288] [sys_read+150/272] [sys_open+87/160]
Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: Call Trace:    [<c015f525>] [<c011f86f>] [<c015d0a3>] [<c013de76>] [<c013d827>]
Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel:   [system_call+51/56]
Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel:   [<c0108cfb>]
Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel:
Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: Code: 8b 40 10 c3 90 8b 82 94 00 00 00 8b 40 7c c3 89 f6 f6 05 60


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [BProc] related oops in 2.4.17 and 2.4.19
  2002-09-21 16:20 related oops in 2.4.17 and 2.4.19 Nicholas Henke
@ 2002-09-21 16:42 ` Wilton Wong
  2002-09-21 16:49   ` Nicholas Henke
  0 siblings, 1 reply; 3+ messages in thread
From: Wilton Wong @ 2002-09-21 16:42 UTC (permalink / raw)
  To: Nicholas Henke; +Cc: BProc-users, linux-kernel

Are you running a symbois/LSI based SCSI controller ? just wondering, becaise
we had a whole bunch of "running short of DMA buffer" errors untill we switched
to the newer sym83cxx driver.. we woulget a whole bunch of these errors after a
random amount of disk activity and then boom we would oops.. 

- Wilton

On Sat, 21 Sep 2002, Nicholas Henke wrote:

> I am running both a 2.4.17 and 2.4.19 vanilla patched with bproc. I have 2
> oops traces from klogd, that appear to be related. I am not sure if this
> is a bproc or kernel issue. The kernels have been running on Dell 1550
> PIII Dual machines with 2GB ram and 2GB swap and IBM x330s with the same
> setups. I have seen about 20 machines toss this same oops in the last few
> days, all when under heavy load and memory pressure. After oopsing, the
> machine will remain resonsive and can run processes, baring ps, top,
> shutdown and reboot -- I am sure there are more that won't run :) I would
> greatly appreciate any help -- I have almost 200 machines running these
> kernels. The oopsing has not just started recently -- I have seen it all
> along, but have never been able to get the decoded info from klogd, and it
> is just now becoming a problem with so many machines oopsing.
> 
> Attached is one file with 2 oops reports from klogd.
> Nic
> 
> -- 
> Nicholas Henke
> Linux cluster system programmer
> University of Pennsylvania
> henken@seas.upenn.edu - 215.573.8149

Content-Description: oops traces
> -------- 2.4.17 --------------
> 
> Sep 19 18:49:10 node25 kernel: kernel BUG at page_alloc.c:84!
> Sep 19 18:49:10 node25 kernel: invalid operand: 0000
> Sep 19 18:49:10 node25 kernel: CPU:    1
> Sep 19 18:49:10 node25 kernel: EIP:    0010:[__free_pages_ok+169/832]    Not tainted
> Sep 19 18:49:10 node25 kernel: EIP:    0010:[<c013c0d9>]    Not tainted
> Sep 19 18:49:10 node25 kernel: EFLAGS: 00010286
> Sep 19 18:49:10 node25 kernel: eax: 0000001f   ebx: c1b3ffa0   ecx: c0288424   edx: 000036aa
> Sep 19 18:49:10 node25 kernel: esi: c1b3ffa0   edi: 00000000   ebp: 00000000   esp: c5d8dee0
> Sep 19 18:49:10 node25 kernel: ds: 0018   es: 0018   ss: 0018
> Sep 19 18:49:10 node25 kernel: Process ps (pid: 1326, stackpage=c5d8d000)
> Sep 19 18:49:10 node25 kernel: Stack: c025c74d 00000054 ea5a4a41 e947f35c e947f000 ea5a4000 bfff0018 0000090f
> Sep 19 18:49:10 node25 kernel:        c1b3ffa0 e947f90f e947f90f c0125607 c5d8c000 e947f000 f21a270c c1b3ffa0
> Sep 19 18:49:10 node25 kernel:        e0e9b980 f21a270c e9092000 e947f000 e947f000 c01691fa e9092000 bffff6e5
> Sep 19 18:49:10 node25 kernel: Call Trace: [access_process_vm+439/560] [proc_pid_environ+186/208] [proc_info_read+99/288] [filp_open+77/96] [sys_read+150/208]
> Sep 19 18:49:10 node25 kernel: Call Trace: [<c0125607>] [<c01691fa>] [<c0169653>] [<c01442ed>] [<c0144e76>]
> Sep 19 18:49:10 node25 kernel:    [sys_open+203/304] [system_call+51/56]
> Sep 19 18:49:10 node25 kernel:    [<c01446db>] [<c01078ab>]
> Sep 19 18:49:10 node25 kernel:
> Sep 19 18:49:10 node25 kernel: Code: 0f 0b 5e 5f 8b 43 18 a9 80 00 00 00 74 10 6a 56 68 4d c7 25
> Sep 20 04:02:14 node25 syslogd 1.4.1: restart.
> Sep 20 14:00:00 node25 sshd(pam_unix)[1655]: session opened for user bindu by (uid=0)
> Sep 20 14:27:10 node25 sshd(pam_unix)[1655]: session closed for user bindu
> Sep 20 15:22:51 node25 sshd(pam_unix)[1683]: session opened for user root by (uid=0)
> Sep 20 15:23:37 node25 sshd(pam_unix)[1731]: session opened for user henken by (uid=0)
> Sep 20 15:23:37 node25 sshd(pam_unix)[1731]: session closed for user henken
> Sep 20 15:23:44 node25 sshd(pam_unix)[1739]: session opened for user henken by (uid=0)
> Sep 20 15:23:44 node25 sshd(pam_unix)[1739]: session closed for user henken
> 
> ----------- 2.4.19 ------
> 
> Sep 21 10:15:34 node25.io.liniac.upenn.edu kernel: Warning - running *really* short on DMA buffers
> Sep 21 10:39:10 node25.io.liniac.upenn.edu last message repeated 156 times
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000010
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel:  printing eip:
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: f89cee23
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: *pde = 00000000
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: Oops: 0000
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: CPU:    1
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: EIP:    0010:[<f89cee23>]    Not tainted
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: EFLAGS: 00010202
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: eax: 00000000   ebx: 00000000   ecx: 00000002   edx: f70c0000
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: esi: f70c0000   edi: 00000000   ebp: ffffffff   esp: daabbe8c
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: ds: 0018   es: 0018   ss: 0018
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: Process ps (pid: 2266, stackpage=daabb000)
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: Stack: c015f525 f70c0000 000002cc 000002cc 00000000 ffffffff 00000004 0000002c
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel:        00000000 00000088 00000000 00000002 00000000 00000000 00000000 00000013
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel:        00000000 00000000 00000000 004a190c 00000000 00000000 ffffffff 00000000
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: Call Trace:    [proc_pid_stat+629/752] [do_exit+719/736] [proc_info_read+99/288] [sys_read+150/272] [sys_open+87/160]
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: Call Trace:    [<c015f525>] [<c011f86f>] [<c015d0a3>] [<c013de76>] [<c013d827>]
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel:   [system_call+51/56]
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel:   [<c0108cfb>]
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel:
> Sep 21 10:39:10 node25.io.liniac.upenn.edu kernel: Code: 8b 40 10 c3 90 8b 82 94 00 00 00 8b 40 7c c3 89 f6 f6 05 60
> 


 ----[   Wilton William Wong    ]---------------------------------------------
   11060-166 Avenue         Ph : 01-780-456-9771       High Performance UNIX
   Edmonton, Alberta        FAX: 01-780-456-9772        and Linux Solutions
   T5X 1Y3, Canada                   URL: http://www.harddata.com
 -------------------------------------------------------[ Hard Data Ltd. ]----


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [BProc] related oops in 2.4.17 and 2.4.19
  2002-09-21 16:42 ` [BProc] " Wilton Wong
@ 2002-09-21 16:49   ` Nicholas Henke
  0 siblings, 0 replies; 3+ messages in thread
From: Nicholas Henke @ 2002-09-21 16:49 UTC (permalink / raw)
  To: Wilton Wong; +Cc: Nicholas Henke, BProc-users, linux-kernel

nope -- all adaptec:
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.8
        <Adaptec aic7899 Ultra160 SCSI adapter>
        aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs

scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.8
        <Adaptec aic7899 Ultra160 SCSI adapter>
        aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs


On Sat, 21 Sep 2002, Wilton Wong wrote:

> Are you running a symbois/LSI based SCSI controller ? just wondering, becaise
> we had a whole bunch of "running short of DMA buffer" errors untill we switched
> to the newer sym83cxx driver.. we woulget a whole bunch of these errors after a
> random amount of disk activity and then boom we would oops..
>
> - Wilton
>


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2002-09-21 16:44 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2002-09-21 16:20 related oops in 2.4.17 and 2.4.19 Nicholas Henke
2002-09-21 16:42 ` [BProc] " Wilton Wong
2002-09-21 16:49   ` Nicholas Henke

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).