linux-kernel.vger.kernel.org archive mirror
* Linux-2.4.20-pre8-aa2 oops report.
@ 2002-10-05  2:47 Srihari Vijayaraghavan
  2002-10-05  3:09 ` Srihari Vijayaraghavan
  0 siblings, 1 reply; 7+ messages in thread
From: Srihari Vijayaraghavan @ 2002-10-05  2:47 UTC
  To: linux-kernel

[1.] One line summary of the problem:
	The 2.4.20-pre8aa2 kernel oopsed a couple of times.

[2.] Full description of the problem/report:
	Same as above.

[3.] Keywords (i.e., modules, networking, kernel):
	I am no kernel developer, but I suspect it may be due to XFree86 or
AGPGART/DRM/Radeon. I may be wrong though; please feel free to correct me.

[4.] Kernel version (from /proc/version):
Linux version 2.4.20-pre8aa2 (hari@localhost.localdomain) (gcc version 3.2 20020903 (Red Hat Linux 8.0 3.2-7)) #3 Thu Oct 3 21:07:54 EST 2002

[5.] Output of Oops.. message (if applicable) with symbolic information 
     resolved (see Documentation/oops-tracing.txt)
ksymoops 2.4.5 on i686 2.4.20-pre8aa2.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.20-pre8aa2/ (default)
     -m /boot/System.map-2.4.20-pre8aa2 (default)

Warning: You did not tell me where to find symbol information.  I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc.  ksymoops -h explains the options.

Oct  5 11:46:39 localhost kernel: kernel BUG at memory.c:419!
Oct  5 11:46:39 localhost kernel: invalid operand: 0000 2.4.20-pre8aa2 #3 Thu Oct 3 21:07:54 EST 2002
Oct  5 11:46:39 localhost kernel: CPU:    0
Oct  5 11:46:39 localhost kernel: EIP:    0010:[<c01270f6>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
Oct  5 11:46:39 localhost kernel: EFLAGS: 00210246
Oct  5 11:46:39 localhost kernel: eax: cb988000   ebx: 00000000   ecx: cabe4740   edx: 00000000
Oct  5 11:46:39 localhost kernel: esi: cb988000   edi: 00000000   ebp: 00000000   esp: cbcfde84
Oct  5 11:46:39 localhost kernel: ds: 0018   es: 0018   ss: 0018
Oct  5 11:46:39 localhost kernel: Process gnome-session (pid: 5481, stackpage=cbcfd000)
Oct  5 11:46:39 localhost kernel: Stack: cabe4740 cb988400 00200292 00003000 da97e4c0 00000000 cabe4740 00000000
Oct  5 11:46:39 localhost kernel:        c012a5b5 cabe4740 00000000 00000000 00000000 cabe4740 cbcfc000 cbcfdf30
Oct  5 11:46:39 localhost kernel:        0000000b c0116a36 cabe4740 00200202 cabe4740 c011b807 cabe4740 c158f270
Oct  5 11:46:39 localhost kernel: Call Trace:    [<c012a5b5>] [<c0116a36>] [<c011b807>] [<c01213cc>] [<c01215a4>]
Oct  5 11:46:39 localhost kernel:   [<c0108c54>] [<c0113c60>] [<c0108f38>]
Oct  5 11:46:39 localhost kernel: Code: 0f 0b a3 01 42 4c 1f c0 89 f6 8b 44 24 24 89 74 24 04 89 5c


>>EIP; c01270f6 <zap_page_range+26/b0>   <=====

>>eax; cb988000 <END_OF_CODE+53341a9/????>
>>ecx; cabe4740 <END_OF_CODE+45908e9/????>
>>esi; cb988000 <END_OF_CODE+53341a9/????>
>>esp; cbcfde84 <END_OF_CODE+56aa02d/????>

Trace; c012a5b5 <exit_mmap+b5/130>
Trace; c0116a36 <mmput+56/d0>
Trace; c011b807 <do_exit+87/260>
Trace; c01213cc <sig_exit+9c/a0>
Trace; c01215a4 <dequeue_signal+64/d0>
Trace; c0108c54 <do_signal+1b4/2a0>
Trace; c0113c60 <do_page_fault+0/5a0>
Trace; c0108f38 <signal_return+14/18>

Code;  c01270f6 <zap_page_range+26/b0>
00000000 <_EIP>:
Code;  c01270f6 <zap_page_range+26/b0>   <=====
   0:   0f 0b                     ud2a      <=====
Code;  c01270f8 <zap_page_range+28/b0>
   2:   a3 01 42 4c 1f            mov    %eax,0x1f4c4201
Code;  c01270fd <zap_page_range+2d/b0>
   7:   c0 89 f6 8b 44 24 24      rorb   $0x24,0x24448bf6(%ecx)
Code;  c0127104 <zap_page_range+34/b0>
   e:   89 74 24 04               mov    %esi,0x4(%esp,1)
Code;  c0127108 <zap_page_range+38/b0>
  12:   89 5c 00 00               mov    %ebx,0x0(%eax,%eax,1)

Oct  5 11:46:39 localhost kernel:  kernel BUG at memory.c:419!
Oct  5 11:46:39 localhost kernel: invalid operand: 0000 2.4.20-pre8aa2 #3 Thu Oct 3 21:07:54 EST 2002
Oct  5 11:46:39 localhost kernel: CPU:    0
Oct  5 11:46:39 localhost kernel: EIP:    0010:[<c01270f6>]    Not tainted
Oct  5 11:46:39 localhost kernel: EFLAGS: 00210246
Oct  5 11:46:39 localhost kernel: eax: c51a5000   ebx: 00000000   ecx: c2ec2a80   edx: 00000000
Oct  5 11:46:39 localhost kernel: esi: c51a5000   edi: 00000000   ebp: 00000000   esp: d160df48
Oct  5 11:46:39 localhost kernel: ds: 0018   es: 0018   ss: 0018
Oct  5 11:46:39 localhost kernel: Process gnome-session (pid: 5371, stackpage=d160d000)
Oct  5 11:46:39 localhost kernel: Stack: c158e380 c013b2cc 00200296 00003000 dbe63ac0 00000000 c2ec2a80 00000000
Oct  5 11:46:39 localhost kernel:        c012a5b5 c2ec2a80 00000000 00000000 00000000 c2ec2a80 d160c000 bffff5fc
Oct  5 11:46:39 localhost kernel:        00000100 c0116a36 c2ec2a80 00200206 c2ec2a80 c011b807 c2ec2a80 00001569
Oct  5 11:46:39 localhost kernel: Call Trace:    [<c013b2cc>] [<c012a5b5>] [<c0116a36>] [<c011b807>] [<c011ba13>]
Oct  5 11:46:39 localhost kernel:   [<c0108eff>]
Oct  5 11:46:39 localhost kernel: Code: 0f 0b a3 01 42 4c 1f c0 89 f6 8b 44 24 24 89 74 24 04 89 5c


>>EIP; c01270f6 <zap_page_range+26/b0>   <=====

>>eax; c51a5000 <[agpgart].bss.end+10701e5/1b9b265>
>>ecx; c2ec2a80 <[serial].bss.end+8e659d/1ac3b9d>
>>esi; c51a5000 <[agpgart].bss.end+10701e5/1b9b265>
>>esp; d160df48 <END_OF_CODE+afba0f1/????>

Trace; c013b2cc <fput+cc/120>
Trace; c012a5b5 <exit_mmap+b5/130>
Trace; c0116a36 <mmput+56/d0>
Trace; c011b807 <do_exit+87/260>
Trace; c011ba13 <sys_exit+13/20>
Trace; c0108eff <system_call+33/38>

Code;  c01270f6 <zap_page_range+26/b0>
00000000 <_EIP>:
Code;  c01270f6 <zap_page_range+26/b0>   <=====
   0:   0f 0b                     ud2a      <=====
Code;  c01270f8 <zap_page_range+28/b0>
   2:   a3 01 42 4c 1f            mov    %eax,0x1f4c4201
Code;  c01270fd <zap_page_range+2d/b0>
   7:   c0 89 f6 8b 44 24 24      rorb   $0x24,0x24448bf6(%ecx)
Code;  c0127104 <zap_page_range+34/b0>
   e:   89 74 24 04               mov    %esi,0x4(%esp,1)
Code;  c0127108 <zap_page_range+38/b0>
  12:   89 5c 00 00               mov    %ebx,0x0(%eax,%eax,1)


1 warning issued.  Results may not be reliable.

[6.] A small shell script or example program which triggers the
     problem (if possible)
	Unfortunately, no.

[7.] Environment
[7.1.] Software (add the output of the ver_linux script here)
If some fields are empty or look unusual you may have an old version.
Compare to the current minimal requirements in Documentation/Changes.

Linux localhost.localdomain 2.4.20-pre8aa2 #3 Thu Oct 3 21:07:54 EST 2002 i686 athlon i386 GNU/Linux

Gnu C                  gcc (GCC) 3.2 20020903 (Red Hat Linux 8.0 3.2-7) 
Copyright (C) 2002 Free Software Foundation, Inc. This is free software; see 
the source for copying conditions. There is NO warranty; not even for 
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Gnu make               3.79.1
util-linux             2.11r
mount                  2.11r
modutils               2.4.18
e2fsprogs              1.27
pcmcia-cs              3.1.31
PPP                    2.4.1
isdn4k-utils           3.1pre4
Linux C Library        2.2.93
Dynamic linker (ldd)   2.2.93
Procps                 2.0.7
Net-tools              1.60
Kbd                    1.06
Sh-utils               2.0.12
Modules Loaded         ipt_state ip_conntrack ppp_deflate zlib_inflate zlib_deflate ppp_async ppp_generic slhc sr_mod emu10k1 ac97_codec soundcore radeon agpgart af_packet iptable_filter ip_tables serial floppy ide-scsi scsi_mod ide-cd cdrom raid0 md rtc unix

[7.2.] Processor information (from /proc/cpuinfo):
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 6
model           : 4
model name      : AMD Athlon(tm) Processor
stepping        : 2
cpu MHz         : 1200.075
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr syscall mmxext 3dnowext 3dnow
bogomips        : 2392.06

[7.3.] Module information (from /proc/modules):
ipt_state               1080  36 (autoclean)
ip_conntrack           25152   1 (autoclean) [ipt_state]
ppp_deflate             4472   0 (autoclean)
zlib_inflate           21060   0 (autoclean) [ppp_deflate]
zlib_deflate           20632   0 (autoclean) [ppp_deflate]
ppp_async               9344   1 (autoclean)
ppp_generic            19604   3 (autoclean) [ppp_deflate ppp_async]
slhc                    6832   1 (autoclean) [ppp_generic]
sr_mod                 15960   0 (autoclean)
emu10k1                63488   0 (autoclean)
ac97_codec             13320   0 (autoclean) [emu10k1]
soundcore               5988   4 (autoclean) [emu10k1]
radeon                 87416   3
agpgart                19996   3
af_packet              11464   0 (autoclean)
iptable_filter          2412   1 (autoclean)
ip_tables              14328   2 [ipt_state iptable_filter]
serial                 50404   1 (autoclean)
floppy                 55868   0 (autoclean)
ide-scsi               10512   0
scsi_mod               96788   2 [sr_mod ide-scsi]
ide-cd                 33412   0
cdrom                  32608   0 [sr_mod ide-cd]
raid0                   3912   4 (autoclean)
md                     56544   4 [raid0]
rtc                     8532   0 (autoclean)
unix                   17832 149 (autoclean)

 [7.4.] Loaded driver and hardware information (/proc/ioports, /proc/iomem)
/proc/ioports:
0000-001f : dma1
0020-003f : pic1
0040-005f : timer
0060-006f : keyboard
0070-007f : rtc
0080-008f : dma page reg
00a0-00bf : pic2
00c0-00df : dma2
00f0-00ff : fpu
0170-0177 : ide1
01f0-01f7 : ide0
02f8-02ff : serial(auto)
0376-0376 : ide1
03c0-03df : vga+
03f6-03f6 : ide0
03f8-03ff : serial(auto)
0cf8-0cff : PCI conf1
5000-500f : VIA Technologies, Inc. VT82C686 [Apollo Super ACPI]
6000-607f : VIA Technologies, Inc. VT82C686 [Apollo Super ACPI]
c000-cfff : PCI Bus #01
  c000-c0ff : ATI Technologies Inc Radeon VE QY
d000-d003 : Advanced Micro Devices [AMD] AMD-760 [IGD4-1P] System Controller
d400-d40f : VIA Technologies, Inc. VT82C586B PIPC Bus Master IDE
  d400-d407 : ide0
  d408-d40f : ide1
d800-d81f : VIA Technologies, Inc. USB
dc00-dc1f : VIA Technologies, Inc. USB (#2)
e000-e01f : Creative Labs SB Live! EMU10k1
  e000-e01f : EMU10K1
e400-e407 : Creative Labs SB Live! MIDI/Game Port

/proc/iomem:
[hari@localhost linux-2.4.20-pre8]$ cat /proc/iomem
00000000-0009fbff : System RAM
0009fc00-0009ffff : reserved
000a0000-000bffff : Video RAM area
000c0000-000c7fff : Video ROM
000f0000-000fffff : System ROM
00100000-1ffeffff : System RAM
  00100000-001eae83 : Kernel code
  001eae84-0021a73f : Kernel data
1fff0000-1fff2fff : ACPI Non-volatile Storage
1fff3000-1fffffff : ACPI Tables
d0000000-d7ffffff : Advanced Micro Devices [AMD] AMD-760 [IGD4-1P] System Controller
d8000000-dfffffff : PCI Bus #01
  d8000000-dfffffff : ATI Technologies Inc Radeon VE QY
e0000000-e1ffffff : PCI Bus #01
  e1000000-e100ffff : ATI Technologies Inc Radeon VE QY
e2000000-e2000fff : Advanced Micro Devices [AMD] AMD-760 [IGD4-1P] System Controller
ffff0000-ffffffff : reserved

[7.5.] PCI information ('lspci -vvv' as root)
00:00.0 Host bridge: Advanced Micro Devices [AMD] AMD-760 [IGD4-1P] System 
Controller (rev 12)
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- 
SERR- FastB2B-
	Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- 
<MAbort+ >SERR- <PERR-
	Latency: 32
	Region 0: Memory at d0000000 (32-bit, prefetchable) [size=128M]
	Region 1: Memory at e2000000 (32-bit, prefetchable) [size=4K]
	Region 2: I/O ports at d000 [disabled] [size=4]
	Capabilities: [a0] AGP version 2.0
		Status: RQ=15 SBA+ 64bit- FW- Rate=x1,x2
		Command: RQ=0 SBA+ AGP+ 64bit- FW- Rate=x1

00:01.0 PCI bridge: Advanced Micro Devices [AMD] AMD-760 [IGD4-1P] AGP Bridge 
(prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- 
SERR+ FastB2B-
	Status: Cap- 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- 
<MAbort- >SERR- <PERR-
	Latency: 32
	Bus: primary=00, secondary=01, subordinate=01, sec-latency=32
	I/O behind bridge: 0000c000-0000cfff
	Memory behind bridge: e0000000-e1ffffff
	Prefetchable memory behind bridge: d8000000-dfffffff
	BridgeCtl: Parity- SERR+ NoISA+ VGA+ MAbort- >Reset- FastB2B-

00:07.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 
40)
	Subsystem: VIA Technologies, Inc. VT82C686/A PCI to ISA Bridge
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ 
SERR- FastB2B-
	Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- 
<MAbort- >SERR- <PERR-
	Latency: 0
	Capabilities: [c0] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:07.1 IDE interface: VIA Technologies, Inc. VT82C586B PIPC Bus Master IDE 
(rev 06) (prog-if 8a [Master SecP PriP])
	Subsystem: VIA Technologies, Inc. VT82C586B PIPC Bus Master IDE
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- 
SERR- FastB2B-
	Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- 
<MAbort- >SERR- <PERR-
	Latency: 32
	Region 4: I/O ports at d400 [size=16]
	Capabilities: [c0] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:07.2 USB Controller: VIA Technologies, Inc. USB (rev 16) (prog-if 00 
[UHCI])
	Subsystem: VIA Technologies, Inc. (Wrong ID) USB Controller
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- 
SERR- FastB2B-
	Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- 
<MAbort- >SERR- <PERR-
	Latency: 32, cache line size 08
	Interrupt: pin D routed to IRQ 10
	Region 4: I/O ports at d800 [size=32]
	Capabilities: [80] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:07.3 USB Controller: VIA Technologies, Inc. USB (rev 16) (prog-if 00 
[UHCI])
	Subsystem: VIA Technologies, Inc. (Wrong ID) USB Controller
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- 
SERR- FastB2B-
	Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- 
<MAbort- >SERR- <PERR-
	Latency: 32, cache line size 08
	Interrupt: pin D routed to IRQ 10
	Region 4: I/O ports at dc00 [size=32]
	Capabilities: [80] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:07.4 SMBus: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 40)
	Subsystem: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI]
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- 
SERR- FastB2B-
	Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- 
<MAbort- >SERR- <PERR-
	Interrupt: pin ? routed to IRQ 9
	Capabilities: [68] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:0c.0 Multimedia audio controller: Creative Labs SB Live! EMU10k1 (rev 06)
	Subsystem: Creative Labs CT4832 SBLive! Value
	Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- 
SERR- FastB2B-
	Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- 
<MAbort- >SERR- <PERR-
	Latency: 32 (500ns min, 5000ns max)
	Interrupt: pin A routed to IRQ 10
	Region 0: I/O ports at e000 [size=32]
	Capabilities: [dc] Power Management version 1
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-

00:0c.1 Input device controller: Creative Labs SB Live! MIDI/Game Port (rev 
06)
	Subsystem: Creative Labs Gameport Joystick
	Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- 
SERR- FastB2B-
	Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- 
<MAbort- >SERR- <PERR-
	Latency: 32
	Region 0: I/O ports at e400 [size=8]
	Capabilities: [dc] Power Management version 1
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-

01:05.0 VGA compatible controller: ATI Technologies Inc Radeon VE QY (prog-if 
00 [VGA])
	Subsystem: Unknown device 1787:0202
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping+ 
SERR- FastB2B-
	Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- 
<MAbort- >SERR- <PERR-
	Latency: 32 (2000ns min), cache line size 08
	Interrupt: pin A routed to IRQ 11
	Region 0: Memory at d8000000 (32-bit, prefetchable) [size=128M]
	Region 1: I/O ports at c000 [size=256]
	Region 2: Memory at e1000000 (32-bit, non-prefetchable) [size=64K]
	Expansion ROM at <unassigned> [disabled] [size=128K]
	Capabilities: [58] AGP version 2.0
		Status: RQ=47 SBA+ 64bit- FW- Rate=x1,x2,x4
		Command: RQ=15 SBA+ AGP+ 64bit- FW- Rate=x1
	Capabilities: [50] Power Management version 2
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-

[7.6.] SCSI information (from /proc/scsi/scsi)
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: RICOH    Model: CD-R/RW MP7083A  Rev: 1.20
  Type:   CD-ROM                           ANSI SCSI revision: 02

[7.7.] Other information that might be relevant to the problem
       (please look in /proc and include all information that you
       think to be relevant):
None.

[X.] Other notes, patches, fixes, workarounds:
	I see the following syslog messages between the oopses:
Oct  5 11:46:39 localhost gdm(pam_unix)[5359]: session closed for user hari
Oct  5 11:46:39 localhost gdm[5359]: gdm_slave_xioerror_handler: Fatal X error 
- Restarting :0

I was using XFree86 (the version shipped with Red Hat 8) in 2D at the time of
the oops, with no 3D activity (the only 3D use of this computer is playing
the tuxracer game :)

I was doing heavy file system activity just before the oops: I was trying to
measure ext3 and RAID0 performance by creating a file of roughly 5-6 GB using
dd. I will see if I can reproduce this on mainline, the RH kernel, etc.

<rant>
This is only the second crash that has ever happened to me (the first was the
pesky netfilter oops, maybe due to NAT, which didn't make it to the system
logs; I am still waiting for it to happen again now that I have kernel
debugging/sysrq enabled). I am genuinely worried about the stability of my
favourite OS.
</rant>

Anyway, thanks guys; you are all doing a wonderful job on the Linux kernel
project. Please CC me if you can, as I am not subscribed to LKML, though I
regularly read the web archives.
-- 
Hari
harisri@bigpond.com



* Re: Linux-2.4.20-pre8-aa2 oops report.
  2002-10-05  2:47 Linux-2.4.20-pre8-aa2 oops report Srihari Vijayaraghavan
@ 2002-10-05  3:09 ` Srihari Vijayaraghavan
  2002-10-05  7:55   ` Srihari Vijayaraghavan
  0 siblings, 1 reply; 7+ messages in thread
From: Srihari Vijayaraghavan @ 2002-10-05  3:09 UTC
  To: linux-kernel

On Saturday 05 October 2002 12:47, Srihari Vijayaraghavan wrote:
> [1.] One line summary of the problem:
> 	2.4.20-pre8aa2 Kernel oopsed couple of times.

A little more research reveals that the oops happens in the following function
in mm/memory.c:

/*
 * remove user pages in a given range.
 */
void zap_page_range(struct mm_struct *mm, unsigned long address, unsigned long size)
{
	mmu_gather_t *tlb;
	pgd_t * dir;
	unsigned long start = address, end = address + size;
	int freed = 0;

	dir = pgd_offset(mm, address);

	/*
	 * This is a long-lived spinlock. That's fine.
	 * There's no contention, because the page table
	 * lock only protects against kswapd anyway, and
	 * even if kswapd happened to be looking at this
	 * process we _want_ it to get stuck.
	 */
	if (address >= end)
		BUG();
	spin_lock(&mm->page_table_lock);
	flush_cache_range(mm, address, end);
	tlb = tlb_gather_mmu(mm);

	do {
		freed += zap_pmd_range(tlb, dir, address, end - address);
		address = (address + PGDIR_SIZE) & PGDIR_MASK;
		dir++;
	} while (address && (address < end));

	/* this will flush any remaining tlb entries */
	tlb_finish_mmu(tlb, start, end);

	/*
	 * Update rss for the mm_struct (not necessarily current->mm)
	 * Notice that rss is an unsigned long.
	 */
	if (mm->rss > freed)
		mm->rss -= freed;
	else
		mm->rss = 0;
	spin_unlock(&mm->page_table_lock);
}

BTW, I ran memtest2.x and memtest3.0 overnight a few times in the past, and
every time they passed for 30 or more iterations. I forgot to mention
this in my previous e-mail.
-- 
Hari
harisri@bigpond.com



* Re: Linux-2.4.20-pre8-aa2 oops report.
  2002-10-05  3:09 ` Srihari Vijayaraghavan
@ 2002-10-05  7:55   ` Srihari Vijayaraghavan
  2002-10-10  1:26     ` Linux-2.4.20-pre8-aa2 oops report. [solved] Andrea Arcangeli
  0 siblings, 1 reply; 7+ messages in thread
From: Srihari Vijayaraghavan @ 2002-10-05  7:55 UTC
  To: linux-kernel

On Saturday 05 October 2002 13:09, Srihari Vijayaraghavan wrote:
> On Saturday 05 October 2002 12:47, Srihari Vijayaraghavan wrote:
> > [1.] One line summary of the problem:
> > 	2.4.20-pre8aa2 Kernel oopsed couple of times.

I was able to produce a couple more oopses.

ksymoops 2.4.5 on i686 2.4.20-pre8aa2.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.20-pre8aa2/ (default)
     -m /boot/System.map-2.4.20-pre8aa2 (default)

Warning: You did not tell me where to find symbol information.  I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc.  ksymoops -h explains the options.

ac97_codec: AC97 Audio codec, id: v9(SigmaTel STAC9721/23)
Unable to handle kernel paging request at virtual address c5db0034
c0114517
*pde = 05c001e3
Oops: 0000 2.4.20-pre8aa2 #3 Thu Oct 3 21:07:54 EST 2002
CPU:    0
EIP:    0010:[<c0114517>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00013086
eax: 00000000   ebx: c665b324   ecx: c5db0000   edx: c665b324
esi: c665b31c   edi: c01e6ae2   ebp: 00003246   esp: c73b1d90
ds: 0018   es: 0018   ss: 0018
Process modprobe (pid: 1012, stackpage=c73b1000)
Stack: c73b0000 00000002 c66fc000 c73b0000 c0113e82 c01e6ae2 c73b1dfc c73b1f6c 
       c3e27f8e c73b0000 00000000 c17e17c0 c016e94f c3e0cf80 d90e1390 0001ff9d 
       c02102ef 00000000 dffcb5f4 da33f340 dffcb580 c3e0cf80 c016f2d4 d90e1390 
Call Trace:    [<c0113e82>] [<c01e6ae2>] [<c016e94f>] [<c016f2d4>] [<c016811f>]
  [<c0113c60>] [<c0108ff0>] [<c01e6ae2>] [<c0127d19>] [<c012860f>] [<c0113e0a>]
  [<c0128e88>] [<c013b2cc>] [<c0129e9f>] [<c012a1d2>] [<c012a254>] [<c0113c60>]
  [<c0108ff0>]
Code: 8b 51 34 85 d2 74 3f f7 41 14 41 00 00 00 74 36 8b 71 38 89 


>>EIP; c0114517 <search_exception_table+17/80>   <=====

>>ebx; c665b324 <[emu10k1].data.end+88bb25/8a4881>
>>ecx; c5db0000 <[soundcore].bss.end+39889d/3a891d>
>>edx; c665b324 <[emu10k1].data.end+88bb25/8a4881>
>>esi; c665b31c <[emu10k1].data.end+88bb1d/8a4881>
>>edi; c01e6ae2 <fast_clear_page+12/50>
>>ebp; 00003246 Before first symbol
>>esp; c73b1d90 <END_OF_CODE+d39f39/????>

Trace; c0113e82 <do_page_fault+222/5a0>
Trace; c01e6ae2 <fast_clear_page+12/50>
Trace; c016e94f <do_get_write_access+27f/500>
Trace; c016f2d4 <journal_dirty_metadata+174/200>
Trace; c016811f <ext3_do_update_inode+16f/3e0>
Trace; c0113c60 <do_page_fault+0/5a0>
Trace; c0108ff0 <error_code+34/3c>
Trace; c01e6ae2 <fast_clear_page+12/50>
Trace; c0127d19 <do_wp_page+1b9/1f0>
Trace; c012860f <handle_mm_fault+11f/160>
Trace; c0113e0a <do_page_fault+1aa/5a0>
Trace; c0128e88 <zap_pmd_range+78/80>
Trace; c013b2cc <fput+cc/120>
Trace; c0129e9f <unmap_fixup+12f/140>
Trace; c012a1d2 <do_munmap+292/2d0>
Trace; c012a254 <sys_munmap+44/80>
Trace; c0113c60 <do_page_fault+0/5a0>
Trace; c0108ff0 <error_code+34/3c>

Code;  c0114517 <search_exception_table+17/80>
00000000 <_EIP>:
Code;  c0114517 <search_exception_table+17/80>   <=====
   0:   8b 51 34                  mov    0x34(%ecx),%edx   <=====
Code;  c011451a <search_exception_table+1a/80>
   3:   85 d2                     test   %edx,%edx
Code;  c011451c <search_exception_table+1c/80>
   5:   74 3f                     je     46 <_EIP+0x46>
Code;  c011451e <search_exception_table+1e/80>
   7:   f7 41 14 41 00 00 00      testl  $0x41,0x14(%ecx)
Code;  c0114525 <search_exception_table+25/80>
   e:   74 36                     je     46 <_EIP+0x46>
Code;  c0114527 <search_exception_table+27/80>
  10:   8b 71 38                  mov    0x38(%ecx),%esi
Code;  c011452a <search_exception_table+2a/80>
  13:   89 00                     mov    %eax,(%eax)


1 warning issued.  Results may not be reliable.

ksymoops 2.4.5 on i686 2.4.20-pre8aa2.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.20-pre8aa2/ (default)
     -m /boot/System.map-2.4.20-pre8aa2 (default)

Warning: You did not tell me where to find symbol information.  I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc.  ksymoops -h explains the options.

ac97_codec: AC97 Audio codec, id: v9(SigmaTel STAC9721/23)
Unable to handle kernel paging request at virtual address c2d68358
c014e0f9
*pde = 0823c163
Oops: 0003 2.4.20-pre8aa2 #3 Thu Oct 3 21:07:54 EST 2002
CPU:    0
EIP:    0010:[<c014e0f9>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00210282
eax: c70b7f58   ebx: c70b7f40   ecx: c2d68358   edx: c78bae58
esi: c70b7fac   edi: c687801f   ebp: 4f55e46f   esp: c2545ee0
ds: 0018   es: 0018   ss: 0018
Process pam_timestamp_c (pid: 2001, stackpage=c2545000)
Stack: 00200217 c021097c 0001828e c0120b37 c70b7f40 dff200c0 c6878013 0000000c 
       c6878013 c687801f 00000000 c2545f98 c014486b c70b7ec0 c2545f40 c6878013 
       c0144e94 c70b7ec0 c2545f40 00000000 00000008 00000000 c73f0d00 00000000 
Call Trace:    [<c0120b37>] [<c014486b>] [<c0144e94>] [<c0145377>] [<c0145609>]
  [<c014204f>] [<c0108eff>]
Code: 89 11 89 40 04 89 43 18 eb cc 89 f0 89 3c 24 83 c0 3c 89 44 


>>EIP; c014e0f9 <d_lookup+d9/110>   <=====

>>eax; c70b7f58 <END_OF_CODE+9cc101/????>
>>ebx; c70b7f40 <END_OF_CODE+9cc0e9/????>
>>ecx; c2d68358 <[serial].bss.end+76be75/182bb9d>
>>edx; c78bae58 <END_OF_CODE+11cf001/????>
>>esi; c70b7fac <END_OF_CODE+9cc155/????>
>>edi; c687801f <END_OF_CODE+18c1c8/????>
>>ebp; 4f55e46f Before first symbol
>>esp; c2545ee0 <[floppy].bss.end+2184a5/24e645>

Trace; c0120b37 <schedule_timeout+67/b0>
Trace; c014486b <cached_lookup+1b/70>
Trace; c0144e94 <link_path_walk+3c4/6f0>
Trace; c0145377 <path_lookup+37/40>
Trace; c0145609 <__user_walk+49/60>
Trace; c014204f <sys_lstat64+1f/80>
Trace; c0108eff <system_call+33/38>

Code;  c014e0f9 <d_lookup+d9/110>
00000000 <_EIP>:
Code;  c014e0f9 <d_lookup+d9/110>   <=====
   0:   89 11                     mov    %edx,(%ecx)   <=====
Code;  c014e0fb <d_lookup+db/110>
   2:   89 40 04                  mov    %eax,0x4(%eax)
Code;  c014e0fe <d_lookup+de/110>
   5:   89 43 18                  mov    %eax,0x18(%ebx)
Code;  c014e101 <d_lookup+e1/110>
   8:   eb cc                     jmp    ffffffd6 <_EIP+0xffffffd6>
Code;  c014e103 <d_lookup+e3/110>
   a:   89 f0                     mov    %esi,%eax
Code;  c014e105 <d_lookup+e5/110>
   c:   89 3c 24                  mov    %edi,(%esp,1)
Code;  c014e108 <d_lookup+e8/110>
   f:   83 c0 3c                  add    $0x3c,%eax
Code;  c014e10b <d_lookup+eb/110>
  12:   89 44 00 00               mov    %eax,0x0(%eax,%eax,1)


1 warning issued.  Results may not be reliable.

Steps to reproduce:
1. Log in to XFree86/KDE or GNOME
2. Start some heavy-weight open-source applications (I use Mozilla, Open
Office Writer and Calc and Impress)
3. Exit all those applications
4. # mke2fs -j /dev/md0 (or) mke2fs -j /dev/hdc5
5. # mount /dev/md0 /md0
6. # cd /md0
7. # time dd if=/dev/zero of=zero bs=1024 count=1048576 (I have chosen a 1 GB
file because I have 512 MB of RAM in the system)
8. # dmesg (to check whether there is an oops)
9. Repeat step 2 and check whether there is an oops
10. Otherwise, repeat steps 1 to 9 a couple of times

Interestingly, neither mainline (2.4.20-pre8) nor the Red Hat 8 kernel
(2.4.18-14) exhibits this regression (over a few attempts).

Please feel free to suggest any ideas to pinpoint the issue if you can. I will 
test the system with ReiserFS and Debian Woody (gcc 2.95.4) later today or 
tomorrow.
-- 
Hari
harisri@bigpond.com



* Re: Linux-2.4.20-pre8-aa2 oops report. [solved]
  2002-10-05  7:55   ` Srihari Vijayaraghavan
@ 2002-10-10  1:26     ` Andrea Arcangeli
  2002-10-10 10:17       ` Srihari Vijayaraghavan
  0 siblings, 1 reply; 7+ messages in thread
From: Andrea Arcangeli @ 2002-10-10  1:26 UTC
  To: Srihari Vijayaraghavan; +Cc: linux-kernel

Hello Srihari,

On Sat, Oct 05, 2002 at 05:55:01PM +1000, Srihari Vijayaraghavan wrote:
> On Saturday 05 October 2002 13:09, Srihari Vijayaraghavan wrote:
> > On Saturday 05 October 2002 12:47, Srihari Vijayaraghavan wrote:
> > > [1.] One line summary of the problem:
> > > 	2.4.20-pre8aa2 Kernel oopsed couple of times.
> 
> I was able to produce couple of more oops.

Thanks for your detailed reports. Please try to reproduce any problem
you had with this incremental fix applied on top of 2.4.20pre8aa2:

--- ul-20021007/kernel/sched.c.~1~	Tue Oct  8 07:14:19 2002
+++ ul-20021007/kernel/sched.c	Thu Oct 10 02:29:58 2002
@@ -380,6 +387,7 @@ void wake_up_forked_process(task_t * p)
 		parent = NULL;
 	}
 
+	p->cpu = smp_processor_id();
 	__activate_task(p, rq, parent);
 	spin_unlock_irq(&rq->lock);
 }


I started to get random reports of corruption after I fixed the
scheduler starvation and resurrected a non-weak schedule-child-first
logic in the latest few -aa releases. It took so long because I really
couldn't see anything wrong in that patch (and indeed there wasn't
anything wrong). The new schedule-child-first logic can put the newly
forked task in the expired array (to run it just before the parent, to
maximize cache effects and to avoid favouring children too much by
always putting them all in the active array), and it somehow brought to
light a core bug in the o1 scheduler. This bug is not present in 2.5. I
found it after some days of heavy debugging while trying to find out
what was wrong with the schedule-child-first changes. A task running
with a wrong smp_processor_id() generates very weird oopses and crashes;
it is one of the things with the most unpredictable side effects. The
above patch should bring back total solidity to my tree. Tomorrow I will
release a new -aa with this applied (I may use p->cpu = parent->cpu just
in case it's simpler for the compiler to optimize, but it will be
completely equivalent to the above).
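
A minimal user-space sketch of why a stale cpu field matters, assuming (as
the one-line patch suggests) that the scheduler later finds a task's
runqueue through its cpu field. All names here (toy_rq, toy_task,
toy_task_rq, toy_wake_up_forked, toy_fork_is_consistent) are hypothetical
stand-ins for illustration, not the kernel's own code:

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 2

/* Hypothetical, heavily simplified stand-ins for scheduler structures. */
struct toy_rq   { int nr_running; };
struct toy_task { int cpu; };

static struct toy_rq runqueues[NR_CPUS];

/* Look up the runqueue a task claims to belong to, via its cpu field. */
static struct toy_rq *toy_task_rq(const struct toy_task *p)
{
    assert(p->cpu >= 0 && p->cpu < NR_CPUS);
    return &runqueues[p->cpu];
}

/*
 * Enqueue a freshly forked child on the runqueue of the CPU doing the fork.
 * When fix_cpu is non-zero the child's cpu field is updated first,
 * mirroring the one-line patch above.
 */
static struct toy_rq *toy_wake_up_forked(struct toy_task *p, int this_cpu,
                                         int fix_cpu)
{
    struct toy_rq *rq;

    assert(this_cpu >= 0 && this_cpu < NR_CPUS);
    rq = &runqueues[this_cpu];
    if (fix_cpu)
        p->cpu = this_cpu;
    rq->nr_running++;
    return rq;
}

/* Returns 1 if a later lookup through p->cpu finds the queue the task is on. */
static int toy_fork_is_consistent(int stale_cpu, int this_cpu, int fix_cpu)
{
    struct toy_task child = { .cpu = stale_cpu };
    struct toy_rq *queued_on = toy_wake_up_forked(&child, this_cpu, fix_cpu);

    return toy_task_rq(&child) == queued_on;
}
```

Calling toy_fork_is_consistent(1, 0, 0) models a child carrying a stale cpu
value of 1 that is forked on CPU 0 without the fix: the lookup disagrees
with the queue the task was placed on. With the last argument set to 1 the
two agree.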

Special thanks to Chris Mason for the help and for finding a way to
reproduce it reliably and even for getting the only reliable single oops
out of it (that I happened to discard because at first glance it looked
corrupt like the others ;)

Other 2.4 backports of the o1 scheduler may want to verify that they
didn't inherit this subtle bug. (I just checked that -ac doesn't have it)

Andrea


* Re: Linux-2.4.20-pre8-aa2 oops report. [solved]
  2002-10-10  1:26     ` Linux-2.4.20-pre8-aa2 oops report. [solved] Andrea Arcangeli
@ 2002-10-10 10:17       ` Srihari Vijayaraghavan
  2002-10-13  1:53         ` 2.4.20-pre10aa1 oops report (was Re: Linux-2.4.20-pre8-aa2 oops report. [solved]) Srihari Vijayaraghavan
  0 siblings, 1 reply; 7+ messages in thread
From: Srihari Vijayaraghavan @ 2002-10-10 10:17 UTC (permalink / raw)
  To: Andrea Arcangeli; +Cc: linux-kernel

Hello Andrea,

> thanks for your detailed reports, please try to reproduce any problem
> you had with this incremental fix applied on top of 2.4.20pre8aa2:
>
> --- ul-20021007/kernel/sched.c.~1~	Tue Oct  8 07:14:19 2002
> +++ ul-20021007/kernel/sched.c	Thu Oct 10 02:29:58 2002
> @@ -380,6 +387,7 @@ void wake_up_forked_process(task_t * p)
>  		parent = NULL;
>  	}
>
> +	p->cpu = smp_processor_id();
>  	__activate_task(p, rq, parent);
>  	spin_unlock_irq(&rq->lock);
>  }
>

Thanks. Unfortunately that did not fix the problem.

I was able to reproduce 4 more oopses (all happened one after the other).

ksymoops 2.4.5 on i686 2.4.20-pre8aa2-p1.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.20-pre8aa2-p1/ (default)
     -m /boot/System.map-2.4.20-pre8aa2-p1 (default)

Warning: You did not tell me where to find symbol information.  I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc.  ksymoops -h explains the options.

Oct 10 19:26:36 localhost kernel: Unable to handle kernel NULL pointer 
dereference at virtual address 0000011b
Oct 10 19:26:36 localhost kernel: c01a96b2
Oct 10 19:26:36 localhost kernel: *pde = 00000000
Oct 10 19:26:36 localhost kernel: Oops: 0000 2.4.20-pre8aa2-p1 #4 Thu Oct 10 
19:12:17 EST 2002
Oct 10 19:26:36 localhost kernel: CPU:    0
Oct 10 19:26:36 localhost kernel: EIP:    0010:[<c01a96b2>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
Oct 10 19:26:36 localhost kernel: EFLAGS: 00010213
Oct 10 19:26:36 localhost kernel: eax: 00000113   ebx: 00000145   ecx: 
c37eff64   edx: c69aedc0
Oct 10 19:26:36 localhost kernel: esi: c5bd4000   edi: c69aedc0   ebp: 
00000000   esp: c37eff1c
Oct 10 19:26:36 localhost kernel: ds: 0018   es: 0018   ss: 0018
Oct 10 19:26:36 localhost kernel: Process bonobo-activati (pid: 988, 
stackpage=c37ef000)
Oct 10 19:26:36 localhost kernel: Stack: c5bd4020 c51daa40 00000004 c014a279 
c69aedc0 00000000 00000000 7fffffff 
Oct 10 19:26:36 localhost kernel:        00000000 00000000 c014a37f 00000005 
c5bd4000 c37eff64 c37eff60 c37ee000 
Oct 10 19:26:36 localhost kernel:        c37ee000 00000000 00000000 c37effa8 
08082fe0 00000000 00000005 c014a4fc 
Oct 10 19:26:36 localhost kernel: Call Trace:    [<c014a279>] [<c014a37f>] 
[<c014a4fc>] [<c0108eff>]
Oct 10 19:26:36 localhost kernel: Code: 8b 48 08 89 44 24 04 89 14 24 8b 44 24 
14 89 44 24 08 ff 51 


>>EIP; c01a96b2 <sock_poll+12/30>   <=====

>>ecx; c37eff64 <[iptable_filter].data.end+12065d9/182e6f5>
>>edx; c69aedc0 <END_OF_CODE+2af69/????>
>>esi; c5bd4000 <[soundcore].bss.end+1d289d/3ae91d>
>>edi; c69aedc0 <END_OF_CODE+2af69/????>
>>esp; c37eff1c <[iptable_filter].data.end+1206591/182e6f5>

Trace; c014a279 <do_pollfd+89/90>
Trace; c014a37f <do_poll+ff/110>
Trace; c014a4fc <sys_poll+16c/2f0>
Trace; c0108eff <system_call+33/38>

Code;  c01a96b2 <sock_poll+12/30>
00000000 <_EIP>:
Code;  c01a96b2 <sock_poll+12/30>   <=====
   0:   8b 48 08                  mov    0x8(%eax),%ecx   <=====
Code;  c01a96b5 <sock_poll+15/30>
   3:   89 44 24 04               mov    %eax,0x4(%esp,1)
Code;  c01a96b9 <sock_poll+19/30>
   7:   89 14 24                  mov    %edx,(%esp,1)
Code;  c01a96bc <sock_poll+1c/30>
   a:   8b 44 24 14               mov    0x14(%esp,1),%eax
Code;  c01a96c0 <sock_poll+20/30>
   e:   89 44 24 08               mov    %eax,0x8(%esp,1)
Code;  c01a96c4 <sock_poll+24/30>
  12:   ff 51 00                  call   *0x0(%ecx)

Oct 10 19:26:36 localhost kernel: CPU:    0
Oct 10 19:26:36 localhost kernel: EIP:    0010:[<c0132998>]    Not tainted
Oct 10 19:26:36 localhost kernel: EFLAGS: 00010057
Oct 10 19:26:36 localhost kernel: eax: ffffffff   ebx: ffffffbf   ecx: 
c4973000   edx: ffffffff
Oct 10 19:26:37 localhost kernel: esi: c15870c0   edi: 00000246   ebp: 
000001f0   esp: c7635f60
Oct 10 19:26:37 localhost kernel: ds: 0018   es: 0018   ss: 0018
Oct 10 19:26:37 localhost kernel: Process gnome-settings- (pid: 992, 
stackpage=c7635000)
Oct 10 19:26:37 localhost kernel: Stack: 00000000 00000000 c7634000 080bdcc8 
080bdcc8 bffff618 c014a657 c15870c0 
Oct 10 19:26:37 localhost kernel:        000001f0 c31a99c0 c3474000 c7634000 
c01150eb c51daac0 c7635fa8 00000000 
Oct 10 19:26:37 localhost kernel:        fffffff4 c013a8f9 00000000 00000000 
c7634000 420d2220 080bdcc8 bffff618 
Oct 10 19:26:37 localhost kernel: Call Trace:    [<c014a657>] [<c01150eb>] 
[<c013a8f9>] [<c0108eff>]
Oct 10 19:26:37 localhost kernel: Code: 89 10 89 42 04 c7 01 00 00 00 00 8b 06 
89 48 04 89 01 89 71 


>>EIP; c0132998 <__kmem_cache_alloc+78/f0>   <=====

>>eax; ffffffff <END_OF_CODE+3967c1a8/????>
>>ebx; ffffffbf <END_OF_CODE+3967c168/????>
>>ecx; c4973000 <[radeon].bss.end+7dda89/186ab09>
>>edx; ffffffff <END_OF_CODE+3967c1a8/????>
>>esi; c15870c0 <_end+12ff710/15786d0>
>>esp; c7635f60 <END_OF_CODE+cb2109/????>

Trace; c014a657 <sys_poll+2c7/2f0>
Trace; c01150eb <do_schedule+15b/240>
Trace; c013a8f9 <sys_writev+69/80>
Trace; c0108eff <system_call+33/38>

Code;  c0132998 <__kmem_cache_alloc+78/f0>
00000000 <_EIP>:
Code;  c0132998 <__kmem_cache_alloc+78/f0>   <=====
   0:   89 10                     mov    %edx,(%eax)   <=====
Code;  c013299a <__kmem_cache_alloc+7a/f0>
   2:   89 42 04                  mov    %eax,0x4(%edx)
Code;  c013299d <__kmem_cache_alloc+7d/f0>
   5:   c7 01 00 00 00 00         movl   $0x0,(%ecx)
Code;  c01329a3 <__kmem_cache_alloc+83/f0>
   b:   8b 06                     mov    (%esi),%eax
Code;  c01329a5 <__kmem_cache_alloc+85/f0>
   d:   89 48 04                  mov    %ecx,0x4(%eax)
Code;  c01329a8 <__kmem_cache_alloc+88/f0>
  10:   89 01                     mov    %eax,(%ecx)
Code;  c01329aa <__kmem_cache_alloc+8a/f0>
  12:   89 71 00                  mov    %esi,0x0(%ecx)

Oct 10 19:26:37 localhost kernel:  <1>Unable to handle kernel NULL pointer 
dereference at virtual address 00000003
Oct 10 19:26:38 localhost kernel: c0131412
Oct 10 19:26:38 localhost kernel: *pde = 00000000
Oct 10 19:26:38 localhost kernel: Oops: 0000 2.4.20-pre8aa2-p1 #4 Thu Oct 10 
19:12:17 EST 2002
Oct 10 19:26:38 localhost kernel: CPU:    0
Oct 10 19:26:38 localhost kernel: EIP:    0010:[<c0131412>]    Not tainted
Oct 10 19:26:38 localhost kernel: EFLAGS: 00010286
Oct 10 19:26:38 localhost kernel: eax: e4cb0000   ebx: ffffffff   ecx: 
c020d768   edx: c497378c
Oct 10 19:26:38 localhost kernel: esi: c7634000   edi: c31a99c0   ebp: 
0000000b   esp: c7635e98
Oct 10 19:26:38 localhost kernel: ds: 0018   es: 0018   ss: 0018
Oct 10 19:26:38 localhost kernel: Process gnome-settings- (pid: 992, 
stackpage=c7635000)
Oct 10 19:26:38 localhost kernel: Stack: c020e600 00000005 c31a99c0 c012a513 
e4cb0000 00000046 00000001 000001f0 
Oct 10 19:26:38 localhost kernel:        c31a99c0 c7634000 c0109a10 0000000b 
c0116a36 c31a99c0 00000202 c31a99c0 
Oct 10 19:26:38 localhost kernel:        c011b807 c31a99c0 00000000 c7635f2c 
00000000 c0109a10 000001f0 c01095ef 
Oct 10 19:26:38 localhost kernel: Call Trace:    [<c012a513>] [<c0109a10>] 
[<c0116a36>] [<c011b807>] [<c0109a10>]
Oct 10 19:26:38 localhost kernel:   [<c01095ef>] [<c0109a61>] [<c0108ff0>] 
[<c0132998>] [<c014a657>] [<c01150eb>]
Oct 10 19:26:38 localhost kernel:   [<c013a8f9>] [<c0108eff>]
Oct 10 19:26:38 localhost kernel: Code: 39 43 04 74 1f 8d 53 0c 8b 5b 0c 85 db 
75 f1 c7 04 24 80 51 


>>EIP; c0131412 <vfree+22/80>   <=====

>>eax; e4cb0000 <END_OF_CODE+1e32c1a9/????>
>>ebx; ffffffff <END_OF_CODE+3967c1a8/????>
>>ecx; c020d768 <gdt_table+68/e0>
>>edx; c497378c <[radeon].bss.end+7de215/186ab09>
>>esi; c7634000 <END_OF_CODE+cb01a9/????>
>>edi; c31a99c0 <[iptable_filter].data.end+bc0035/182e6f5>
>>esp; c7635e98 <END_OF_CODE+cb2041/????>

Trace; c012a513 <exit_mmap+13/130>
Trace; c0109a10 <do_general_protection+0/a0>
Trace; c0116a36 <mmput+56/d0>
Trace; c011b807 <do_exit+87/260>
Trace; c0109a10 <do_general_protection+0/a0>
Trace; c01095ef <die+7f/80>
Trace; c0109a61 <do_general_protection+51/a0>
Trace; c0108ff0 <error_code+34/3c>
Trace; c0132998 <__kmem_cache_alloc+78/f0>
Trace; c014a657 <sys_poll+2c7/2f0>
Trace; c01150eb <do_schedule+15b/240>
Trace; c013a8f9 <sys_writev+69/80>
Trace; c0108eff <system_call+33/38>

Code;  c0131412 <vfree+22/80>
00000000 <_EIP>:
Code;  c0131412 <vfree+22/80>   <=====
   0:   39 43 04                  cmp    %eax,0x4(%ebx)   <=====
Code;  c0131415 <vfree+25/80>
   3:   74 1f                     je     24 <_EIP+0x24>
Code;  c0131417 <vfree+27/80>
   5:   8d 53 0c                  lea    0xc(%ebx),%edx
Code;  c013141a <vfree+2a/80>
   8:   8b 5b 0c                  mov    0xc(%ebx),%ebx
Code;  c013141d <vfree+2d/80>
   b:   85 db                     test   %ebx,%ebx
Code;  c013141f <vfree+2f/80>
   d:   75 f1                     jne    0 <_EIP>
Code;  c0131421 <vfree+31/80>
   f:   c7 04 24 80 51 00 00      movl   $0x5180,(%esp,1)

Oct 10 19:26:38 localhost kernel: CPU:    0
Oct 10 19:26:38 localhost kernel: EIP:    0010:[<c0132998>]    Not tainted
Oct 10 19:26:38 localhost kernel: EFLAGS: 00010057
Oct 10 19:26:38 localhost kernel: eax: ffffffff   ebx: ffffffbf   ecx: 
c4973000   edx: ffffffff
Oct 10 19:26:38 localhost kernel: esi: c15870c0   edi: 00000246   ebp: 
000001f0   esp: c6721f3c
Oct 10 19:26:38 localhost kernel: ds: 0018   es: 0018   ss: 0018
Oct 10 19:26:38 localhost kernel: Process esd (pid: 998, stackpage=c6721000)
Oct 10 19:26:38 localhost kernel: Stack: 7fffffff 00000017 fffffff4 00000001 
c6720000 bffff848 c0149d2c c15870c0 
Oct 10 19:26:38 localhost kernel:        000001f0 c0149e39 00000004 00000004 
c6721f8c 00000005 08054450 bffff8bc 
Oct 10 19:26:38 localhost kernel:        bffff8e8 00000004 00000031 bffff8c0 
00000000 c4973440 c4973444 c4973448 
Oct 10 19:26:38 localhost kernel: Call Trace:    [<c0149d2c>] [<c0149e39>] 
[<c0108eff>]
Oct 10 19:26:38 localhost kernel: Code: 89 10 89 42 04 c7 01 00 00 00 00 8b 06 
89 48 04 89 01 89 71 


>>EIP; c0132998 <__kmem_cache_alloc+78/f0>   <=====

>>eax; ffffffff <END_OF_CODE+3967c1a8/????>
>>ebx; ffffffbf <END_OF_CODE+3967c168/????>
>>ecx; c4973000 <[radeon].bss.end+7dda89/186ab09>
>>edx; ffffffff <END_OF_CODE+3967c1a8/????>
>>esi; c15870c0 <_end+12ff710/15786d0>
>>esp; c6721f3c <[ac97_codec].data.end+92ab35/b88c79>

Trace; c0149d2c <select_bits_alloc+1c/20>
Trace; c0149e39 <sys_select+f9/4b0>
Trace; c0108eff <system_call+33/38>

Code;  c0132998 <__kmem_cache_alloc+78/f0>
00000000 <_EIP>:
Code;  c0132998 <__kmem_cache_alloc+78/f0>   <=====
   0:   89 10                     mov    %edx,(%eax)   <=====
Code;  c013299a <__kmem_cache_alloc+7a/f0>
   2:   89 42 04                  mov    %eax,0x4(%edx)
Code;  c013299d <__kmem_cache_alloc+7d/f0>
   5:   c7 01 00 00 00 00         movl   $0x0,(%ecx)
Code;  c01329a3 <__kmem_cache_alloc+83/f0>
   b:   8b 06                     mov    (%esi),%eax
Code;  c01329a5 <__kmem_cache_alloc+85/f0>
   d:   89 48 04                  mov    %ecx,0x4(%eax)
Code;  c01329a8 <__kmem_cache_alloc+88/f0>
  10:   89 01                     mov    %eax,(%ecx)
Code;  c01329aa <__kmem_cache_alloc+8a/f0>
  12:   89 71 00                  mov    %esi,0x0(%ecx)


1 warning issued.  Results may not be reliable.
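
The two "NULL pointer dereference" addresses in the dump above can be
cross-checked against the registers and the faulting instruction. The
sketch below is only an illustration of that arithmetic: the helper names
are made up, and the register values are copied from the oops text.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Effective address of an i386 base+displacement operand such as
 * 0x8(%eax): base register plus signed displacement, wrapping at 32 bits.
 */
static uint32_t effective_address(uint32_t base, int32_t disp)
{
    return base + (uint32_t)disp;   /* unsigned arithmetic wraps mod 2^32 */
}

/* Check both reported fault addresses against the dumped registers. */
static void check_reported_addresses(void)
{
    /* sock_poll+12: mov 0x8(%eax),%ecx with eax = 00000113 */
    assert(effective_address(0x00000113u, 0x8) == 0x0000011bu);

    /* vfree+22: cmp %eax,0x4(%ebx) with ebx = ffffffff (wraps around) */
    assert(effective_address(0xffffffffu, 0x4) == 0x00000003u);
}
```

In both cases the base register holds a value that is not a plausible
kernel pointer (0x113 and 0xffffffff), which fits memory corruption
better than a genuine NULL pointer.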

I am able to easily reproduce the issue by doing the following:
1. Log in to XFree86/Gnome or KDE
2. Run Mozilla and Open Office Writer/Impress/Calc, then exit all of them
3. mke2fs -j /dev/hda9 (that is a blank 2.5 GB partition)
4. mount /dev/hda9 /test
5. cd /test; dd if=/dev/zero of=zero bs=1024 count=1048576
6. Log out of XFree86 and log back in
7. An oops appears in the system logs

-- 
Hari
harisri@bigpond.com



* 2.4.20-pre10aa1 oops report (was Re: Linux-2.4.20-pre8-aa2 oops report. [solved])
  2002-10-10 10:17       ` Srihari Vijayaraghavan
@ 2002-10-13  1:53         ` Srihari Vijayaraghavan
  2002-10-13 22:42           ` Andrea Arcangeli
  0 siblings, 1 reply; 7+ messages in thread
From: Srihari Vijayaraghavan @ 2002-10-13  1:53 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andrea Arcangeli

Hello,

On Thursday 10 October 2002 20:17, Srihari Vijayaraghavan wrote:
> Thanks. Unfortunately that did not fix the problem.
>
> I was able to reproduce 4 more oops. (all happened one after other)
>
> ksymoops 2.4.5 on i686 2.4.20-pre8aa2-p1.  Options used

Here is a similar oops report from 2.4.20-pre10aa1.

ksymoops 2.4.5 on i686 2.4.20-pre10aa1.  Options used
     -V (default)
     -k /proc/ksyms (default)
     -l /proc/modules (default)
     -o /lib/modules/2.4.20-pre10aa1/ (default)
     -m /boot/System.map-2.4.20-pre10aa1 (default)

Warning: You did not tell me where to find symbol information.  I will
assume that the log matches the kernel and modules that are running
right now and I'll use the default options above for symbol resolution.
If the current kernel and/or modules do not match the log, you can get
more accurate output by telling me the kernel version and where to find
map, modules, ksyms etc.  ksymoops -h explains the options.

Oct 11 22:43:19 localhost kernel: Unable to handle kernel paging request at 
virtual address cbe8e000
Oct 11 22:43:19 localhost kernel: c01e55e2
Oct 11 22:43:19 localhost kernel: *pde = 0bc001e3
Oct 11 22:43:19 localhost kernel: Oops: 0002 2.4.20-pre10aa1 #3 Fri Oct 11 
22:10:08 EST 2002
Oct 11 22:43:19 localhost kernel: CPU:    0
Oct 11 22:43:19 localhost kernel: EIP:    0010:[<c01e55e2>]    Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
Oct 11 22:43:19 localhost kernel: EFLAGS: 00013246
Oct 11 22:43:19 localhost kernel: eax: 0000003f   ebx: cbe8e000   ecx: 
c9f8e000   edx: 00000000
Oct 11 22:43:19 localhost kernel: esi: c3f7d4b0   edi: 000004b0   ebp: 
c120c084   esp: c9f8feac
Oct 11 22:43:19 localhost kernel: ds: 0018   es: 0018   ss: 0018
Oct 11 22:43:19 localhost kernel: Process modprobe (pid: 1675, 
stackpage=c9f8f000)
Oct 11 22:43:19 localhost kernel: Stack: 00104025 c0126952 cbe8e000 c95bc420 
4212c1fc dff87e00 cbc1a140 c0126d7e 
Oct 11 22:43:19 localhost kernel:        dff87e00 cbc1a140 c3f7d4b0 c95bc420 
00000001 4212c1fc c9f8ff24 dff87e00 
Oct 11 22:43:19 localhost kernel:        cbc1a140 4212c1fc c9f8e000 c011240a 
dff87e00 cbc1a140 4212c1fc 00000001 
Oct 11 22:43:19 localhost kernel: Call Trace:    [<c0126952>] [<c0126d7e>] 
[<c011240a>] [<c012869f>] [<c01289d2>]
Oct 11 22:43:19 localhost kernel:   [<c0128a54>] [<c0112260>] [<c01075b0>]
Oct 11 22:43:19 localhost kernel: Code: 0f e7 03 0f e7 43 08 0f e7 43 10 0f e7 
43 18 0f e7 43 20 0f 


>>EIP; c01e55e2 <fast_clear_page+12/50>   <=====

>>ebx; cbe8e000 <[sr_mod].bss.end+54ea1a9/1925c229>
>>ecx; c9f8e000 <[sr_mod].bss.end+35ea1a9/1925c229>
>>esi; c3f7d4b0 <[agpgart].bss.end+200695/1b93265>
>>edi; 000004b0 Before first symbol
>>ebp; c120c084 <_end+f86b14/166cb10>
>>esp; c9f8feac <[sr_mod].bss.end+35ec055/1925c229>

Trace; c0126952 <do_anonymous_page+a2/110>
Trace; c0126d7e <handle_mm_fault+8e/160>
Trace; c011240a <do_page_fault+1aa/5a0>
Trace; c012869f <unmap_fixup+12f/140>
Trace; c01289d2 <do_munmap+292/2d0>
Trace; c0128a54 <sys_munmap+44/80>
Trace; c0112260 <do_page_fault+0/5a0>
Trace; c01075b0 <error_code+34/3c>

Code;  c01e55e2 <fast_clear_page+12/50>
00000000 <_EIP>:
Code;  c01e55e2 <fast_clear_page+12/50>   <=====
   0:   0f e7 03                  movntq %mm0,(%ebx)   <=====
Code;  c01e55e5 <fast_clear_page+15/50>
   3:   0f e7 43 08               movntq %mm0,0x8(%ebx)
Code;  c01e55e9 <fast_clear_page+19/50>
   7:   0f e7 43 10               movntq %mm0,0x10(%ebx)
Code;  c01e55ed <fast_clear_page+1d/50>
   b:   0f e7 43 18               movntq %mm0,0x18(%ebx)
Code;  c01e55f1 <fast_clear_page+21/50>
   f:   0f e7 43 20               movntq %mm0,0x20(%ebx)
Code;  c01e55f5 <fast_clear_page+25/50>
  13:   0f 00 00                  sldtl  (%eax)


1 warning issued.  Results may not be reliable.
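
The number after "Oops:" is the i386 page-fault error code, and its low
three bits say what kind of access faulted. The following is a small
sketch of that decoding; the struct and function names are hypothetical,
and only the three low bits are handled.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded i386 page-fault error code, as printed after "Oops:". */
struct fault_info {
    int protection; /* bit 0: 0 = page not present, 1 = protection violation */
    int write;      /* bit 1: 0 = read access,      1 = write access */
    int user;       /* bit 2: 0 = kernel mode,      1 = user mode */
};

static struct fault_info decode_oops_code(uint32_t code)
{
    struct fault_info f;

    assert(code <= 0x7);    /* this sketch only covers the three low bits */
    f.protection = (code >> 0) & 1;
    f.write      = (code >> 1) & 1;
    f.user       = (code >> 2) & 1;
    return f;
}
```

Decoding 0002 gives a kernel-mode write to a page that is not present,
which matches the movntq store to (%ebx) = cbe8e000 in fast_clear_page.
The earlier "Oops: 0000" reports decode to kernel-mode reads of
not-present pages.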

The mainline (2.4.20-pre10) does not exhibit this issue. Unlike 
2.4.20-pre8aa1, 2.4.20-pre10aa1 rebooted itself after the above oops.

I am hoping some of these oops might reveal the real issue/reason/bug to 
kernel developers one of these days.

And my sincere thanks for your help.
-- 
Hari
harisri@bigpond.com



* Re: 2.4.20-pre10aa1 oops report (was Re: Linux-2.4.20-pre8-aa2 oops report. [solved])
  2002-10-13  1:53         ` 2.4.20-pre10aa1 oops report (was Re: Linux-2.4.20-pre8-aa2 oops report. [solved]) Srihari Vijayaraghavan
@ 2002-10-13 22:42           ` Andrea Arcangeli
  0 siblings, 0 replies; 7+ messages in thread
From: Andrea Arcangeli @ 2002-10-13 22:42 UTC (permalink / raw)
  To: Srihari Vijayaraghavan; +Cc: linux-kernel

On Sun, Oct 13, 2002 at 11:53:29AM +1000, Srihari Vijayaraghavan wrote:
> Oct 11 22:43:19 localhost kernel: Process modprobe (pid: 1675, 

This smells like a problem with one of your modules. Please make 100%
sure you use exactly the same .config for both 2.4.20pre10 and
2.4.20pre10aa1, and please try to find which module is crashing the
kernel after it has been loaded. Expect a different kind of crash or
oops each time. You can also try turning on the slab debugging option
in the kernel hacking menu.

> Code;  c01e55e2 <fast_clear_page+12/50>

You may also want to configure the kernel for i686 instead of K7, so that
fast_clear_page won't be used, to see if it makes any difference.

> The mainline (2.4.20-pre10) does not exhibit this issue. Unlike 
> 2.4.20-pre8aa1, 2.4.20-pre10aa1 rebooted itself after the above oops.
> 
> I am hoping some of these oops might reveal the real issue/reason/bug to 
> kernel developers one of these days.

The place where the oops happens is almost certainly not the problem.
Either something is wrong with fast_clear_page for whatever hardware
reason or, more likely, the module loaded by modprobe is corrupting the
freelist and alloc_pages returned garbage.

BTW, how much memory do you have? If you have more than 800M, it could be
a broken driver using pte_offset by hand; in that case, try to reproduce
with mem=800m. To fix this you should find which module is destabilizing
the kernel.

Thanks for the reports.

Andrea

