* 2.6.0-test7 oops in proc_pid_stat
@ 2003-10-09 13:04 Olaf Hering
2003-10-09 22:04 ` Linus Torvalds
0 siblings, 1 reply; 18+ messages in thread
From: Olaf Hering @ 2003-10-09 13:04 UTC (permalink / raw)
To: linux-kernel
IBM blade server, 2 CPUs (Intel(R) XEON(TM) CPU 2.40GHz), 512 MB RAM.
Linux version 2.6.0-test7 (olaf@zert152) (gcc version 3.2.2) #2 SMP Thu Oct 9 08:49:29 CEST 2003
Unable to handle kernel NULL pointer dereference at virtual address 0000003c
printing eip:
c018a322
*pde = 00000000
Oops: 0000 [#1]
CPU: 0
EIP: 0060:[<c018a322>] Not tainted
EFLAGS: 00010246
EIP is at proc_pid_stat+0x92/0x510
eax: 00000000 ebx: df2b0d80 ecx: 00000000 edx: c038afcc
esi: 00000000 edi: df2b0d80 ebp: 00000000 esp: ce85de3c
ds: 007b es: 007b ss: 0068
Process pstree (pid: 3518, threadinfo=ce85c000 task=dbb38c80)
Stack: df94b900 c034f440 00000dad df6b5bda 00000053 00000d99 00000419 00000419
0000040d 00000419 00000100 00000086 000000e0 00000106 00000284 00000000
cf6419b4 cf641940 ce136006 c0187ce8 df2b0d80 cf641940 ce85df38 dffd3820
Call Trace:
[<c0187ce8>] pid_revalidate+0x28/0xd0
[<c0170300>] dput+0x30/0x1b0
[<c0140ac3>] buffered_rmqueue+0xc3/0x150
[<c0140c00>] __alloc_pages+0xb0/0x350
[<c0187174>] proc_info_read+0x74/0x160
[<c015904e>] vfs_read+0xbe/0x130
[<c01592f2>] sys_read+0x42/0x70
[<c010b52f>] syscall_call+0x7/0xb
Code: 8b 48 3c 85 c9 74 40 8b 81 98 00 00 00 89 84 24 d4 00 00 00
config is all static. I was reading a CD in the foreground and running 2 rpm
builds in the background.
...
screen -S cdtest -- sh -c 'for i in `seq 0 420` `seq 0 420` ; do date; umount -v /media/cdrom ; mount -v /media/cdrom ; find /media/cdrom -type f -print0 | xargs -0 --verbose -n1 cat > /dev/null || break ; done &>log'
...
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_EXPERIMENTAL=y
CONFIG_CLEAN_COMPILE=y
CONFIG_STANDALONE=y
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_SYSCTL=y
CONFIG_LOG_BUF_SHIFT=18
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_KALLSYMS=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_X86_PC=y
CONFIG_MPENTIUMIII=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=5
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_SMP=y
CONFIG_NR_CPUS=4
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_TSC=y
CONFIG_X86_MCE=y
CONFIG_NOHIGHMEM=y
CONFIG_MTRR=y
CONFIG_HAVE_DEC_LOCK=y
CONFIG_PM=y
CONFIG_ACPI=y
CONFIG_ACPI_BOOT=y
CONFIG_ACPI_INTERPRETER=y
CONFIG_ACPI_SLEEP=y
CONFIG_ACPI_SLEEP_PROC_FS=y
CONFIG_ACPI_AC=y
CONFIG_ACPI_BATTERY=y
CONFIG_ACPI_BUTTON=y
CONFIG_ACPI_FAN=y
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_THERMAL=y
CONFIG_ACPI_ASUS=y
CONFIG_ACPI_TOSHIBA=y
CONFIG_ACPI_BUS=y
CONFIG_ACPI_EC=y
CONFIG_ACPI_POWER=y
CONFIG_ACPI_PCI=y
CONFIG_ACPI_SYSTEM=y
CONFIG_ACPI_RELAXED_AML=y
CONFIG_APM=y
CONFIG_APM_DO_ENABLE=y
CONFIG_APM_DISPLAY_BLANK=y
CONFIG_APM_ALLOW_INTS=y
CONFIG_PCI=y
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_HOTPLUG=y
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_AOUT=y
CONFIG_BINFMT_MISC=y
CONFIG_FW_LOADER=y
CONFIG_BLK_DEV_LOOP=y
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_SIZE=64000
CONFIG_BLK_DEV_INITRD=y
CONFIG_IDE=y
CONFIG_BLK_DEV_IDE=y
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_IDEDISK_MULTI_MODE=y
CONFIG_IDEDISK_STROKE=y
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_IDEPCI_SHARE_IRQ=y
CONFIG_BLK_DEV_OFFBOARD=y
CONFIG_BLK_DEV_GENERIC=y
CONFIG_BLK_DEV_IDEDMA_PCI=y
CONFIG_IDEDMA_PCI_AUTO=y
CONFIG_IDEDMA_ONLYDISK=y
CONFIG_BLK_DEV_ADMA=y
CONFIG_BLK_DEV_SVWKS=y
CONFIG_BLK_DEV_IDEDMA=y
CONFIG_IDEDMA_AUTO=y
CONFIG_SCSI=y
CONFIG_SCSI_PROC_FS=y
CONFIG_BLK_DEV_SR=y
CONFIG_CHR_DEV_SG=y
CONFIG_SCSI_MULTI_LUN=y
CONFIG_SCSI_REPORT_LUNS=y
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_LOGGING=y
CONFIG_NET=y
CONFIG_UNIX=y
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_INET_ECN=y
CONFIG_SYN_COOKIES=y
CONFIG_IPV6_SCTP__=y
CONFIG_NETDEVICES=y
CONFIG_TIGON3=y
CONFIG_INPUT=y
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_PSAUX=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
CONFIG_INPUT_EVDEV=y
CONFIG_SOUND_GAMEPORT=y
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_NR_UARTS=4
CONFIG_SERIAL_CORE=y
CONFIG_UNIX98_PTYS=y
CONFIG_UNIX98_PTY_COUNT=256
CONFIG_NVRAM=y
CONFIG_RTC=y
CONFIG_VGA_CONSOLE=y
CONFIG_DUMMY_CONSOLE=y
CONFIG_USB=y
CONFIG_USB_DEVICEFS=y
CONFIG_USB_OHCI_HCD=y
CONFIG_USB_STORAGE=y
CONFIG_USB_STORAGE_DATAFAB=y
CONFIG_USB_STORAGE_FREECOM=y
CONFIG_USB_STORAGE_ISD200=y
CONFIG_USB_STORAGE_DPCM=y
CONFIG_USB_STORAGE_HP8200e=y
CONFIG_USB_STORAGE_SDDR09=y
CONFIG_USB_STORAGE_SDDR55=y
CONFIG_USB_STORAGE_JUMPSHOT=y
CONFIG_USB_HID=y
CONFIG_USB_HIDINPUT=y
CONFIG_USB_HIDDEV=y
CONFIG_EXT2_FS=y
CONFIG_EXT2_FS_XATTR=y
CONFIG_EXT2_FS_POSIX_ACL=y
CONFIG_EXT3_FS=y
CONFIG_EXT3_FS_XATTR=y
CONFIG_EXT3_FS_POSIX_ACL=y
CONFIG_JBD=y
CONFIG_JBD_DEBUG=y
CONFIG_FS_MBCACHE=y
CONFIG_REISERFS_FS=y
CONFIG_FS_POSIX_ACL=y
CONFIG_MINIX_FS=y
CONFIG_AUTOFS_FS=y
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
CONFIG_ZISOFS_FS=y
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_DEVPTS_FS=y
CONFIG_TMPFS=y
CONFIG_RAMFS=y
CONFIG_NFS_FS=y
CONFIG_NFS_V3=y
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y
CONFIG_SUNRPC=y
CONFIG_MSDOS_PARTITION=y
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_DEBUG_KERNEL=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_X86_EXTRA_IRQS=y
CONFIG_X86_FIND_SMP_CONFIG=y
CONFIG_X86_MPPARSE=y
CONFIG_CRC32=y
CONFIG_ZLIB_INFLATE=y
CONFIG_X86_SMP=y
CONFIG_X86_HT=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_X86_TRAMPOLINE=y
CONFIG_PC=y
--
USB is for mice, FireWire is for men!
sUse lINUX ag, nÜRNBERG
* Re: 2.6.0-test7 oops in proc_pid_stat
2003-10-09 13:04 2.6.0-test7 oops in proc_pid_stat Olaf Hering
@ 2003-10-09 22:04 ` Linus Torvalds
2003-10-10 6:50 ` 2.6.0-test7 DEBUG_PAGEALLOC oops Mike Galbraith
2003-10-10 7:28 ` 2.6.0-test7 oops in proc_pid_stat Olaf Hering
0 siblings, 2 replies; 18+ messages in thread
From: Linus Torvalds @ 2003-10-09 22:04 UTC (permalink / raw)
To: Olaf Hering, linux-kernel, Taner Halicioglu
Olaf Hering wrote:
>
> Linux version 2.6.0-test7 (olaf@zert152) (gcc version 3.2.2) #2 SMP Thu
> Oct 9 08:49:29 CEST 2003
>
> Unable to handle kernel NULL pointer dereference at virtual address 0000003c
Ok, this seems to be due to the move of the job control fields from
the task structure to the signal structure.
That looks like a bad idea, and the best thing to do is likely to just
revert the whole thing.
If you are a BK user, do a "bk changes" to find the ChangeSet that says
"[PATCH] move job control fields from task_struct to", and just do a
bk cset -xX.XXXX
where X.XXXX is the changeset number in your tree (that will depend on
exactly what else is in your tree).
Linus
* 2.6.0-test7 DEBUG_PAGEALLOC oops
2003-10-09 22:04 ` Linus Torvalds
@ 2003-10-10 6:50 ` Mike Galbraith
2003-10-10 16:52 ` Zwane Mwaikambo
2003-10-10 7:28 ` 2.6.0-test7 oops in proc_pid_stat Olaf Hering
1 sibling, 1 reply; 18+ messages in thread
From: Mike Galbraith @ 2003-10-10 6:50 UTC (permalink / raw)
To: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 93 bytes --]
Greetings,
Enabling page allocation debugging produced the attached repeatable oops.
-Mike
[-- Attachment #2: Type: text/plain, Size: 6798 bytes --]
Linux version 2.6.0-test7 (root@mikeg) (gcc version gcc-2.95.3 20010315 (release)) #118 Fri Oct 10 08:20:35 CEST 2003
Video mode to be used for restore is f00
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 0000000007ff0000 (usable)
BIOS-e820: 0000000007ff0000 - 0000000007ff3000 (ACPI NVS)
BIOS-e820: 0000000007ff3000 - 0000000008000000 (ACPI data)
BIOS-e820: 00000000ffff0000 - 0000000100000000 (reserved)
127MB LOWMEM available.
On node 0 totalpages: 32752
DMA zone: 4096 pages, LIFO batch:1
Normal zone: 28656 pages, LIFO batch:6
HighMem zone: 0 pages, LIFO batch:1
DMI 2.2 present.
Building zonelist for node : 0
Kernel command line: root=/dev/hda6 ro console=ttyS0,115200n8 console=tty0 apm=power-off elevator=as sb=220,5,0,6 mpu401=0x BOOT_IMAGE=260t7vir
Local APIC disabled by BIOS -- reenabling.
Found and enabled local APIC!
Initializing CPU#0
PID hash table entries: 512 (order 9: 4096 bytes)
Detected 499.509 MHz processor.
Console: colour VGA+ 80x25
Memory: 125276k/131008k available (1668k kernel code, 5196k reserved, 663k data, 296k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay loop... 985.08 BogoMIPS
Dentry cache hash table entries: 16384 (order: 4, 65536 bytes)
Inode-cache hash table entries: 8192 (order: 3, 32768 bytes)
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 512K
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU: Intel Pentium III (Katmai) stepping 03
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
enabled ExtINT on CPU#0
ESR value before enabling vector: 00000000
ESR value after enabling vector: 00000000
Using local APIC timer interrupts.
calibrating APIC timer ...
..... CPU clock speed is 499.0050 MHz.
..... host bus clock speed is 99.0809 MHz.
NET: Registered protocol family 16
PCI: PCI BIOS revision 2.10 entry at 0xfb1a0, last bus=1
PCI: Using configuration type 1
mtrr: v2.0 (20020519)
PCI: Probing PCI hardware
PCI: Probing PCI hardware (bus 00)
PCI: Using IRQ router VIA [1106/0596] at 0000:00:07.0
apm: BIOS version 1.2 Flags 0x07 (Driver version 1.16ac)
VFS: Disk quotas dquot_6.5.1
Activating ISA DMA hang workarounds.
pty: 256 Unix98 ptys configured
Real Time Clock Driver v1.12
Linux agpgart interface v0.100 (c) Dave Jones
agpgart: Detected VIA Apollo Pro 133 chipset
agpgart: Maximum main memory to use for agp memory: 94M
agpgart: AGP aperture is 64M @ 0xe0000000
[drm] Initialized r128 2.5.0 20030725 on minor 0
Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing enabled
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
Linux Tulip driver version 1.1.13 (May 11, 2002)
PCI: Found IRQ 9 for device 0000:00:0f.0
PCI: Sharing IRQ 9 with 0000:00:0c.0
eth0: ADMtek Comet rev 17 at 0xc8823000, 00:04:5A:64:94:34, IRQ 9.
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: IDE controller at PCI slot 0000:00:07.1
VP_IDE: chipset revision 6
VP_IDE: not 100% native mode: will probe irqs later
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: VIA vt82c596b (rev 11) IDE UDMA66 controller on pci0000:00:07.1
ide0: BM-DMA at 0xe000-0xe007, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0xe008-0xe00f, BIOS settings: hdc:DMA, hdd:pio
hda: IBM-DJNA-352030, ATA DISK drive
Using anticipatory io scheduler
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hdc: MATSHITADVD-ROM SR-8583A, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
hda: max request size: 128KiB
hda: 39876480 sectors (20416 MB) w/1966KiB Cache, CHS=39560/16/63
hda: hda1 hda2 hda3 < hda5 hda6 hda7 hda8 >
end_request: I/O error, dev hdc, sector 0
hdc: ATAPI 32X DVD-ROM drive, 512kB Cache, DMA
Uniform CD-ROM driver Revision: 3.12
mice: PS/2 mouse device common for all mice
input: PS/2 Generic Mouse on isa0060/serio1
serio: i8042 AUX port at 0x60,0x64 irq 12
input: AT Translated Set 2 keyboard on isa0060/serio0
serio: i8042 KBD port at 0x60,0x64 irq 1
i2c /dev entries driver
Advanced Linux Sound Architecture Driver Version 0.9.7 (Thu Sep 25 19:16:36 2003 UTC).
PCI: Found IRQ 9 for device 0000:00:0c.0
PCI: Sharing IRQ 9 with 0000:00:0f.0
ALSA device list:
#0: Yamaha DS-XG PCI (YMF740C) at 0xec000000, irq 9
NET: Registered protocol family 2
IP: routing cache hash table of 512 buckets, 4Kbytes
TCP: Hash tables configured (established 8192 bind 8192)
NET: Registered protocol family 1
NET: Registered protocol family 17
kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 296k freed
Adding 265064k swap on /dev/hda2. Priority:2 extents:1
blk: queue c7af4df8, I/O limit 4095Mb (mask 0xffffffff)
EXT3 FS on hda6, internal journal
kjournald starting. Commit interval 5 seconds
EXT3 FS on hda7, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS on hda8, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
Unable to handle kernel paging request at virtual address c034a000
printing eip:
c0134d5a
*pde = 00102027
*pte = 0034a000
Oops: 0000 [#1]
CPU: 0
EIP: 0060:[<c0134d5a>] Not tainted
EFLAGS: 00010002
EIP is at store_stackinfo+0x4e/0x80
eax: 00000000 ebx: c7802f98 ecx: c0301390 edx: c030138c
esi: c0349ffe edi: 017e0008 ebp: c0349da6 esp: c0349d96
ds: 007b es: 007b ss: 0068
Process swapper (pid: 0, threadinfo=c0348000 task=c02fcbe0)
Stack: c78371dc c11731d8 c7802000 00000060 c0349dd6 c0136aa8 c11731d8 c7802000
0000006b c7802f58 c7af4df8 c7fec428 00001000 c0131b6c c7d2ef78 00000086
c0349de6 c0131b6c c11731d8 c7802f58 c0349e02 c0131b3e c7802f58 c11731d8
Call Trace:
[<c0136aa8>] kmem_cache_free+0x218/0x294
[<c0131b6c>] mempool_free_slab+0x10/0x14
[<c0131b6c>] mempool_free_slab+0x10/0x14
[<c0131b3e>] mempool_free+0x7a/0x84
[<c01e3984>] __blk_put_request+0x74/0x88
[<c01e4716>] end_that_request_last+0x62/0x7c
[<c01f0be3>] ide_end_request+0xf3/0x124
[<c01f9f68>] default_end_request+0x14/0x18
[<c0201df8>] ide_dma_intr+0x60/0x98
[<c01f2024>] ide_intr+0x108/0x17c
[<c0201d98>] ide_dma_intr+0x0/0x98
[<c010a423>] handle_IRQ_event+0x2b/0x58
[<c010a70e>] do_IRQ+0x92/0x130
[<c0109014>] common_interrupt+0x18/0x20
[<c010a423>]
* Re: 2.6.0-test7 oops in proc_pid_stat
2003-10-09 22:04 ` Linus Torvalds
2003-10-10 6:50 ` 2.6.0-test7 DEBUG_PAGEALLOC oops Mike Galbraith
@ 2003-10-10 7:28 ` Olaf Hering
1 sibling, 0 replies; 18+ messages in thread
From: Olaf Hering @ 2003-10-10 7:28 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel, Taner Halicioglu
On Thu, Oct 09, Linus Torvalds wrote:
> Olaf Hering wrote:
> >
> > Linux version 2.6.0-test7 (olaf@zert152) (gcc version 3.2.2) #2 SMP Thu
> > Oct 9 08:49:29 CEST 2003
> >
> > Unable to handle kernel NULL pointer dereference at virtual address 0000003c
>
> Ok, this seems to be due to the move of the job control fields from
> the task structure to the signal structure.
>
> That looks like a bad idea, and the best thing to do is likely to just
> revert the whole thing.
I have reverted these two patches and fixed the reject in sched.h; no
oops in the 9 hours since.
# 03/10/05 akpm@osdl.org[torvalds] 1.1451.4.16
# [PATCH] move job control fields from task_struct to
# 03/10/05 akpm@osdl.org[torvalds] 1.1451.4.17
# [PATCH] fix "compat ioctl consolidation" for "move job
--
USB is for mice, FireWire is for men!
sUse lINUX ag, nÜRNBERG
* Re: 2.6.0-test7 DEBUG_PAGEALLOC oops
2003-10-10 6:50 ` 2.6.0-test7 DEBUG_PAGEALLOC oops Mike Galbraith
@ 2003-10-10 16:52 ` Zwane Mwaikambo
2003-10-11 7:01 ` Mike Galbraith
0 siblings, 1 reply; 18+ messages in thread
From: Zwane Mwaikambo @ 2003-10-10 16:52 UTC (permalink / raw)
To: Mike Galbraith; +Cc: linux-kernel
On Fri, 10 Oct 2003, Mike Galbraith wrote:
> Greetings,
>
> Enabling page allocation debugging produced the attached repeatable oops.
There is an open bugzilla for this, i'd appreciate it if you could follow
up there.
Thanks
http://bugzilla.kernel.org/show_bug.cgi?id=973
* Re: 2.6.0-test7 DEBUG_PAGEALLOC oops
2003-10-10 16:52 ` Zwane Mwaikambo
@ 2003-10-11 7:01 ` Mike Galbraith
2003-10-11 7:03 ` Zwane Mwaikambo
0 siblings, 1 reply; 18+ messages in thread
From: Mike Galbraith @ 2003-10-11 7:01 UTC (permalink / raw)
To: Zwane Mwaikambo; +Cc: linux-kernel
At 12:52 PM 10/10/2003 -0400, Zwane Mwaikambo wrote:
>On Fri, 10 Oct 2003, Mike Galbraith wrote:
>
> > Greetings,
> >
> > Enabling page allocation debugging produced the attached repeatable oops.
>
>There is an open bugzilla for this, i'd appreciate it if you could follow
>up there.
>
>Thanks
>
>http://bugzilla.kernel.org/show_bug.cgi?id=973
403.
(i'll go poke around the source instead)
-Mike
* Re: 2.6.0-test7 DEBUG_PAGEALLOC oops
2003-10-11 7:01 ` Mike Galbraith
@ 2003-10-11 7:03 ` Zwane Mwaikambo
0 siblings, 0 replies; 18+ messages in thread
From: Zwane Mwaikambo @ 2003-10-11 7:03 UTC (permalink / raw)
To: Mike Galbraith; +Cc: linux-kernel
On Sat, 11 Oct 2003, Mike Galbraith wrote:
> At 12:52 PM 10/10/2003 -0400, Zwane Mwaikambo wrote:
> >http://bugzilla.kernel.org/show_bug.cgi?id=973
>
> 403.
>
> (i'll go poke around the source instead)
Sorry, someone broke the kernel.org bugzilla URL, but you can still access
it via:
http://bugme.osdl.org/show_bug.cgi?id=973
* Re: 2.6.0-test7 DEBUG_PAGEALLOC oops
2003-10-12 6:58 ` Manfred Spraul
2003-10-12 8:52 ` Mike Galbraith
@ 2003-10-12 22:36 ` Thomas Molina
1 sibling, 0 replies; 18+ messages in thread
From: Thomas Molina @ 2003-10-12 22:36 UTC (permalink / raw)
To: Manfred Spraul; +Cc: Mike Galbraith, Zwane Mwaikambo, linux-kernel
[-- Attachment #1: Type: TEXT/PLAIN, Size: 361 bytes --]
On Sun, 12 Oct 2003, Manfred Spraul wrote:
> Could you try the attached patch?
> It updates the end of stack detection to handle unaligned stacks.
I've attached a rediff of your patch against test7 bitkeeper. It has the
kstack_end function moved up above ASSEMBLY as suggested by Mike. I've
tested this version and it works for me (tm). Thanks a bunch.
[-- Attachment #2: Type: TEXT/PLAIN, Size: 2004 bytes --]
diff -ur linux-2.5-tma/arch/i386/kernel/traps.c linux-2.5-tm/arch/i386/kernel/traps.c
--- linux-2.5-tma/arch/i386/kernel/traps.c 2003-10-12 16:16:56.000000000 -0400
+++ linux-2.5-tm/arch/i386/kernel/traps.c 2003-10-12 15:05:48.000000000 -0400
@@ -104,7 +104,7 @@
#ifdef CONFIG_KALLSYMS
printk("\n");
#endif
- while (((long) stack & (THREAD_SIZE-1)) != 0) {
+ while (!kstack_end(stack)) {
addr = *stack++;
if (kernel_text_address(addr)) {
printk(" [<%08lx>] ", addr);
@@ -138,7 +138,7 @@
stack = esp;
for(i = 0; i < kstack_depth_to_print; i++) {
- if (((long) stack & (THREAD_SIZE-1)) == 0)
+ if (kstack_end(stack))
break;
if (i && ((i % 8) == 0))
printk("\n ");
diff -ur linux-2.5-tma/include/asm-i386/thread_info.h linux-2.5-tm/include/asm-i386/thread_info.h
--- linux-2.5-tma/include/asm-i386/thread_info.h 2003-10-12 16:16:28.000000000 -0400
+++ linux-2.5-tm/include/asm-i386/thread_info.h 2003-10-12 15:19:09.000000000 -0400
@@ -92,6 +92,16 @@
#define get_thread_info(ti) get_task_struct((ti)->task)
#define put_thread_info(ti) put_task_struct((ti)->task)
+static inline int kstack_end(void *addr)
+{
+ unsigned long offset = (unsigned long)addr & (THREAD_SIZE-1);
+
+ /* Some APM bios versions misalign the stack */
+ if (offset == 0 || offset > (THREAD_SIZE-sizeof(void*)))
+ return 1;
+ return 0;
+}
+
#else /* !__ASSEMBLY__ */
/* how to get the thread information struct from ASM */
diff -ur linux-2.5-tma/mm/slab.c linux-2.5-tm/mm/slab.c
--- linux-2.5-tma/mm/slab.c 2003-10-12 16:24:14.000000000 -0400
+++ linux-2.5-tm/mm/slab.c 2003-10-12 15:05:48.000000000 -0400
@@ -862,7 +862,7 @@
unsigned long *sptr = &caller;
unsigned long svalue;
- while (((long) sptr & (THREAD_SIZE-1)) != 0) {
+ while (!kstack_end(sptr)) {
svalue = *sptr++;
if (kernel_text_address(svalue)) {
*addr++=svalue;
Only in linux-2.5-tm: stack.patch
* Re: 2.6.0-test7 DEBUG_PAGEALLOC oops
2003-10-12 12:08 ` Thomas Molina
@ 2003-10-12 14:13 ` Thomas Molina
0 siblings, 0 replies; 18+ messages in thread
From: Thomas Molina @ 2003-10-12 14:13 UTC (permalink / raw)
To: Mike Galbraith; +Cc: Manfred Spraul, Zwane Mwaikambo, linux-kernel
On Sun, 12 Oct 2003, Thomas Molina wrote:
> On Sun, 12 Oct 2003, Mike Galbraith wrote:
>
> > At 08:58 AM 10/12/2003 +0200, Manfred Spraul wrote:
> > >Could you try the attached patch?
> > >It updates the end of stack detection to handle unaligned stacks.
> >
> > Works fine. (modulo moving kstack_end above ASSEMBLY)
>
> I'm the one with bugzilla 973. I'm trying the patch with a source tree
> synced up from bk this morning and having a few problems. My in-laws are
> visiting today, so my work on this will be intermittent. I am interested,
> however.
Note to self: next time read the whole message, including the part in
parentheses. The patch, modulo Mike's modulo (move the function where I
was told to move the function), does indeed work fine.
Testing continues, but thanks!
* Re: 2.6.0-test7 DEBUG_PAGEALLOC oops
2003-10-12 8:52 ` Mike Galbraith
@ 2003-10-12 12:08 ` Thomas Molina
2003-10-12 14:13 ` Thomas Molina
0 siblings, 1 reply; 18+ messages in thread
From: Thomas Molina @ 2003-10-12 12:08 UTC (permalink / raw)
To: Mike Galbraith; +Cc: Manfred Spraul, Zwane Mwaikambo, linux-kernel
On Sun, 12 Oct 2003, Mike Galbraith wrote:
> At 08:58 AM 10/12/2003 +0200, Manfred Spraul wrote:
> >Could you try the attached patch?
> >It updates the end of stack detection to handle unaligned stacks.
>
> Works fine. (modulo moving kstack_end above ASSEMBLY)
I'm the one with bugzilla 973. I'm trying the patch with a source tree
synced up from bk this morning and having a few problems. My in-laws are
visiting today, so my work on this will be intermittent. I am interested,
however.
* Re: 2.6.0-test7 DEBUG_PAGEALLOC oops
2003-10-12 6:58 ` Manfred Spraul
@ 2003-10-12 8:52 ` Mike Galbraith
2003-10-12 12:08 ` Thomas Molina
2003-10-12 22:36 ` Thomas Molina
1 sibling, 1 reply; 18+ messages in thread
From: Mike Galbraith @ 2003-10-12 8:52 UTC (permalink / raw)
To: Manfred Spraul; +Cc: Zwane Mwaikambo, linux-kernel
At 08:58 AM 10/12/2003 +0200, Manfred Spraul wrote:
>Could you try the attached patch?
>It updates the end of stack detection to handle unaligned stacks.
Works fine. (modulo moving kstack_end above ASSEMBLY)
Thanks,
-Mike
* Re: 2.6.0-test7 DEBUG_PAGEALLOC oops
2003-10-12 5:11 ` Mike Galbraith
@ 2003-10-12 6:58 ` Manfred Spraul
2003-10-12 8:52 ` Mike Galbraith
2003-10-12 22:36 ` Thomas Molina
0 siblings, 2 replies; 18+ messages in thread
From: Manfred Spraul @ 2003-10-12 6:58 UTC (permalink / raw)
To: Mike Galbraith; +Cc: Zwane Mwaikambo, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 116 bytes --]
Could you try the attached patch?
It updates the end of stack detection to handle unaligned stacks.
--
Manfred
[-- Attachment #2: patch-end-of-stack --]
[-- Type: text/plain, Size: 1625 bytes --]
// $Header$
// Kernel Version:
// VERSION = 2
// PATCHLEVEL = 6
// SUBLEVEL = 0
// EXTRAVERSION = -test7
--- 2.6/include/asm-i386/thread_info.h 2003-10-09 21:20:00.000000000 +0200
+++ build-2.6/include/asm-i386/thread_info.h 2003-10-12 08:50:12.000000000 +0200
@@ -101,6 +101,16 @@
#endif
+static inline int kstack_end(void *addr)
+{
+ unsigned long offset = (unsigned long)addr & (THREAD_SIZE-1);
+
+ /* Some APM bios versions misalign the stack */
+ if (offset == 0 || offset > (THREAD_SIZE-sizeof(void*)))
+ return 1;
+ return 0;
+}
+
/*
* thread information flags
* - these are process state flags that various assembly files may need to access
--- 2.6/mm/slab.c 2003-10-09 21:23:19.000000000 +0200
+++ build-2.6/mm/slab.c 2003-10-12 08:51:13.000000000 +0200
@@ -862,7 +862,7 @@
unsigned long *sptr = &caller;
unsigned long svalue;
- while (((long) sptr & (THREAD_SIZE-1)) != 0) {
+ while (!kstack_end(sptr)) {
svalue = *sptr++;
if (kernel_text_address(svalue)) {
*addr++=svalue;
--- 2.6/arch/i386/kernel/traps.c 2003-10-09 21:23:03.000000000 +0200
+++ build-2.6/arch/i386/kernel/traps.c 2003-10-12 08:50:41.000000000 +0200
@@ -104,7 +104,7 @@
#ifdef CONFIG_KALLSYMS
printk("\n");
#endif
- while (((long) stack & (THREAD_SIZE-1)) != 0) {
+ while (!kstack_end(stack)) {
addr = *stack++;
if (kernel_text_address(addr)) {
printk(" [<%08lx>] ", addr);
@@ -138,7 +138,7 @@
stack = esp;
for(i = 0; i < kstack_depth_to_print; i++) {
- if (((long) stack & (THREAD_SIZE-1)) == 0)
+ if (kstack_end(stack))
break;
if (i && ((i % 8) == 0))
printk("\n ");
* Re: 2.6.0-test7 DEBUG_PAGEALLOC oops
2003-10-11 17:34 ` Manfred Spraul
@ 2003-10-12 5:11 ` Mike Galbraith
2003-10-12 6:58 ` Manfred Spraul
0 siblings, 1 reply; 18+ messages in thread
From: Mike Galbraith @ 2003-10-12 5:11 UTC (permalink / raw)
To: Manfred Spraul; +Cc: Zwane Mwaikambo, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 566 bytes --]
At 07:34 PM 10/11/2003 +0200, Manfred Spraul wrote:
>Mike Galbraith wrote:
>
>>
>>Ok, you want do_IRQ assembler, correct?
>
>No - I need the function that was interrupted by common_interrupt.
Aha. (/me sees light [a wee bit late])
>I found only one valid function pointer in the stack dump above
>common_interrupt:
>
>0xc0112a13, EBP=0xc0349f88
>
>Could you look it up in your System.map?
It's in apm_bios_call_simple()...
>Which power management do you use? apm or acpi?
...and booting with apm=off cures the explosions. (uhoh... crud bios?)
-Mike
[-- Attachment #2: Type: text/plain, Size: 5305 bytes --]
00000134 <apm_bios_call_simple>:
134: 55 push %ebp
135: b8 00 e0 ff ff mov $0xffffe000,%eax
13a: 89 e5 mov %esp,%ebp
13c: 21 e0 and %esp,%eax
13e: 83 ec 1c sub $0x1c,%esp
141: 57 push %edi
142: 56 push %esi
143: 53 push %ebx
}
/**
* apm_bios_call_simple - make a simple APM BIOS 32bit call
* @func: APM function to invoke
* @ebx_in: EBX register value for BIOS call
* @ecx_in: ECX register value for BIOS call
* @eax: EAX register on return from the BIOS call
*
* Make a BIOS call that does only returns one value, or just status.
* If there is an error, then the error code is returned in AH
* (bits 8-15 of eax) and this function returns non-zero. This is
* used for simpler BIOS operations. This call may hold interrupts
* off for a long time on some laptops.
*/
static u8 apm_bios_call_simple(u32 func, u32 ebx_in, u32 ecx_in, u32 *eax)
{
144: 8b 5d 0c mov 0xc(%ebp),%ebx
147: 8b 4d 10 mov 0x10(%ebp),%ecx
u8 error;
APM_DECL_SEGS
unsigned long flags;
cpumask_t cpus;
int cpu;
struct desc_struct save_desc_40;
cpus = apm_save_cpus();
cpu = get_cpu();
14a: ff 40 14 incl 0x14(%eax)
save_desc_40 = cpu_gdt_table[cpu][0x40 / 8];
14d: a1 40 00 00 00 mov 0x40,%eax
152: 8b 15 44 00 00 00 mov 0x44,%edx
158: 89 45 f0 mov %eax,0xfffffff0(%ebp)
15b: 89 55 f4 mov %edx,0xfffffff4(%ebp)
cpu_gdt_table[cpu][0x40 / 8] = bad_bios_desc;
15e: a1 34 00 00 00 mov 0x34,%eax
163: 8b 15 38 00 00 00 mov 0x38,%edx
169: a3 40 00 00 00 mov %eax,0x40
16e: 89 15 44 00 00 00 mov %edx,0x44
local_save_flags(flags);
174: 9c pushf
175: 8f 45 ec popl 0xffffffec(%ebp)
APM_DO_CLI;
178: 83 3d 20 00 00 00 00 cmpl $0x0,0x20
17f: 74 03 je 184 <apm_bios_call_simple+0x50>
181: fb sti
182: eb 01 jmp 185 <apm_bios_call_simple+0x51>
184: fa cli
APM_DO_SAVE_SEGS;
185: 8c 65 fc movl %fs,0xfffffffc(%ebp)
188: 8c 6d f8 movl %gs,0xfffffff8(%ebp)
/*
* N.B. We do NOT need a cld after the BIOS call
* because we always save and restore the flags.
*/
__asm__ __volatile__(APM_DO_ZERO_SEGS
18b: 8b 45 08 mov 0x8(%ebp),%eax
18e: 1e push %ds
18f: 06 push %es
190: 31 d2 xor %edx,%edx
192: 8e da mov %edx,%ds
194: 8e c2 mov %edx,%es
196: 8e e2 mov %edx,%fs
198: 8e ea mov %edx,%gs
19a: 57 push %edi
19b: 55 push %ebp
19c: 2e ff 1d 20 00 00 00 lcall *%cs:0x20
1a3: 0f 92 c3 setb %bl <== 0xc0112a13 is HERE
1a6: 5d pop %ebp
1a7: 5f pop %edi
1a8: 07 pop %es
1a9: 1f pop %ds
1aa: 89 45 e8 mov %eax,0xffffffe8(%ebp)
1ad: 8b 45 14 mov 0x14(%ebp),%eax
1b0: 8b 55 e8 mov 0xffffffe8(%ebp),%edx
1b3: 89 10 mov %edx,(%eax)
error = apm_bios_call_simple_asm(func, ebx_in, ecx_in, eax);
APM_DO_RESTORE_SEGS;
1b5: 8e 65 fc movl 0xfffffffc(%ebp),%fs
1b8: 8e 6d f8 movl 0xfffffff8(%ebp),%gs
local_irq_restore(flags);
1bb: ff 75 ec pushl 0xffffffec(%ebp)
1be: 9d popf
cpu_gdt_table[smp_processor_id()][0x40 / 8] = save_desc_40;
1bf: 8b 45 f0 mov 0xfffffff0(%ebp),%eax
1c2: 8b 55 f4 mov 0xfffffff4(%ebp),%edx
1c5: a3 40 00 00 00 mov %eax,0x40
1ca: 89 15 44 00 00 00 mov %edx,0x44
/* how to get the thread information struct from C */
static inline struct thread_info *current_thread_info(void)
{
struct thread_info *ti;
__asm__("andl %%esp,%0; ":"=r" (ti) : "0" (~8191UL));
1d0: b8 00 e0 ff ff mov $0xffffe000,%eax
1d5: 21 e0 and %esp,%eax
put_cpu();
1d7: ff 48 14 decl 0x14(%eax)
#endif
static __inline__ int constant_test_bit(int nr, const volatile unsigned long * addr)
{
return ((1UL << (nr & 31)) & (((const volatile unsigned int *) addr)[nr >> 5])) != 0;
1da: 8b 40 08 mov 0x8(%eax),%eax
1dd: a8 08 test $0x8,%al
1df: 74 05 je 1e6 <apm_bios_call_simple+0xb2>
1e1: e8 fc ff ff ff call 1e2 <apm_bios_call_simple+0xae>
apm_restore_cpus(cpus);
return error;
1e6: 0f b6 c3 movzbl %bl,%eax
1e9: 5b pop %ebx
1ea: 5e pop %esi
1eb: 5f pop %edi
1ec: 89 ec mov %ebp,%esp
1ee: 5d pop %ebp
1ef: c3 ret
* Re: 2.6.0-test7 DEBUG_PAGEALLOC oops
2003-10-11 15:52 ` Mike Galbraith
@ 2003-10-11 17:34 ` Manfred Spraul
2003-10-12 5:11 ` Mike Galbraith
0 siblings, 1 reply; 18+ messages in thread
From: Manfred Spraul @ 2003-10-11 17:34 UTC (permalink / raw)
To: Mike Galbraith; +Cc: Zwane Mwaikambo, linux-kernel
Mike Galbraith wrote:
>
> Ok, you want do_IRQ assembler, correct?
No - I need the function that was interrupted by common_interrupt.
I found only one valid function pointer in the stack dump above
common_interrupt:
0xc0112a13, EBP=0xc0349f88
Could you look it up in your System.map?
Which power management do you use? apm or acpi?
--
Manfred
* Re: 2.6.0-test7 DEBUG_PAGEALLOC oops
2003-10-11 12:06 ` Manfred Spraul
@ 2003-10-11 15:52 ` Mike Galbraith
2003-10-11 17:34 ` Manfred Spraul
0 siblings, 1 reply; 18+ messages in thread
From: Mike Galbraith @ 2003-10-11 15:52 UTC (permalink / raw)
To: Manfred Spraul; +Cc: Zwane Mwaikambo, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 721 bytes --]
At 02:06 PM 10/11/2003 +0200, Manfred Spraul wrote:
>I'd increase kstack_depth_to_print to 140. Do not increase it too much,
>otherwise it will oops due to the misaligned stack.
>Then check the EBP values: They are pushed after the return address. The
>return addresses are listed in the Call Trace section.
>Example:
>0xc0136aa8 pushes 0xc0349dd6 -> odd.
>0xc0131b6c pushes 0xc0349de6 -> odd.
>
>0xc0131b3e pushes c0349e02 -> odd.
>
>Proper values for EBP are multiples of 4. Once you find where the stack got
>misaligned, disassemble the offending function (or send me the .o file)
Ok, you want do_IRQ assembler, correct?
fwiw, building with gcc-3.3 didn't help, nor did disabling frame pointers.
-Mike
[-- Attachment #2: Type: text/plain, Size: 7563 bytes --]
Unable to handle kernel paging request at virtual address c034a000
printing eip:
c0134d5a
*pde = 00102027
*pte = 0034a000
Oops: 0000 [#1]
CPU: 0
EIP: 0060:[<c0134d5a>] Not tainted
EFLAGS: 00010002
EIP is at store_stackinfo+0x4e/0x80
eax: 00000000 ebx: c7436f88 ecx: c0301390 edx: c030138c
esi: c0349ffe edi: 017e0008 ebp: c0349d46 esp: c0349d36
ds: 007b es: 007b ss: 0068
Process swapper (pid: 0, threadinfo=c0348000 task=c02fcbe0)
Stack: c74b59c4 c1173488 c7436000 00000070 c0349d76 c0136aa8 c1173488 c7436000
0000006b c72b1bb4 c72b1bb4 c7fecafc 00001000 c0131b6c c7f75f78 00000086
c0349d86 c0131b6c c1173488 c7436f38 c0349da2 c0131b3e c7436f38 c1173488
c72b1bb4 c72b1bb4 00000001 c0349db6 c014cf76 c7436f38 c7fecafc c7436f2c
c0349dc2 c014d158 c72b1bb4 c0349dda c016548a c72b1bb4 c72b1bb4 00000000
c04a070c c0349df6 c014d830 c72b1bb4 00010000 00000000 00010000 c72b1bb4
c0349e26 c01e45af c72b1bb4 00010000 00000000 c7467f58 00000080 c04a070c
00000000 00000000 00000000 00000000 c0349e3a c01e4697 c7467f58 00000001
00010000 c0349e62 c01f0b7d c7467f58 00000001 00000080 c04a070c 00000000
c04a070c 00000001 00000296 c0349e76 c01f9f68 c04a070c 00000001 00000080
c0349e96 c0201df8 c04a070c 00000001 00000080 c0348000 c7af5ef8 50005ef8
c0349eba c01f2024 c04a070c c7b2ca48 04000001 00000000 c0201d98 c04a0660
00000292 c0349eda c010a423 0000000e c7af5ef8 c0349f0a c0348000 c0348000
0000000e c0349f02 c010a70e 0000000e c0349f0a c7b2ca48 00000000 00000010
c0348000 c7b2ca48 c0346bc0 000000c0 c0109014 00000000 00000000 00000000
00000010 c0348000 000000c0 00005305 9f280000 9f280000 ffffff0e 00008026
000000b8 00000216 803600c0 9f8800b8 2a13c034 0060c011 9f880000 8000c034
007bc034 007b0000 9c2c0000 0010fffb
Call Trace:
[<c0136aa8>] kmem_cache_free+0x218/0x294
[<c0131b6c>] mempool_free_slab+0x10/0x14
[<c0131b6c>] mempool_free_slab+0x10/0x14
[<c0131b3e>] mempool_free+0x7a/0x84
[<c014cf76>] bio_destructor+0x36/0x4c
[<c014d158>] bio_put+0x2c/0x30
[<c016548a>] mpage_end_io_read+0x6a/0x78
[<c014d830>] bio_endio+0x50/0x5c
[<c01e45af>] __end_that_request_first+0xef/0x1c0
[<c01e4697>] end_that_request_first+0x17/0x1c
[<c01f0b7d>] ide_end_request+0x8d/0x124
[<c01f9f68>] default_end_request+0x14/0x18
[<c0201df8>] ide_dma_intr+0x60/0x98
[<c01f2024>] ide_intr+0x108/0x17c
[<c0201d98>] ide_dma_intr+0x0/0x98
[<c010a423>] handle_IRQ_event+0x2b/0x58
[<c010a70e>] do_IRQ+0x92/0x130
[<c0109014>] common_interrupt+0x18/0x20
000004ec <do_IRQ>:
4ec: 55 push %ebp
4ed: 89 e5 mov %esp,%ebp
4ef: 83 ec 08 sub $0x8,%esp
4f2: 57 push %edi
4f3: 56 push %esi
4f4: 53 push %ebx
4f5: be 00 e0 ff ff mov $0xffffe000,%esi
4fa: 0f b6 7d 2c movzbl 0x2c(%ebp),%edi
4fe: 21 e6 and %esp,%esi
500: 89 fb mov %edi,%ebx
502: c1 e3 05 shl $0x5,%ebx
505: 8d 83 00 00 00 00 lea 0x0(%ebx),%eax
50b: 89 45 fc mov %eax,0xfffffffc(%ebp)
50e: 81 46 14 00 00 01 00 addl $0x10000,0x14(%esi)
515: ff 04 bd 1c 00 00 00 incl 0x1c(,%edi,4)
51c: ff 46 14 incl 0x14(%esi)
51f: 8b 83 04 00 00 00 mov 0x4(%ebx),%eax
525: 57 push %edi
526: 8b 40 14 mov 0x14(%eax),%eax
529: ff d0 call *%eax
52b: 8b 83 00 00 00 00 mov 0x0(%ebx),%eax
531: 83 c4 04 add $0x4,%esp
534: 24 d7 and $0xd7,%al
536: c7 45 f8 00 00 00 00 movl $0x0,0xfffffff8(%ebp)
53d: 0c 04 or $0x4,%al
53f: a8 03 test $0x3,%al
541: 75 0d jne 550 <do_IRQ+0x64>
543: 8b 93 08 00 00 00 mov 0x8(%ebx),%edx
549: 24 fb and $0xfb,%al
54b: 89 55 f8 mov %edx,0xfffffff8(%ebp)
54e: 0c 01 or $0x1,%al
550: 89 83 00 00 00 00 mov %eax,0x0(%ebx)
556: 83 7d f8 00 cmpl $0x0,0xfffffff8(%ebp)
55a: 74 60 je 5bc <do_IRQ+0xd0>
55c: 89 f3 mov %esi,%ebx
55e: 89 f6 mov %esi,%esi
560: ff 4b 14 decl 0x14(%ebx)
563: 8b 43 08 mov 0x8(%ebx),%eax
566: a8 08 test $0x8,%al
568: 74 06 je 570 <do_IRQ+0x84>
56a: e8 fc ff ff ff call 56b <do_IRQ+0x7f>
56f: 90 nop
570: 8b 45 f8 mov 0xfffffff8(%ebp),%eax
573: 8d 55 08 lea 0x8(%ebp),%edx
576: 50 push %eax
577: 52 push %edx
578: 57 push %edi
579: e8 fc ff ff ff call 57a <do_IRQ+0x8e>
57e: 83 c4 0c add $0xc,%esp
581: ff 43 14 incl 0x14(%ebx)
584: 83 3d 04 00 00 00 00 cmpl $0x0,0x4
58b: 75 13 jne 5a0 <do_IRQ+0xb4>
58d: 50 push %eax
58e: 8b 45 fc mov 0xfffffffc(%ebp),%eax
591: 50 push %eax
592: 57 push %edi
593: e8 e0 fd ff ff call 378 <note_interrupt>
598: 83 c4 0c add $0xc,%esp
59b: 90 nop
59c: 8d 74 26 00 lea 0x0(%esi,1),%esi
5a0: 8b 55 fc mov 0xfffffffc(%ebp),%edx
5a3: 8b 02 mov (%edx),%eax
5a5: 89 c2 mov %eax,%edx
5a7: a8 04 test $0x4,%al
5a9: 74 0a je 5b5 <do_IRQ+0xc9>
5ab: 83 e2 fb and $0xfffffffb,%edx
5ae: 8b 45 fc mov 0xfffffffc(%ebp),%eax
5b1: 89 10 mov %edx,(%eax)
5b3: eb ab jmp 560 <do_IRQ+0x74>
5b5: 24 fe and $0xfe,%al
5b7: 8b 55 fc mov 0xfffffffc(%ebp),%edx
5ba: 89 02 mov %eax,(%edx)
5bc: 8b 55 fc mov 0xfffffffc(%ebp),%edx
5bf: 8b 42 04 mov 0x4(%edx),%eax
5c2: 57 push %edi
5c3: 8b 40 18 mov 0x18(%eax),%eax
5c6: ff d0 call *%eax
5c8: 83 c4 04 add $0x4,%esp
5cb: bb 00 e0 ff ff mov $0xffffe000,%ebx
5d0: 21 e3 and %esp,%ebx
5d2: ff 4b 14 decl 0x14(%ebx)
5d5: 8b 43 08 mov 0x8(%ebx),%eax
5d8: a8 08 test $0x8,%al
5da: 74 05 je 5e1 <do_IRQ+0xf5>
5dc: e8 fc ff ff ff call 5dd <do_IRQ+0xf1>
5e1: 8b 43 14 mov 0x14(%ebx),%eax
5e4: 05 01 00 ff ff add $0xffff0001,%eax
5e9: 89 43 14 mov %eax,0x14(%ebx)
5ec: a9 00 ff ff 00 test $0xffff00,%eax
5f1: 75 0e jne 601 <do_IRQ+0x115>
5f3: 83 3d 00 00 00 00 00 cmpl $0x0,0x0
5fa: 74 05 je 601 <do_IRQ+0x115>
5fc: e8 fc ff ff ff call 5fd <do_IRQ+0x111>
601: b8 00 e0 ff ff mov $0xffffe000,%eax
606: 21 e0 and %esp,%eax
608: ff 48 14 decl 0x14(%eax)
60b: b8 01 00 00 00 mov $0x1,%eax
610: 8d 65 ec lea 0xffffffec(%ebp),%esp
613: 5b pop %ebx
614: 5e pop %esi
615: 5f pop %edi
616: 89 ec mov %ebp,%esp
618: 5d pop %ebp
619: c3 ret
* Re: 2.6.0-test7 DEBUG_PAGEALLOC oops
2003-10-11 11:15 ` Mike Galbraith
@ 2003-10-11 12:06 ` Manfred Spraul
2003-10-11 15:52 ` Mike Galbraith
0 siblings, 1 reply; 18+ messages in thread
From: Manfred Spraul @ 2003-10-11 12:06 UTC (permalink / raw)
To: Mike Galbraith; +Cc: Zwane Mwaikambo, linux-kernel
Mike Galbraith wrote:
>>
>>> eax: 00000000 ebx: c7802f98 ecx: c0301390 edx: c030138c
>>> esi: c0349ffe edi: 017e0008 ebp: c0349da6 esp: c0349d96
>>> ds: 007b es: 007b ss: 0068
>>> Process swapper (pid: 0, threadinfo=c0348000 task=c02fcbe0)
>>
>> The esp value is sane, the stack is at 0xc0348000, and the fault is
>> at 'a000: just behind the end of the stack.
>
I'm blind. The esp value is the culprit:
It's not 32-bit aligned. Someone misaligned the stack, and thus
if(stack_ptr & (THREAD_SIZE-1))
didn't notice the end of the stack.
The generated assembly of store_stackinfo is correct:
1d2: f7 c6 ff 1f 00 00 test $0x1fff,%esi
Check sptr against THREAD_SIZE -1
1d8: 74 21 je 1fb <store_stackinfo+0x6f>
1da: 8b 3e mov (%esi),%edi
And load *sptr.
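To make the failure mode concrete, here is a minimal sketch (not the kernel code) of the same termination test applied to a pointer that advances in 4-byte steps. THREAD_SIZE of 0x2000 is taken from the 0x1fff mask above; walk_until_boundary and max_steps are made-up names for illustration.

```c
#include <stdint.h>

#define THREAD_SIZE 0x2000u	/* from the `test $0x1fff` mask above */

/*
 * Walk upward from sptr in 4-byte steps, as `*sptr++` does for an
 * unsigned long pointer on i386, applying the loop's termination test
 * before each step.
 * Returns the address at which the test stops the walk, or 0 if the
 * test has not matched after max_steps steps.
 */
uint32_t walk_until_boundary(uint32_t sptr, unsigned int max_steps)
{
	unsigned int i;

	for (i = 0; i < max_steps; i++) {
		if ((sptr & (THREAD_SIZE - 1)) == 0)
			return sptr;
		sptr += 4;
	}
	return 0;
}
```

Starting from an aligned value such as 0xc0349d98, the walk stops at 0xc034a000. Starting from 0xc0349d96, the low two bits stay at 2, so the test never matches, and the 4-byte load at 0xc0349ffe (the esi value in the oops) already covers the unmapped address 0xc034a000.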
>> It looks like store_stackinfo accesses memory behind the end of the
>> stack.
>
>
> Yeah, I'm trying to figure out why. The below (if dang mailer
> actually inlines it) kludge allows me to boot, so I suppose I need to
> ponder addr wrt _stext and _etext.
Wrong direction: Right now it crashes because it runs over the end of
the stack.
With your patch applied, the allocated object is too small to hold all
entries on the stack, and thus store_stackinfo aborts before it runs
into the next page.
I'd increase kstack_depth_to_print to 140. Do not increase it too much,
otherwise it will oops due to the misaligned stack.
Then check the EBP values: They are pushed after the return address. The
return addresses are listed in the Call Trace section.
Example:
0xc0136aa8 pushes 0xc0349dd6 -> odd.
0xc0131b6c pushes 0xc0349de6 -> odd.
0xc0131b3e pushes 0xc0349e02 -> odd.
Proper values for EBP are multiples of 4. Once you find where the stack got misaligned, disassemble the offending function (or send me the .o file).
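The check described above can be sketched as a small helper. It assumes the saved EBP values have already been copied out of the dump by hand, innermost frame first; first_misaligned is a hypothetical name used only for this illustration.

```c
#include <stdint.h>
#include <stddef.h>

/*
 * Scan saved EBP values and return the index of the first one that is
 * not a multiple of 4, or -1 if every value is properly aligned.
 */
long first_misaligned(const uint32_t *ebp, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (ebp[i] & 3u)
			return (long)i;
	}
	return -1;
}
```

All three example values (0xc0349dd6, 0xc0349de6, 0xc0349e02) leave a remainder of 2 when divided by 4, so the helper reports index 0 for them. Walking outward through the frames, the first saved EBP that is a multiple of 4 marks the boundary where the misalignment was introduced.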
--
Manfred
* Re: 2.6.0-test7 DEBUG_PAGEALLOC oops
2003-10-11 9:37 2.6.0-test7 DEBUG_PAGEALLOC oops Manfred Spraul
@ 2003-10-11 11:15 ` Mike Galbraith
2003-10-11 12:06 ` Manfred Spraul
0 siblings, 1 reply; 18+ messages in thread
From: Mike Galbraith @ 2003-10-11 11:15 UTC (permalink / raw)
To: Manfred Spraul; +Cc: Zwane Mwaikambo, linux-kernel
[-- Attachment #1: Type: text/plain, Size: 1349 bytes --]
At 11:37 AM 10/11/2003 +0200, Manfred Spraul wrote:
>Mike wrote:
>
>>Unable to handle kernel paging request at virtual address c034a000
>>printing eip:
>>c0134d5a
>>*pde = 00102027
>>*pte = 0034a000
>Fault trying to read from address 0xc034a000: the page is not mapped.
>
>>Oops: 0000 [#1]
>>CPU: 0
>>EIP: 0060:[<c0134d5a>] Not tainted
>>EFLAGS: 00010002
>>EIP is at store_stackinfo+0x4e/0x80
>In store_stackinfo: the function stores a backtrace of the last
>kmem_cache_free caller in the object - might be useful, and the memory is
>not used.
>
>>eax: 00000000 ebx: c7802f98 ecx: c0301390 edx: c030138c
>>esi: c0349ffe edi: 017e0008 ebp: c0349da6 esp: c0349d96
>>ds: 007b es: 007b ss: 0068
>>Process swapper (pid: 0, threadinfo=c0348000 task=c02fcbe0)
>The esp value is sane, the stack is at 0xc0348000, and the fault is at
>'a000: just behind the end of the stack.
>I assume the faulting line is
> svalue = *sptr++;
Exactly.
>It looks like store_stackinfo accesses memory behind the end of the stack.
Yeah, I'm trying to figure out why. The below (if dang mailer actually
inlines it) kludge allows me to boot, so I suppose I need to ponder addr
wrt _stext and _etext.
>Which gcc version do you use? Could you send me mm/slab.o?
gcc-2.95.3. slab.o coming via private mail.
-Mike
[-- Attachment #2: Type: text/plain, Size: 467 bytes --]
--- mm/slab.c.org	Sat Oct 11 12:25:24 2003
+++ mm/slab.c	Sat Oct 11 12:26:02 2003
@@ -864,12 +864,11 @@
 		while (((long) sptr & (THREAD_SIZE-1)) != 0) {
 			svalue = *sptr++;
-			if (kernel_text_address(svalue)) {
+			if (kernel_text_address(svalue))
 				*addr++=svalue;
-				size -= sizeof(unsigned long);
-				if (size <= sizeof(unsigned long))
-					break;
-			}
+			size -= sizeof(unsigned long);
+			if (size <= sizeof(unsigned long))
+				break;
 		}
 	}
* Re: 2.6.0-test7 DEBUG_PAGEALLOC oops
@ 2003-10-11 9:37 Manfred Spraul
2003-10-11 11:15 ` Mike Galbraith
0 siblings, 1 reply; 18+ messages in thread
From: Manfred Spraul @ 2003-10-11 9:37 UTC (permalink / raw)
To: Mike Galbraith; +Cc: Zwane Mwaikambo, linux-kernel
Mike wrote:
>Unable to handle kernel paging request at virtual address c034a000
> printing eip:
>c0134d5a
>*pde = 00102027
>*pte = 0034a000
>
Fault trying to read from address 0xc034a000: the page is not mapped.
>Oops: 0000 [#1]
>CPU: 0
>EIP: 0060:[<c0134d5a>] Not tainted
>EFLAGS: 00010002
>EIP is at store_stackinfo+0x4e/0x80
>
In store_stackinfo: the function stores a backtrace of the last
kmem_cache_free caller in the object - might be useful, and the memory
is not used.
>eax: 00000000 ebx: c7802f98 ecx: c0301390 edx: c030138c
>esi: c0349ffe edi: 017e0008 ebp: c0349da6 esp: c0349d96
>ds: 007b es: 007b ss: 0068
>Process swapper (pid: 0, threadinfo=c0348000 task=c02fcbe0)
>
The esp value is sane, the stack is at 0xc0348000, and the fault is at
'a000: just behind the end of the stack.
I assume the faulting line is
svalue = *sptr++;
It looks like store_stackinfo accesses memory behind the end of the stack.
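The arithmetic behind "just behind the end of the stack" can be written out as a short sketch. It assumes an 8 KiB stack area (THREAD_SIZE of 0x2000), which is consistent with threadinfo=c0348000 and the fault address but is not stated in the oops itself; stack_base and stack_end are illustrative names.

```c
#include <stdint.h>

#define THREAD_SIZE 0x2000u	/* assumed: 8 KiB stack area */

/* Lowest address of the stack area that contains sp. */
uint32_t stack_base(uint32_t sp)
{
	return sp & ~(THREAD_SIZE - 1);
}

/* First address past the stack area that contains sp. */
uint32_t stack_end(uint32_t sp)
{
	return stack_base(sp) + THREAD_SIZE;
}
```

With esp = 0xc0349d96, stack_base gives 0xc0348000 (the threadinfo value) and stack_end gives 0xc034a000, which is exactly the faulting address.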
Which gcc version do you use? Could you send me mm/slab.o?
--
Manfred
end of thread, other threads:[~2003-10-12 22:38 UTC | newest]
Thread overview: 18+ messages
2003-10-09 13:04 2.6.0-test7 oops in proc_pid_stat Olaf Hering
2003-10-09 22:04 ` Linus Torvalds
2003-10-10 6:50 ` 2.6.0-test7 DEBUG_PAGEALLOC oops Mike Galbraith
2003-10-10 16:52 ` Zwane Mwaikambo
2003-10-11 7:01 ` Mike Galbraith
2003-10-11 7:03 ` Zwane Mwaikambo
2003-10-10 7:28 ` 2.6.0-test7 oops in proc_pid_stat Olaf Hering
2003-10-11 9:37 2.6.0-test7 DEBUG_PAGEALLOC oops Manfred Spraul
2003-10-11 11:15 ` Mike Galbraith
2003-10-11 12:06 ` Manfred Spraul
2003-10-11 15:52 ` Mike Galbraith
2003-10-11 17:34 ` Manfred Spraul
2003-10-12 5:11 ` Mike Galbraith
2003-10-12 6:58 ` Manfred Spraul
2003-10-12 8:52 ` Mike Galbraith
2003-10-12 12:08 ` Thomas Molina
2003-10-12 14:13 ` Thomas Molina
2003-10-12 22:36 ` Thomas Molina