* latest -git: suspend: unable to handle kernel paging request (was Re: no_console_suspend doesn't work?)
@ 2008-08-21 17:28 Vegard Nossum
2008-08-21 18:10 ` Rafael J. Wysocki
0 siblings, 1 reply; 21+ messages in thread
From: Vegard Nossum @ 2008-08-21 17:28 UTC (permalink / raw)
To: Rafael J. Wysocki; +Cc: Linux Kernel Mailing List
On Thu, Aug 21, 2008 at 6:49 PM, Vegard Nossum <vegard.nossum@gmail.com> wrote:
> But I think it would be a lot more useful to see _everything_. Why
> doesn't output to ttyS0 work? It works with tty0... I have tried
> various combinations (leaving out console=tty0, leaving out
> earlyprintk, using netconsole, kexec crashdump), but nothing seems to
> be able to help. Do you have any suggestions?
I did it!
With a little debug patch, I got the oops to ttyS0, but with each
printk() duplicated (I removed some of them). This is from resume:
BUG: unable to handle kernel BUG: unable to handle kernel paging
requestpaging request at 00200200
at 00200200
IP:IP: [<c038accc>] list_del+0xc/0x90
[<c038accc>] list_del+0xc/0x90
*pdpt = 00000000331a0001 *pde = 0000000000000000
Oops: 0000 [#1] Oops: 0000 [#1] PREEMPT PREEMPT SMP SMP
DEBUG_PAGEALLOCDEBUG_PAGEALLOC
Pid: 3473, comm: bash Not tainted (2.6.27-rc4-00003-ga798564-dirty #30)
EIP: 0060:[<c038accc>] EFLAGS: 00210082 CPU: 0
EIP is at list_del+0xc/0x90
EAX: 00200200 EBX: f77fcc00 ECX: 00000001 EDX: f31a6000
ESI: c088e814 EDI: c088e83c EBP: f31a7be8 ESP: f31a7bd0
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process bash (pid: 3473, ti=f31a6000 task=f30ea700 task.ti=f31a6000)
Stack: Stack: 00000000 00000000 00000002 00000002 00000000 00000000
c01b39be c01b39be f77cefa0 f77cefa0 f77fcbd0 f77fcbd0 f31a7bfc
f31a7bfc c01b39ef c01b39ef
f77cefa0 f77cefa0 00000000 00000000 c1133598 c1133598
f31a7c3c f31a7c3c c01b3ebc c01b3ebc c0b530c0 c0b530c0 f31a7c18
f31a7c18 c015b60e c015b60e
f31a7c18 f31a7c18 ffffffff ffffffff 00000020 00000020
c088e800 c088e800 00200046 00200046 00000000 00000000 00000000
00000000 00000000 00000000
Call Trace:
[<c01b39be>] [<c01b39be>] ? ? get_partial_node+0x3e/0xc0
[<c01b39ef>] [<c01b39ef>] ? ? get_partial_node+0x6f/0xc0
[<c01b3ebc>] [<c01b3ebc>] ? ? __slab_alloc+0x11c/0x4e0
[<c015b60e>] [<c015b60e>] ? ? get_lock_stats+0x1e/0x50
[<c01b5ffc>] [<c01b5ffc>] ? ? __kmalloc+0x11c/0x130
[<c03ceeb2>] [<c03ceeb2>] ? ? tty_buffer_request_room+0xe2/0x130
[<c03ceeb2>] [<c03ceeb2>] ? ? tty_buffer_request_room+0xe2/0x130
[<c03ceeb2>] [<c03ceeb2>] ? ? tty_buffer_request_room+0xe2/0x130
[<c03d131d>] [<c03d131d>] ? ? tty_insert_flip_string_flags+0x2d/0xa0
[<c03efcf1>] [<c03efcf1>] ? ? receive_chars+0x161/0x290
[<c03f0d24>] [<c03f0d24>] ? ? serial8250_interrupt+0x134/0x150
[<c017db28>] [<c017db28>] ? ? handle_IRQ_event+0x28/0x70
[<c017f0ef>] [<c017f0ef>] ? ? handle_edge_irq+0xaf/0x140
[<c0108138>] [<c0108138>] ? ? do_IRQ+0x48/0xa0
[<c037b7a4>] [<c037b7a4>] ? ? trace_hardirqs_off_thunk+0xc/0x18
[<c0105a3c>] [<c0105a3c>] ? ? common_interrupt+0x28/0x30
[<c013b2d1>] [<c013b2d1>] ? ? vprintk+0x151/0x3c0
[<c015e884>] [<c015e884>] ? ? trace_hardirqs_on_caller+0xd4/0x160
[<c015e91b>] [<c015e91b>] ? ? trace_hardirqs_on+0xb/0x10
[<c0687983>] [<c0687983>] ? ? _spin_unlock_irqrestore+0x43/0x70
[<c038ef8b>] [<c038ef8b>] ? ? pci_bus_read_config_byte+0x5b/0x70
[<c013b55b>] [<c013b55b>] ? ? printk+0x1b/0x20
[<c059a748>] [<c059a748>] ? ? pcibios_set_master+0x68/0xa0
[<c03925c4>] [<c03925c4>] ? ? pci_set_master+0x54/0x60
[<c0513d9c>] [<c0513d9c>] ? ? usb_hcd_pci_resume+0x4c/0xf0
[<c015e91b>] [<c015e91b>] ? ? trace_hardirqs_on+0xb/0x10
[<c0393cf6>] [<c0393cf6>] ? ? pci_legacy_resume+0x16/0x30
[<c0393d5a>] [<c0393d5a>] ? ? pci_pm_restore+0x4a/0x60
[<c03f83e5>] [<c03f83e5>] ? ? pm_op+0x115/0x130
[<c03f8986>] [<c03f8986>] ? ? device_resume+0xd6/0x380
[<c0168791>] [<c0168791>] ? ? hibernation_snapshot+0xa1/0x220
[<c013b55b>] [<c013b55b>] ? ? printk+0x1b/0x20
[<c01689f0>] [<c01689f0>] ? ? hibernate+0xe0/0x180
[<c01674a0>] [<c01674a0>] ? ? state_store+0x0/0xd0
[<c016755f>] [<c016755f>] ? ? state_store+0xbf/0xd0
[<c01674a0>] [<c01674a0>] ? ? state_store+0x0/0xd0
[<c0375ef4>] [<c0375ef4>] ? ? kobj_attr_store+0x24/0x30
[<c01fa432>] [<c01fa432>] ? ? sysfs_write_file+0xa2/0x100
[<c01bbf06>] [<c01bbf06>] ? ? vfs_write+0x96/0x130
[<c01fa390>] [<c01fa390>] ? ? sysfs_write_file+0x0/0x100
[<c01bc44d>] [<c01bc44d>] ? ? sys_write+0x3d/0x70
[<c0104f3b>] [<c0104f3b>] ? ? sysenter_do_call+0x12/0x3f
=======================
Code: Code: e8 e8 01 01 89 89 44 44 24 24 04 04 e8 e8 94 94 08 08 db
db ff ff 8b 8b 55 55 04 04 b8 b8 80 80 45 45 7a 7a c0 c0 e8 e8 e7 e7
c2 c2 dd dd ff ff e8 e8
a2 a2 c7 c7 d7 d7 ff ff eb eb a9 a9 55 55 89 89 e5 e5 53 53 89 89 c3
c3 83 83 ec ec 14 14 8b 8b 40 40 04 04 <8b> <8b> 00 00 39 39 d8 d8 75
75 24 24 8b 8b 13 13
8b 8b 42 42 04 04 39 39 d8 d8 75 75 41 41 8b 8b 43 43 04 04 89 89 42 42 04 04
EIP: [<c038accc>] EIP: [<c038accc>] list_del+0xc/0x90list_del+0xc/0x90
SS:ESP 0068:f31a7bd0
Kernel panic - not syncing: Fatal exception in interrupt
The EIP corresponds to the first line of list_del, lib/list_debug.c:46:
void list_del(struct list_head *entry)
{
WARN(entry->prev->next != entry,
I actually have one theory: Data arrives on the serial port at the
wrong moment, so interrupt happens before everything is
restored/resumed. You can see the interrupt in the stack trace. Does
this seem possible?
Vegard
--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: latest -git: suspend: unable to handle kernel paging request (was Re: no_console_suspend doesn't work?) 2008-08-21 17:28 latest -git: suspend: unable to handle kernel paging request (was Re: no_console_suspend doesn't work?) Vegard Nossum @ 2008-08-21 18:10 ` Rafael J. Wysocki 2008-08-21 18:21 ` Vegard Nossum 0 siblings, 1 reply; 21+ messages in thread From: Rafael J. Wysocki @ 2008-08-21 18:10 UTC (permalink / raw) To: Vegard Nossum; +Cc: Linux Kernel Mailing List On Thursday, 21 of August 2008, Vegard Nossum wrote: > On Thu, Aug 21, 2008 at 6:49 PM, Vegard Nossum <vegard.nossum@gmail.com> wrote: > > But I think it would be a lot more useful to see _everything_. Why > > doesn't output to ttyS0 work? It works with tty0... I have tried > > various combinations (leaving out console=tty0, leaving out > > earlyprintk, using netconsole, kexec crashdump), but nothing seems to > > be able to help. Do you have any suggestions? > > I did it! > > With a little debug patch, I got the oops to ttyS0, What exactly did you do to achieve that? > but with each printk() duplicated (I removed some of them). This is from resume: > > BUG: unable to handle kernel BUG: unable to handle kernel paging > requestpaging request at 00200200 > at 00200200 > IP:IP: [<c038accc>] list_del+0xc/0x90 > [<c038accc>] list_del+0xc/0x90 > *pdpt = 00000000331a0001 *pde = 0000000000000000 > Oops: 0000 [#1] Oops: 0000 [#1] PREEMPT PREEMPT SMP SMP > DEBUG_PAGEALLOCDEBUG_PAGEALLOC > Pid: 3473, comm: bash Not tainted (2.6.27-rc4-00003-ga798564-dirty #30) > EIP: 0060:[<c038accc>] EFLAGS: 00210082 CPU: 0 > EIP is at list_del+0xc/0x90 > EAX: 00200200 EBX: f77fcc00 ECX: 00000001 EDX: f31a6000 > ESI: c088e814 EDI: c088e83c EBP: f31a7be8 ESP: f31a7bd0 > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 > Process bash (pid: 3473, ti=f31a6000 task=f30ea700 task.ti=f31a6000) > Stack: Stack: 00000000 00000000 00000002 00000002 00000000 00000000 > c01b39be c01b39be f77cefa0 f77cefa0 f77fcbd0 f77fcbd0 f31a7bfc > f31a7bfc c01b39ef c01b39ef > f77cefa0 f77cefa0 00000000 00000000 c1133598 c1133598 > f31a7c3c f31a7c3c c01b3ebc c01b3ebc c0b530c0 c0b530c0 f31a7c18 > f31a7c18 c015b60e c015b60e > f31a7c18 f31a7c18 ffffffff ffffffff 00000020 00000020 > c088e800 c088e800 00200046 00200046 00000000 00000000 00000000 > 00000000 00000000 00000000 > Call Trace: > [<c01b39be>] [<c01b39be>] ? ? get_partial_node+0x3e/0xc0 > [<c01b39ef>] [<c01b39ef>] ? ? get_partial_node+0x6f/0xc0 > [<c01b3ebc>] [<c01b3ebc>] ? ? __slab_alloc+0x11c/0x4e0 > [<c015b60e>] [<c015b60e>] ? ? get_lock_stats+0x1e/0x50 > [<c01b5ffc>] [<c01b5ffc>] ? ? __kmalloc+0x11c/0x130 > [<c03ceeb2>] [<c03ceeb2>] ? ? tty_buffer_request_room+0xe2/0x130 > [<c03ceeb2>] [<c03ceeb2>] ? ? tty_buffer_request_room+0xe2/0x130 > [<c03ceeb2>] [<c03ceeb2>] ? ? tty_buffer_request_room+0xe2/0x130 > [<c03d131d>] [<c03d131d>] ? ? tty_insert_flip_string_flags+0x2d/0xa0 > [<c03efcf1>] [<c03efcf1>] ? ? receive_chars+0x161/0x290 > [<c03f0d24>] [<c03f0d24>] ? ? serial8250_interrupt+0x134/0x150 > [<c017db28>] [<c017db28>] ? ? handle_IRQ_event+0x28/0x70 > [<c017f0ef>] [<c017f0ef>] ? ? handle_edge_irq+0xaf/0x140 > [<c0108138>] [<c0108138>] ? ? do_IRQ+0x48/0xa0 > [<c037b7a4>] [<c037b7a4>] ? ? trace_hardirqs_off_thunk+0xc/0x18 > [<c0105a3c>] [<c0105a3c>] ? ? common_interrupt+0x28/0x30 > [<c013b2d1>] [<c013b2d1>] ? ? vprintk+0x151/0x3c0 > [<c015e884>] [<c015e884>] ? ? trace_hardirqs_on_caller+0xd4/0x160 > [<c015e91b>] [<c015e91b>] ? ? trace_hardirqs_on+0xb/0x10 > [<c0687983>] [<c0687983>] ? ? _spin_unlock_irqrestore+0x43/0x70 > [<c038ef8b>] [<c038ef8b>] ? ? pci_bus_read_config_byte+0x5b/0x70 > [<c013b55b>] [<c013b55b>] ? ? printk+0x1b/0x20 > [<c059a748>] [<c059a748>] ? ? pcibios_set_master+0x68/0xa0 > [<c03925c4>] [<c03925c4>] ? ? pci_set_master+0x54/0x60 > [<c0513d9c>] [<c0513d9c>] ? ? usb_hcd_pci_resume+0x4c/0xf0 > [<c015e91b>] [<c015e91b>] ? ? trace_hardirqs_on+0xb/0x10 > [<c0393cf6>] [<c0393cf6>] ? ? pci_legacy_resume+0x16/0x30 > [<c0393d5a>] [<c0393d5a>] ? ? pci_pm_restore+0x4a/0x60 > [<c03f83e5>] [<c03f83e5>] ? ? pm_op+0x115/0x130 > [<c03f8986>] [<c03f8986>] ? ? device_resume+0xd6/0x380 > [<c0168791>] [<c0168791>] ? ? hibernation_snapshot+0xa1/0x220 > [<c013b55b>] [<c013b55b>] ? ? printk+0x1b/0x20 > [<c01689f0>] [<c01689f0>] ? ? hibernate+0xe0/0x180 > [<c01674a0>] [<c01674a0>] ? ? state_store+0x0/0xd0 > [<c016755f>] [<c016755f>] ? ? state_store+0xbf/0xd0 > [<c01674a0>] [<c01674a0>] ? ? state_store+0x0/0xd0 > [<c0375ef4>] [<c0375ef4>] ? ? kobj_attr_store+0x24/0x30 > [<c01fa432>] [<c01fa432>] ? ? sysfs_write_file+0xa2/0x100 > [<c01bbf06>] [<c01bbf06>] ? ? vfs_write+0x96/0x130 > [<c01fa390>] [<c01fa390>] ? ? sysfs_write_file+0x0/0x100 > [<c01bc44d>] [<c01bc44d>] ? ? sys_write+0x3d/0x70 > [<c0104f3b>] [<c0104f3b>] ? ? sysenter_do_call+0x12/0x3f > ======================= > Code: Code: e8 e8 01 01 89 89 44 44 24 24 04 04 e8 e8 94 94 08 08 db > db ff ff 8b 8b 55 55 04 04 b8 b8 80 80 45 45 7a 7a c0 c0 e8 e8 e7 e7 > c2 c2 dd dd ff ff e8 e8 > a2 a2 c7 c7 d7 d7 ff ff eb eb a9 a9 55 55 89 89 e5 e5 53 53 89 89 c3 > c3 83 83 ec ec 14 14 8b 8b 40 40 04 04 <8b> <8b> 00 00 39 39 d8 d8 75 > 75 24 24 8b 8b 13 13 > 8b 8b 42 42 04 04 39 39 d8 d8 75 75 41 41 8b 8b 43 43 04 04 89 89 42 42 04 04 > EIP: [<c038accc>] EIP: [<c038accc>] list_del+0xc/0x90list_del+0xc/0x90 > SS:ESP 0068:f31a7bd0 > Kernel panic - not syncing: Fatal exception in interrupt > > The EIP corresponds to the first line of list_del, lib/list_debug.c:46: > > void list_del(struct list_head *entry) > { > WARN(entry->prev->next != entry, > > I actually have one theory: Data arrives on the serial port at the > wrong moment, so interrupt happens before everything is > restored/resumed. You can see the interrupt in the stack trace. Does > this seem possible? To me, it does, but the question is if that's not caused by the serial console itself ... Thanks, Rafael ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: latest -git: suspend: unable to handle kernel paging request (was Re: no_console_suspend doesn't work?) 2008-08-21 18:10 ` Rafael J. Wysocki @ 2008-08-21 18:21 ` Vegard Nossum 2008-08-21 18:45 ` Vegard Nossum 2008-08-21 21:27 ` Pekka Enberg 0 siblings, 2 replies; 21+ messages in thread From: Vegard Nossum @ 2008-08-21 18:21 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: Linux Kernel Mailing List On Thu, Aug 21, 2008 at 8:10 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > On Thursday, 21 of August 2008, Vegard Nossum wrote: >> On Thu, Aug 21, 2008 at 6:49 PM, Vegard Nossum <vegard.nossum@gmail.com> wrote: >> > But I think it would be a lot more useful to see _everything_. Why >> > doesn't output to ttyS0 work? It works with tty0... I have tried >> > various combinations (leaving out console=tty0, leaving out >> > earlyprintk, using netconsole, kexec crashdump), but nothing seems to >> > be able to help. Do you have any suggestions? >> >> I did it! >> >> With a little debug patch, I got the oops to ttyS0, > > What exactly did you do to achieve that? Actually, it was not my patch that did it. I appended acpi=off to the command line. I get similar oopses now, but they make it to the ttyS0. Don't ask me why this makes a difference. >> I actually have one theory: Data arrives on the serial port at the >> wrong moment, so interrupt happens before everything is >> restored/resumed. You can see the interrupt in the stack trace. Does >> this seem possible? > > To me, it does, but the question is if that's not caused by the serial console > itself ... I got a "proper" oops this time (brace yourself, it's long ;-)), and it does not have serial on the stack: BUG: unable to handle kernel paging request at 00100104 IP: [<c038ad65>] __list_add+0x15/0x90 *pdpt = 00000000325bb001 *pde = 0000000000000000 Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Pid: 3597, comm: bash Not tainted (2.6.27-rc4-00003-ga798564-dirty #30) EIP: 0060:[<c038ad65>] EFLAGS: 00210082 CPU: 0 EIP is at __list_add+0x15/0x90 EAX: f7455764 EBX: 00100100 ECX: 00100100 EDX: f7455764 ESI: f7455764 EDI: f7455764 EBP: f25e7c18 ESP: f25e7bf4 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 Process bash (pid: 3597, ti=f25e6000 task=f25aa700 task.ti=f25e6000) Stack: f68120bc f6c00400 c0687133 00000000 00000002 00000000 f7455734 f7455764 f7455764 f25e7c40 c018b954 0000001f 00000000 f6c00400 f6c00444 00000001 f68120bc f68120b0 00200002 f25e7cb4 c018d317 f68120bc 00000000 f25aac0c Call Trace: [<c0687133>] ? _spin_lock+0x63/0x70 [<c018b954>] ? rmqueue_bulk+0x54/0x80 [<c018d317>] ? get_page_from_freelist+0x5a7/0x720 [<c01600ea>] ? __lock_acquire+0x27a/0xa00 [<c018dd50>] ? __alloc_pages_internal+0xa0/0x450 [<c01acd4b>] ? alloc_pages_current+0x7b/0xc0 [<c01b37fb>] ? new_slab+0x1bb/0x2d0 [<c0687877>] ? _spin_unlock+0x27/0x50 [<c01b40ca>] ? __slab_alloc+0x32a/0x4e0 [<c010b335>] ? native_sched_clock+0xb5/0x110 [<c01b4424>] ? kmem_cache_alloc+0xb4/0xe0 [<c018969e>] ? mempool_alloc_slab+0xe/0x10 [<c018969e>] ? mempool_alloc_slab+0xe/0x10 [<c018969e>] ? mempool_alloc_slab+0xe/0x10 [<c01897a1>] ? mempool_alloc+0x31/0xf0 [<c015e884>] ? trace_hardirqs_on_caller+0xd4/0x160 [<c015e91b>] ? trace_hardirqs_on+0xb/0x10 [<c0368c7e>] ? get_request+0xae/0x2c0 [<c036935c>] ? get_request_wait+0x1c/0xd0 [<c0687462>] ? _spin_lock_irq+0x72/0x80 [<c0369442>] ? blk_get_request+0x32/0x70 [<c0471c1c>] ? generic_ide_resume+0x5c/0xf0 [<c03f8bde>] ? device_resume+0x32e/0x380 [<c0168791>] ? hibernation_snapshot+0xa1/0x220 [<c013b55b>] ? printk+0x1b/0x20 [<c01689f0>] ? hibernate+0xe0/0x180 [<c01674a0>] ? state_store+0x0/0xd0 [<c016755f>] ? state_store+0xbf/0xd0 [<c01674a0>] ? state_store+0x0/0xd0 [<c0375ef4>] ? kobj_attr_store+0x24/0x30 [<c01fa432>] ? sysfs_write_file+0xa2/0x100 [<c01bbf06>] ? vfs_write+0x96/0x130 [<c01fa390>] ? sysfs_write_file+0x0/0x100 [<c01bc44d>] ? sys_write+0x3d/0x70 [<c0104f3b>] ? sysenter_do_call+0x12/0x3f ======================= Code: c0 e8 d0 f7 da ff 8b 13 eb 97 8d b6 00 00 00 00 8d bf 00 00 00 00 55 89 e5 83 ec 24 89 5d f4 89 cb 89 75 f8 89 d6 89 7d fc 89 c7 <8b> 41 04 39 d0 75 1d 8b 06 39 d8 75 41 89 7b 04 89 1f 8b 5d f4 EIP: [<c038ad65>] __list_add+0x15/0x90 SS:ESP 0068:f25e7bf4 ---[ end trace e3ed674f2f20c5d3 ]--- note: bash[3597] exited with preempt_count 2 Eeek! page_mapcount(page) went negative! (-1) page pfn = 3e74f page->flags = 210007c page->count = 1 page->mapping = f6439028 vma->vm_ops = generic_file_vm_ops+0x0/0x20 vma->vm_ops->fault = filemap_fault+0x0/0x480 vma->vm_file->f_op->mmap = generic_file_mmap+0x0/0x50 ------------[ cut here ]------------ kernel BUG at /uio/arkimedes/s29/vegardno/git-working/linux-2.6/mm/rmap.c:662! invalid opcode: 0000 [#2] PREEMPT SMP DEBUG_PAGEALLOC Pid: 3597, comm: bash Tainted: G D (2.6.27-rc4-00003-ga798564-dirty #30) EIP: 0060:[<c01a0f39>] EFLAGS: 00210286 CPU: 0 EIP is at page_remove_rmap+0x109/0x120 EAX: 0000003b EBX: f7aa5684 ECX: f25e6000 EDX: 00000005 ESI: f25b94c8 EDI: 3e74f025 EBP: f25e7990 ESP: f25e7980 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 Process bash (pid: 3597, ti=f25e6000 task=f25aa700 task.ti=f25e6000) Stack: c077ed7c f6439028 00000000 00000025 f25e7a38 c0198721 3e74f025 00000000 f25e79c0 c015c5ea 00000025 00000000 325a3067 00000000 00000000 f25b94c8 f25e7a50 00000000 00000000 3e74f025 00000000 00007000 0097a000 00000000 Call Trace: [<c0198721>] ? unmap_vmas+0x4b1/0x8b0 [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 [<c019d504>] ? exit_mmap+0x84/0x120 [<c0138538>] ? mmput+0x48/0xa0 [<c013c3d7>] ? exit_mm+0xe7/0x110 [<c013d7a4>] ? do_exit+0x184/0x890 [<c013b55b>] ? printk+0x1b/0x20 [<c013a50a>] ? print_oops_end_marker+0x2a/0x30 [<c01060f1>] ? oops_end+0xb1/0xc0 [<c01067c0>] ? die+0x50/0x70 [<c0122b4f>] ? do_page_fault+0x1ef/0xa20 [<c010b335>] ? native_sched_clock+0xb5/0x110 [<c01600ea>] ? __lock_acquire+0x27a/0xa00 [<c0122960>] ? do_page_fault+0x0/0xa20 [<c0687d3a>] ? error_code+0x72/0x78 [<c038ad65>] ? __list_add+0x15/0x90 [<c0687133>] ? _spin_lock+0x63/0x70 [<c018b954>] ? rmqueue_bulk+0x54/0x80 [<c018d317>] ? get_page_from_freelist+0x5a7/0x720 [<c01600ea>] ? __lock_acquire+0x27a/0xa00 [<c018dd50>] ? __alloc_pages_internal+0xa0/0x450 [<c01acd4b>] ? alloc_pages_current+0x7b/0xc0 [<c01b37fb>] ? new_slab+0x1bb/0x2d0 [<c0687877>] ? _spin_unlock+0x27/0x50 [<c01b40ca>] ? __slab_alloc+0x32a/0x4e0 [<c010b335>] ? native_sched_clock+0xb5/0x110 [<c01b4424>] ? kmem_cache_alloc+0xb4/0xe0 [<c018969e>] ? mempool_alloc_slab+0xe/0x10 [<c018969e>] ? mempool_alloc_slab+0xe/0x10 [<c018969e>] ? mempool_alloc_slab+0xe/0x10 [<c01897a1>] ? mempool_alloc+0x31/0xf0 [<c015e884>] ? trace_hardirqs_on_caller+0xd4/0x160 [<c015e91b>] ? trace_hardirqs_on+0xb/0x10 [<c0368c7e>] ? get_request+0xae/0x2c0 [<c036935c>] ? get_request_wait+0x1c/0xd0 [<c0687462>] ? _spin_lock_irq+0x72/0x80 [<c0369442>] ? blk_get_request+0x32/0x70 [<c0471c1c>] ? generic_ide_resume+0x5c/0xf0 [<c03f8bde>] ? device_resume+0x32e/0x380 [<c0168791>] ? hibernation_snapshot+0xa1/0x220 [<c013b55b>] ? printk+0x1b/0x20 [<c01689f0>] ? hibernate+0xe0/0x180 [<c01674a0>] ? state_store+0x0/0xd0 [<c016755f>] ? state_store+0xbf/0xd0 [<c01674a0>] ? state_store+0x0/0xd0 [<c0375ef4>] ? kobj_attr_store+0x24/0x30 [<c01fa432>] ? sysfs_write_file+0xa2/0x100 [<c01bbf06>] ? vfs_write+0x96/0x130 [<c01fa390>] ? sysfs_write_file+0x0/0x100 [<c01bc44d>] ? sys_write+0x3d/0x70 [<c0104f3b>] ? sysenter_do_call+0x12/0x3f ======================= Code: c0 74 0d 8b 50 08 b8 ac ed 77 c0 e8 82 60 fc ff 8b 46 4c 85 c0 74 14 8b 40 10 85 c0 74 0d 8b 50 2c b8 d8 d1 77 c0 e8 67 60 fc ff <0f> 0b eb fe 8b 53 0c eb 95 8d b4 26 00 00 00 00 8d bc 27 00 00 EIP: [<c01a0f39>] page_remove_rmap+0x109/0x120 SS:ESP 0068:f25e7980 ---[ end trace e3ed674f2f20c5d3 ]--- Fixing recursive fault but reboot is needed! ============================================================================= BUG blkdev_ioc: Invalid object pointer 0xf5cdaca8 ----------------------------------------------------------------------------- INFO: Slab 0xf789e318 objects=14 used=14 fp=0x00000000 flags=0x2082083 Pid: 3597, comm: bash Tainted: G D 2.6.27-rc4-00003-ga798564-dirty #30 [<c01b2576>] slab_err+0x46/0x50 [<c01b2766>] ? check_slab+0xd6/0xf0 [<c0181aef>] ? call_rcu+0x6f/0x80 [<c015e91b>] ? trace_hardirqs_on+0xb/0x10 [<c01b3c78>] __slab_free+0x238/0x360 [<c01b4749>] kmem_cache_free+0xa9/0x120 [<c036b773>] ? put_io_context+0x53/0x70 [<c036b773>] ? put_io_context+0x53/0x70 [<c036b773>] put_io_context+0x53/0x70 [<c036b82e>] exit_io_context+0x6e/0x80 [<c013de6e>] do_exit+0x84e/0x890 [<c037b794>] ? trace_hardirqs_on_thunk+0xc/0x10 [<c013b55b>] ? printk+0x1b/0x20 [<c013a50a>] ? print_oops_end_marker+0x2a/0x30 [<c01060f1>] oops_end+0xb1/0xc0 [<c01067c0>] die+0x50/0x70 [<c0106871>] do_trap+0x91/0xc0 [<c0106940>] ? do_invalid_op+0x0/0xa0 [<c01069c8>] do_invalid_op+0x88/0xa0 [<c01a0f39>] ? page_remove_rmap+0x109/0x120 [<c013b2d1>] ? vprintk+0x151/0x3c0 [<c013b45b>] ? vprintk+0x2db/0x3c0 [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 [<c0687d3a>] error_code+0x72/0x78 [<c013007b>] ? sched_rt_period_timer+0x21b/0x270 [<c01a0f39>] ? page_remove_rmap+0x109/0x120 [<c0198721>] unmap_vmas+0x4b1/0x8b0 [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 [<c019d504>] exit_mmap+0x84/0x120 [<c0138538>] mmput+0x48/0xa0 [<c013c3d7>] exit_mm+0xe7/0x110 [<c013d7a4>] do_exit+0x184/0x890 [<c013b55b>] ? printk+0x1b/0x20 [<c013a50a>] ? print_oops_end_marker+0x2a/0x30 [<c01060f1>] oops_end+0xb1/0xc0 [<c01067c0>] die+0x50/0x70 [<c0122b4f>] do_page_fault+0x1ef/0xa20 [<c010b335>] ? native_sched_clock+0xb5/0x110 [<c01600ea>] ? __lock_acquire+0x27a/0xa00 [<c0122960>] ? do_page_fault+0x0/0xa20 [<c0687d3a>] error_code+0x72/0x78 [<c038ad65>] ? __list_add+0x15/0x90 [<c0687133>] ? _spin_lock+0x63/0x70 [<c018b954>] rmqueue_bulk+0x54/0x80 [<c018d317>] get_page_from_freelist+0x5a7/0x720 [<c01600ea>] ? __lock_acquire+0x27a/0xa00 [<c018dd50>] __alloc_pages_internal+0xa0/0x450 [<c01acd4b>] alloc_pages_current+0x7b/0xc0 [<c01b37fb>] new_slab+0x1bb/0x2d0 [<c0687877>] ? _spin_unlock+0x27/0x50 [<c01b40ca>] __slab_alloc+0x32a/0x4e0 [<c010b335>] ? native_sched_clock+0xb5/0x110 [<c01b4424>] kmem_cache_alloc+0xb4/0xe0 [<c018969e>] ? mempool_alloc_slab+0xe/0x10 [<c018969e>] ? mempool_alloc_slab+0xe/0x10 [<c018969e>] mempool_alloc_slab+0xe/0x10 [<c01897a1>] mempool_alloc+0x31/0xf0 [<c015e884>] ? trace_hardirqs_on_caller+0xd4/0x160 [<c015e91b>] ? trace_hardirqs_on+0xb/0x10 [<c0368c7e>] get_request+0xae/0x2c0 [<c036935c>] get_request_wait+0x1c/0xd0 [<c0687462>] ? _spin_lock_irq+0x72/0x80 [<c0369442>] blk_get_request+0x32/0x70 [<c0471c1c>] generic_ide_resume+0x5c/0xf0 [<c03f8bde>] device_resume+0x32e/0x380 [<c0168791>] hibernation_snapshot+0xa1/0x220 [<c013b55b>] ? printk+0x1b/0x20 [<c01689f0>] hibernate+0xe0/0x180 [<c01674a0>] ? state_store+0x0/0xd0 [<c016755f>] state_store+0xbf/0xd0 [<c01674a0>] ? state_store+0x0/0xd0 [<c0375ef4>] kobj_attr_store+0x24/0x30 [<c01fa432>] sysfs_write_file+0xa2/0x100 [<c01bbf06>] vfs_write+0x96/0x130 [<c01fa390>] ? sysfs_write_file+0x0/0x100 [<c01bc44d>] sys_write+0x3d/0x70 [<c0104f3b>] sysenter_do_call+0x12/0x3f ======================= FIX blkdev_ioc: Object at 0xf5cdaca8 not freed BUG: scheduling while atomic: bash/3597/0x00000006 INFO: lockdep is turned off. Pid: 3597, comm: bash Tainted: G D 2.6.27-rc4-00003-ga798564-dirty #30 [<c0135467>] __schedule_bug+0x77/0x80 [<c0684ce2>] schedule+0x852/0x8f0 [<c010509e>] ? restore_nocheck_notrace+0x0/0xe [<c01b4779>] ? kmem_cache_free+0xd9/0x120 [<c036b773>] ? put_io_context+0x53/0x70 [<c036b773>] ? put_io_context+0x53/0x70 [<c013de81>] do_exit+0x861/0x890 [<c037b794>] ? trace_hardirqs_on_thunk+0xc/0x10 [<c013b55b>] ? printk+0x1b/0x20 [<c013a50a>] ? print_oops_end_marker+0x2a/0x30 [<c01060f1>] oops_end+0xb1/0xc0 [<c01067c0>] die+0x50/0x70 [<c0106871>] do_trap+0x91/0xc0 [<c0106940>] ? do_invalid_op+0x0/0xa0 [<c01069c8>] do_invalid_op+0x88/0xa0 [<c01a0f39>] ? page_remove_rmap+0x109/0x120 [<c013b2d1>] ? vprintk+0x151/0x3c0 [<c013b45b>] ? vprintk+0x2db/0x3c0 [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 [<c0687d3a>] error_code+0x72/0x78 [<c013007b>] ? sched_rt_period_timer+0x21b/0x270 [<c01a0f39>] ? page_remove_rmap+0x109/0x120 [<c0198721>] unmap_vmas+0x4b1/0x8b0 [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 [<c019d504>] exit_mmap+0x84/0x120 [<c0138538>] mmput+0x48/0xa0 [<c013c3d7>] exit_mm+0xe7/0x110 [<c013d7a4>] do_exit+0x184/0x890 [<c013b55b>] ? printk+0x1b/0x20 [<c013a50a>] ? print_oops_end_marker+0x2a/0x30 [<c01060f1>] oops_end+0xb1/0xc0 [<c01067c0>] die+0x50/0x70 [<c0122b4f>] do_page_fault+0x1ef/0xa20 [<c010b335>] ? native_sched_clock+0xb5/0x110 [<c01600ea>] ? __lock_acquire+0x27a/0xa00 [<c0122960>] ? do_page_fault+0x0/0xa20 [<c0687d3a>] error_code+0x72/0x78 [<c038ad65>] ? __list_add+0x15/0x90 [<c0687133>] ? _spin_lock+0x63/0x70 [<c018b954>] rmqueue_bulk+0x54/0x80 [<c018d317>] get_page_from_freelist+0x5a7/0x720 [<c01600ea>] ? __lock_acquire+0x27a/0xa00 [<c018dd50>] __alloc_pages_internal+0xa0/0x450 [<c01acd4b>] alloc_pages_current+0x7b/0xc0 [<c01b37fb>] new_slab+0x1bb/0x2d0 [<c0687877>] ? _spin_unlock+0x27/0x50 [<c01b40ca>] __slab_alloc+0x32a/0x4e0 [<c010b335>] ? native_sched_clock+0xb5/0x110 [<c01b4424>] kmem_cache_alloc+0xb4/0xe0 [<c018969e>] ? mempool_alloc_slab+0xe/0x10 [<c018969e>] ? mempool_alloc_slab+0xe/0x10 [<c018969e>] mempool_alloc_slab+0xe/0x10 [<c01897a1>] mempool_alloc+0x31/0xf0 [<c015e884>] ? trace_hardirqs_on_caller+0xd4/0x160 [<c015e91b>] ? trace_hardirqs_on+0xb/0x10 [<c0368c7e>] get_request+0xae/0x2c0 [<c036935c>] get_request_wait+0x1c/0xd0 [<c0687462>] ? _spin_lock_irq+0x72/0x80 [<c0369442>] blk_get_request+0x32/0x70 [<c0471c1c>] generic_ide_resume+0x5c/0xf0 [<c03f8bde>] device_resume+0x32e/0x380 [<c0168791>] hibernation_snapshot+0xa1/0x220 [<c013b55b>] ? printk+0x1b/0x20 [<c01689f0>] hibernate+0xe0/0x180 [<c01674a0>] ? state_store+0x0/0xd0 [<c016755f>] state_store+0xbf/0xd0 [<c01674a0>] ? state_store+0x0/0xd0 [<c0375ef4>] kobj_attr_store+0x24/0x30 [<c01fa432>] sysfs_write_file+0xa2/0x100 I can look up addresses in the vmlinux for accurate line numbers if needed. Vegard -- "The animistic metaphor of the bug that maliciously sneaked in while the programmer was not looking is intellectually dishonest as it disguises that the error is the programmer's own creation." -- E. W. Dijkstra, EWD1036 ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: latest -git: suspend: unable to handle kernel paging request (was Re: no_console_suspend doesn't work?) 2008-08-21 18:21 ` Vegard Nossum @ 2008-08-21 18:45 ` Vegard Nossum 2008-08-21 19:13 ` Rafael J. Wysocki 2008-08-21 21:27 ` Pekka Enberg 1 sibling, 1 reply; 21+ messages in thread From: Vegard Nossum @ 2008-08-21 18:45 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: Linux Kernel Mailing List On Thu, Aug 21, 2008 at 8:21 PM, Vegard Nossum <vegard.nossum@gmail.com> wrote: > I got a "proper" oops this time (brace yourself, it's long ;-)), and > it does not have serial on the stack: > > BUG: unable to handle kernel paging request at 00100104 > IP: [<c038ad65>] __list_add+0x15/0x90 > *pdpt = 00000000325bb001 *pde = 0000000000000000 > Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC > Pid: 3597, comm: bash Not tainted (2.6.27-rc4-00003-ga798564-dirty #30) > EIP: 0060:[<c038ad65>] EFLAGS: 00210082 CPU: 0 > EIP is at __list_add+0x15/0x90 This one is slightly different: WARNING: at lib/list_debug.c:26 __list_add+0x61/0x90() list_add corruption. next->prev should be prev (c088e34c), but was 00200200. (next=f7814774). Pid: 3572, comm: bash Not tainted 2.6.27-rc4-00003-ga798564-dirty #30 [<c013a56e>] warn_slowpath+0x5e/0x80 [<c015f7bf>] ? validate_chain+0x7df/0xe90 [<c014c4e8>] ? __kernel_text_address+0x8/0x20 [<c01600ea>] ? __lock_acquire+0x27a/0xa00 [<c015b60e>] ? get_lock_stats+0x1e/0x50 [<c015b64d>] ? put_lock_stats+0xd/0x30 [<c038adb1>] __list_add+0x61/0x90 [<c01b3208>] deactivate_slab+0x158/0x1b0 [<c01b3ea5>] __slab_alloc+0x105/0x4e0 [<c015b60e>] ? get_lock_stats+0x1e/0x50 [<c01b4424>] kmem_cache_alloc+0xb4/0xe0 [<c052a51b>] ? serio_queue_event+0x6b/0xe0 [<c052a51b>] ? serio_queue_event+0x6b/0xe0 [<c052a51b>] serio_queue_event+0x6b/0xe0 [<c015e91b>] ? trace_hardirqs_on+0xb/0x10 [<c052a5b8>] serio_resume+0x28/0x30 [<c03f8bde>] device_resume+0x32e/0x380 [<c0168791>] hibernation_snapshot+0xa1/0x220 [<c013b55b>] ? printk+0x1b/0x20 [<c01689f0>] hibernate+0xe0/0x180 [<c01674a0>] ? state_store+0x0/0xd0 [<c016755f>] state_store+0xbf/0xd0 [<c01674a0>] ? state_store+0x0/0xd0 [<c0375ef4>] kobj_attr_store+0x24/0x30 [<c01fa432>] sysfs_write_file+0xa2/0x100 [<c01bbf06>] vfs_write+0x96/0x130 [<c01fa390>] ? sysfs_write_file+0x0/0x100 [<c01bc44d>] sys_write+0x3d/0x70 [<c0104f3b>] sysenter_do_call+0x12/0x3f ======================= ---[ end trace 0bc52e2837e4b8b8 ]--- ...followed by the "unable to handle kernel paging request" as in previous e-mail. The numbers 00100104 and 00200200 seem to appear all the time. At least it is 100% reproducible now and with nice oopses too :-) Vegard -- "The animistic metaphor of the bug that maliciously sneaked in while the programmer was not looking is intellectually dishonest as it disguises that the error is the programmer's own creation." -- E. W. Dijkstra, EWD1036 ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: latest -git: suspend: unable to handle kernel paging request (was Re: no_console_suspend doesn't work?) 2008-08-21 18:45 ` Vegard Nossum @ 2008-08-21 19:13 ` Rafael J. Wysocki 2008-08-21 19:16 ` Vegard Nossum 0 siblings, 1 reply; 21+ messages in thread From: Rafael J. Wysocki @ 2008-08-21 19:13 UTC (permalink / raw) To: Vegard Nossum; +Cc: Linux Kernel Mailing List, Andrew Morton On Thursday, 21 of August 2008, Vegard Nossum wrote: > On Thu, Aug 21, 2008 at 8:21 PM, Vegard Nossum <vegard.nossum@gmail.com> wrote: > > I got a "proper" oops this time (brace yourself, it's long ;-)), and > > it does not have serial on the stack: > > > > BUG: unable to handle kernel paging request at 00100104 > > IP: [<c038ad65>] __list_add+0x15/0x90 > > *pdpt = 00000000325bb001 *pde = 0000000000000000 > > Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC > > Pid: 3597, comm: bash Not tainted (2.6.27-rc4-00003-ga798564-dirty #30) > > EIP: 0060:[<c038ad65>] EFLAGS: 00210082 CPU: 0 > > EIP is at __list_add+0x15/0x90 > > This one is slightly different: > > WARNING: at lib/list_debug.c:26 __list_add+0x61/0x90() > list_add corruption. next->prev should be prev (c088e34c), but was > 00200200. (next=f7814774). > Pid: 3572, comm: bash Not tainted 2.6.27-rc4-00003-ga798564-dirty #30 > [<c013a56e>] warn_slowpath+0x5e/0x80 > [<c015f7bf>] ? validate_chain+0x7df/0xe90 > [<c014c4e8>] ? __kernel_text_address+0x8/0x20 > [<c01600ea>] ? __lock_acquire+0x27a/0xa00 > [<c015b60e>] ? get_lock_stats+0x1e/0x50 > [<c015b64d>] ? put_lock_stats+0xd/0x30 > [<c038adb1>] __list_add+0x61/0x90 > [<c01b3208>] deactivate_slab+0x158/0x1b0 > [<c01b3ea5>] __slab_alloc+0x105/0x4e0 > [<c015b60e>] ? get_lock_stats+0x1e/0x50 > [<c01b4424>] kmem_cache_alloc+0xb4/0xe0 > [<c052a51b>] ? serio_queue_event+0x6b/0xe0 > [<c052a51b>] ? serio_queue_event+0x6b/0xe0 > [<c052a51b>] serio_queue_event+0x6b/0xe0 > [<c015e91b>] ? trace_hardirqs_on+0xb/0x10 > [<c052a5b8>] serio_resume+0x28/0x30 > [<c03f8bde>] device_resume+0x32e/0x380 > [<c0168791>] hibernation_snapshot+0xa1/0x220 > [<c013b55b>] ? printk+0x1b/0x20 > [<c01689f0>] hibernate+0xe0/0x180 > [<c01674a0>] ? state_store+0x0/0xd0 > [<c016755f>] state_store+0xbf/0xd0 > [<c01674a0>] ? state_store+0x0/0xd0 > [<c0375ef4>] kobj_attr_store+0x24/0x30 > [<c01fa432>] sysfs_write_file+0xa2/0x100 > [<c01bbf06>] vfs_write+0x96/0x130 > [<c01fa390>] ? sysfs_write_file+0x0/0x100 > [<c01bc44d>] sys_write+0x3d/0x70 > [<c0104f3b>] sysenter_do_call+0x12/0x3f > ======================= > ---[ end trace 0bc52e2837e4b8b8 ]--- > > ...followed by the "unable to handle kernel paging request" as in > previous e-mail. The numbers 00100104 and 00200200 seem to appear all > the time. > > At least it is 100% reproducible now and with nice oopses too :-) Hm, slab corruption, it seems. Do you use SLAB or SLUB? Rafael ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: latest -git: suspend: unable to handle kernel paging request (was Re: no_console_suspend doesn't work?) 2008-08-21 19:13 ` Rafael J. Wysocki @ 2008-08-21 19:16 ` Vegard Nossum 2008-08-21 19:37 ` Rafael J. Wysocki 0 siblings, 1 reply; 21+ messages in thread From: Vegard Nossum @ 2008-08-21 19:16 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: Linux Kernel Mailing List, Andrew Morton On Thu, Aug 21, 2008 at 9:13 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: >> ...followed by the "unable to handle kernel paging request" as in >> previous e-mail. The numbers 00100104 and 00200200 seem to appear all >> the time. >> >> At least it is 100% reproducible now and with nice oopses too :-) > > Hm, slab corruption, it seems. > > Do you use SLAB or SLUB? SLUB, with all possible debugging: CONFIG_SLUB_DEBUG=y # CONFIG_SLAB is not set CONFIG_SLUB=y # CONFIG_SLOB is not set CONFIG_SLABINFO=y CONFIG_SLUB_DEBUG_ON=y CONFIG_SLUB_STATS=y (Thanks for helping out.) Vegard -- "The animistic metaphor of the bug that maliciously sneaked in while the programmer was not looking is intellectually dishonest as it disguises that the error is the programmer's own creation." -- E. W. Dijkstra, EWD1036 ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: latest -git: suspend: unable to handle kernel paging request (was Re: no_console_suspend doesn't work?) 2008-08-21 19:16 ` Vegard Nossum @ 2008-08-21 19:37 ` Rafael J. Wysocki 2008-08-21 20:08 ` Vegard Nossum 0 siblings, 1 reply; 21+ messages in thread From: Rafael J. Wysocki @ 2008-08-21 19:37 UTC (permalink / raw) To: Vegard Nossum; +Cc: Linux Kernel Mailing List, Andrew Morton On Thursday, 21 of August 2008, Vegard Nossum wrote: > On Thu, Aug 21, 2008 at 9:13 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > >> ...followed by the "unable to handle kernel paging request" as in > >> previous e-mail. The numbers 00100104 and 00200200 seem to appear all > >> the time. > >> > >> At least it is 100% reproducible now and with nice oopses too :-) > > > > Hm, slab corruption, it seems. > > > > Do you use SLAB or SLUB? > > SLUB, with all possible debugging: > > CONFIG_SLUB_DEBUG=y > # CONFIG_SLAB is not set > CONFIG_SLUB=y > # CONFIG_SLOB is not set > CONFIG_SLABINFO=y > CONFIG_SLUB_DEBUG_ON=y > CONFIG_SLUB_STATS=y > > (Thanks for helping out.) (You're welcome.) Can you switch to SLAB and retest? I don't really think SLUB is the issue here, but SLAB may give us additional information. Thanks, Rafael ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: latest -git: suspend: unable to handle kernel paging request (was Re: no_console_suspend doesn't work?) 2008-08-21 19:37 ` Rafael J. Wysocki @ 2008-08-21 20:08 ` Vegard Nossum 2008-08-21 20:22 ` Rafael J. Wysocki 2008-08-21 21:21 ` Pekka Enberg 0 siblings, 2 replies; 21+ messages in thread From: Vegard Nossum @ 2008-08-21 20:08 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: Linux Kernel Mailing List, Andrew Morton On Thu, Aug 21, 2008 at 9:37 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > Can you switch to SLAB and retest? I don't really think SLUB is the issue > here, but SLAB may give us additional information. Got this with SLAB: BUG: unable to handle kernel NULL pointer dereference at 00000000 IP: [<c03897ac>] list_del+0xc/0x90 *pdpt = 0000000031480001 *pde = 0000000000000000 Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Pid: 7, comm: events/0 Not tainted (2.6.27-rc4-00003-gef9b1bc #33) EIP: 0060:[<c03897ac>] EFLAGS: 00010082 CPU: 0 EIP is at list_del+0xc/0x90 EAX: 00000000 EBX: f556afa0 ECX: f54a5380 EDX: f54e3128 ESI: f54f1ef0 EDI: f556afa0 EBP: f68b7ee0 ESP: f68b7ec8 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 Process events/0 (pid: 7, ti=f68b6000 task=f689c338 task.ti=f68b6000) Stack: c015b62e f68b7ed4 c015b66d f68b7efc 00000046 f54f1f20 f68b7f1c c01b4054 00000000 f689c338 f54e315c 00000001 f54f1f20 f54a5380 00000000 f54e3128 f55b3348 00000000 f54f1f20 f54f1ef0 00000001 f68b7f3c c01b41ba 00000000 Call Trace: [<c015b62e>] ? get_lock_stats+0x1e/0x50 [<c015b66d>] ? put_lock_stats+0xd/0x30 [<c01b4054>] ? free_block+0xa4/0x1b0 [<c01b41ba>] ? drain_array+0x5a/0xb0 [<c01b4f82>] ? cache_reap+0x72/0x230 [<c014ada7>] ? run_workqueue+0x107/0x200 [<c014ae0a>] ? run_workqueue+0x16a/0x200 [<c014ada7>] ? run_workqueue+0x107/0x200 [<c01b4f10>] ? cache_reap+0x0/0x230 [<c014b94d>] ? worker_thread+0x7d/0xe0 [<c014e4f0>] ? autoremove_wake_function+0x0/0x50 [<c014b8d0>] ? worker_thread+0x0/0xe0 [<c014e1e2>] ? kthread+0x42/0x70 [<c014e1a0>] ? kthread+0x0/0x70 [<c0105cf3>] ? kernel_thread_helper+0x7/0x14 ======================= Code: e8 01 89 44 24 04 e8 d4 1d db ff 8b 55 04 b8 48 2f 7a c0 e8 47 d8 dd ff e8 c2 dc d7 ff eb a9 55 89 e5 53 89 c3 83 ec 14 8b 40 04 <8b> 00 39 d8 75 24 8b 13 8b 42 04 39 d8 75 41 8b 43 04 89 42 04 EIP: [<c03897ac>] list_del+0xc/0x90 SS:ESP 0068:f68b7ec8 ---[ end trace 958cea1a710a109a ]--- note: events/0[7] exited with preempt_count 1 uhci_hcd 0000:00:1d.2: setting latency timer to 64 uhci_hcd 0000:00:1d.3: setting latency timer to 64 ehci_hcd 0000:00:1d.7: setting latency timer to 64 pci 0000:00:1e.0: setting latency timer to 64 ------------[ cut here ]------------ WARNING: at /uio/arkimedes/s29/vegardno/git-working/linux-2.6/lib/list_debug.c:26 __list_add+0x61/0x90() list_add corruption. next->prev should be prev (f6c00490), but was f73ecea8. (next=f73ecea8). Pid: 201, comm: rcu_torture_rea Tainted: G D 2.6.27-rc4-00003-gef9b1bc #33 [<c013a58e>] warn_slowpath+0x5e/0x80 [<c015b60b>] ? trace_hardirqs_off+0xb/0x10 [<c010b325>] ? native_sched_clock+0xb5/0x110 [<c0125c36>] ? kernel_map_pages+0xa6/0x130 [<c0389891>] __list_add+0x61/0x90 [<c018c978>] __free_pages_ok+0x368/0x410 [<c018ccf2>] __free_pages+0x22/0x40 [<c018cd58>] free_pages+0x48/0x50 [<c010be69>] free_thread_info+0x19/0x20 [<c0138929>] free_task+0x19/0x30 [<c0139e31>] __put_task_struct+0x51/0xa0 [<c013c0b7>] delayed_put_task_struct+0x27/0x30 [<c018206c>] rcu_process_callbacks+0x6c/0xb0 [<c013f5b3>] __do_softirq+0x83/0x100 [<c013f6d5>] do_softirq+0xa5/0xb0 [<c013f975>] irq_exit+0x95/0xa0 [<c010813d>] do_IRQ+0x4d/0xa0 [<c037a264>] ? trace_hardirqs_off_thunk+0xc/0x18 [<c0105a3c>] common_interrupt+0x28/0x30 [<c0683b44>] ? schedule+0x734/0x8f0 [<c068007b>] ? start_secondary+0x19b/0x1c0 [<c06800d8>] ? check_tsc_warp+0x38/0x1e0 [<c068695b>] ? _spin_unlock_irq+0x2b/0x60 [<c0683b44>] schedule+0x734/0x8f0 [<c010509e>] ? restore_nocheck_notrace+0x0/ [<c015e93b>] ? trace_hardirqs_on+0xb/0x10 [<c0180c88>] rcu_torture_reader+0x148/0x230 [<c0181460>] ? rcu_torture_timer+0x0/0x100 [<c0180d19>] ? rcu_torture_reader+0x1d9/0x230 [<c0180b40>] ? rcu_torture_reader+0x0/0x230 [<c014e1e2>] kthread+0x42/0x70 [<c014e1a0>] ? kthread+0x0/0x70 [<c0105cf3>] kernel_thread_helper+0x7/0x14 ======================= ---[ end trace 958cea1a710a109a ]--- eth0: link up, 100Mbps, full-duplex, lpa 0x45E1 eth1: link down hda: host max PIO4 wanted PIO255(auto-tune) selected PIO4 hda: UDMA/100 mode selected ------------[ cut here ]------------ kernel BUG at /uio/arkimedes/s29/vegardno/git-working/linux-2.6/mm/slab.c:590! invalid opcode: 0000 [#2] PREEMPT SMP DEBUG_PAGEALLOC Pid: 3257, comm: S99local Tainted: G D W (2.6.27-rc4-00003-gef9b1bc #33) EIP: 0060:[<c01b24c0>] EFLAGS: 00210002 CPU: 0 EIP is at cache_free_debugcheck+0x2e0/0x310 EAX: 00000000 EBX: f4c1f688 ECX: c018976e EDX: f785f744 ESI: f6864180 EDI: f4c1f688 EBP: f25b1de8 ESP: f25b1d94 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 Process S99local (pid: 3257, ti=f25b0000 task=f253f338 task.ti=f25b0000) Stack: 00200200 f4c1f688 00000000 f25b1e20 f25b1dac c0683402 f25b1e58 c018976e 00000001 c036a7b0 00000002 00000000 00000000 c088c190 f253f338 f4c1f688 f253f338 f25b1df8 f6864180 f6960ef0 f4c1f688 f25b1e0c c01b3c9c f253f7f4 Call Trace: [<c0683402>] ? wait_for_completion+0x12/0x20 [<c018976e>] ? mempool_free_slab+0xe/0x10 [<c036a7b0>] ? blk_end_sync_rq+0x0/0x30 [<c01b3c9c>] ? kmem_cache_free+0x5c/0x200 [<c018976e>] ? mempool_free_slab+0xe/0x10 [<c018979c>] ? mempool_free+0x2c/0x90 [<c0367252>] ? __blk_put_request+0x62/0x90 [<c036764c>] ? blk_put_request+0x2c/0x50 [<c0470732>] ? generic_ide_resume+0xa2/0xf0 [<c03f767e>] ? device_resume+0x32e/0x380 [<c01687d1>] ? hibernation_snapshot+0xa1/0x220 [<c013b57b>] ? printk+0x1b/0x20 [<c0168a30>] ? hibernate+0xe0/0x180 [<c01674e0>] ? state_store+0x0/0xd0 [<c016759f>] ? state_store+0xbf/0xd0 [<c01674e0>] ? state_store+0x0/0xd0 [<c03749d4>] ? kobj_attr_store+0x24/0x30 [<c01f8ef2>] ? sysfs_write_file+0xa2/0x100 [<c01ba946>] ? vfs_write+0x96/0x130 [<c01f8e50>] ? sysfs_write_file+0x0/0x100 [<c01bae8d>] ? sys_write+0x3d/0x70 [<c0104f3b>] ? sysenter_do_call+0x12/0x3f ======================= Code: 00 00 75 cd c7 03 21 43 65 87 8b 5e 28 e9 78 ff ff ff 0f 0b eb fe 90 8d 74 26 00 8b 52 0c e9 ca fd ff ff 0f 0b eb fe 8d 74 26 00 <0f> 0b eb fe 0f 0b eb fe 0f 0b eb fe 8d 74 26 00 8b 52 0c 8b 02 EIP: [<c01b24c0>] cache_free_debugcheck+0x2e0/0x310 SS:ESP 0068:f25b1d94 ---[ end trace 958cea1a710a109a ]--- note: S99local[3257] exited with preempt_count 1 BUG: sleeping function called from invalid context at /uio/arkimedes/s29/vegardno/git-working/linux-2.6/kernel/rwsem.c:21 in_atomic():1, irqs_disabled():0 INFO: lockdep is turned off. Pid: 3257, comm: S99local Tainted: G D W 2.6.27-rc4-00003-gef9b1bc #33 [<c01312c1>] __might_sleep+0xf1/0x140 [<c0684fe9>] down_read+0x19/0x80 [<c03e007b>] ? tty_audit_add_data+0x1bb/0x2e0 [<c013c33b>] exit_mm+0x2b/0x110 [<c013d7c4>] do_exit+0x184/0x890 [<c013b57b>] ? printk+0x1b/0x20 [<c013a52a>] ? print_oops_end_marker+0x2a/0x30 [<c01060f1>] oops_end+0xb1/0xc0 [<c01067c0>] die+0x50/0x70 [<c0106871>] do_trap+0x91/0xc0 BUG: unable to handle kernel paging request at 00100104 IP: [<c0389845>] __list_add+0x15/0x90 *pdpt = 0000000031480001 *pde = 0000000000000000 Oops: 0000 [#3] PREEMPT SMP DEBUG_PAGEALLOC Pid: 3257, comm: S99local Tainted: G D W (2.6.27-rc4-00003-gef9b1bc #33) EIP: 0060:[<c0389845>] EFLAGS: 00210086 CPU: 0 EIP is at __list_add+0x15/0x90 EAX: f73ece6c EBX: 00100100 ECX: 00100100 EDX: f73ece6c ESI: f73ece6c EDI: f73ece6c EBP: f25b16dc ESP: f25b16b8 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 Process S99local (pid: 3257, ti=f25b0000 task=f253f338 task.ti=f25b0000) Stack: f6814e44 f6c00400 c06860b3 00000000 00000002 00000000 f73ece3c f73ece6c f73ece6c f25b1704 c018b9e4 0000001f 00000000 f6c00400 f6c00444 00000001 f6814e44 f6814e38 00200002 f25b1778 c018d3a7 f6814e44 00000000 00000000 Call Trace: [<c06860b3>] ? _spin_lock+0x63/0x70 [<c018b9e4>] ? rmqueue_bulk+0x54/0x80 [<c018d3a7>] ? get_page_from_freelist+0x5a7/0x720 [<c018dde0>] ? __alloc_pages_internal+0xa0/0x450 [<c01b25a2>] ? kmem_getpages+0x62/0x110 [<c01b2f4c>] ? cache_grow+0x39c/0x3b0 [<c03890b6>] ? _raw_spin_unlock+0x46/0x80 [<c01b3503>] ? cache_alloc_refill+0x1f3/0x230 [<c01b36e5>] ? __kmalloc+0x1a5/0x1e0 [<c03cd9b2>] ? tty_buffer_request_room+0xe2/0x130 [<c03cd9b2>] ? tty_buffer_request_room+0xe2/0x130 [<c03cfe0d>] ? tty_insert_flip_string_flags+0x2d/0xa0 [<c03ee7b1>] ? receive_chars+0x161/0x290 [<c03ef7e4>] ? serial8250_interrupt+0x134/0x150 [<c017dc08>] ? handle_IRQ_event+0x28/0x70 [<c017f1cf>] ? handle_edge_irq+0xaf/0x140 [<c0108138>] ? do_IRQ+0x48/0xa0 [<c037a264>] ? trace_hardirqs_off_thunk+0xc/0x18 [<c0105a3c>] ? common_interrupt+0x28/0x30 [<c013b2f1>] ? vprintk+0x151/0x3c0 [<c0143e5f>] ? run_timer_softirq+0x19f/0x1d0 [<c0143e5f>] ? run_timer_softirq+0x19f/0x1d0 [<c013b57b>] ? printk+0x1b/0x20 [<c016700a>] ? __print_symbol+0x2a/0x40 [<c037a254>] ? trace_hardirqs_on_thunk+0xc/0x10 [<c010509e>] ? restore_nocheck_notrace+0x0/0xe [<c013b2f1>] ? vprintk+0x151/0x3c0 [<c013b47b>] ? vprintk+0x2db/0x3c0 [<c013f91a>] ? irq_exit+0x3a/0xa0 [<c010509e>] ? restore_nocheck_notrace+0x0/0xe [<c013b2f1>] ? vprintk+0x151/0x3c0 [<c0106871>] ? do_trap+0x91/0xc0 [<c013b57b>] ? printk+0x1b/0x20 [<c0106871>] ? do_trap+0x91/0xc0 [<c01061f0>] ? print_trace_address+0x40/0x50 [<c0106871>] ? do_trap+0x91/0xc0 [<c01062aa>] ? dump_trace+0xaa/0x120 [<c0106346>] ? show_trace_log_lvl+0x26/0x40 [<c0106b8a>] ? show_trace+0x1a/0x20 [<c01074d2>] ? dump_stack+0x72/0x80 [<c01312c1>] ? __might_sleep+0xf1/0x140 [<c0684fe9>] ? down_read+0x19/0x80 [<c03e007b>] ? tty_audit_add_data+0x1bb/0x2e0 [<c013c33b>] ? exit_mm+0x2b/0x110 [<c013d7c4>] ? do_exit+0x184/0x890 [<c013b57b>] ? printk+0x1b/0x20 [<c013a52a>] ? print_oops_end_marker+0x2a/0x30 [<c01060f1>] ? oops_end+0xb1/0xc0 [<c01067c0>] ? die+0x50/0x70 [<c0106871>] ? do_trap+0x91/0xc0 [<c0106940>] ? do_invalid_op+0x0/0xa0 [<c01069c8>] ? do_invalid_op+0x88/0xa0 [<c01b24c0>] ? cache_free_debugcheck+0x2e0/0x310 [<c015c60a>] ? print_lock_contention_bug+0x1a/0xe0 [<c0182527>] ? rcu_irq_exit+0x17/0x90 [<c0686cba>] ? error_code+0x72/0x78 [<c018976e>] ? mempool_free_slab+0xe/0x10 [<c068007b>] ? start_secondary+0x19b/0x1c0 [<c06800d8>] ? check_tsc_warp+0x38/0x1e0 [<c01b24c0>] ? cache_free_debugcheck+0x2e0/0x310 [<c0683402>] ? wait_for_completion+0x12/0x20 [<c018976e>] ? mempool_free_slab+0xe/0x10 [<c036a7b0>] ? blk_end_sync_rq+0x0/0x30 [<c01b3c9c>] ? kmem_cache_free+0x5c/0x200 [<c018976e>] ? mempool_free_slab+0xe/0x10 [<c018979c>] ? mempool_free+0x2c/0x90 [<c0367252>] ? __blk_put_request+0x62/0x90 [<c036764c>] ? blk_put_request+0x2c/0x50 [<c0470732>] ? generic_ide_resume+0xa2/0xf0 [<c03f767e>] ? device_resume+0x32e/0x380 [<c01687d1>] ? hibernation_snapshot+0xa1/0x220 [<c013b57b>] ? printk+0x1b/0x20 [<c0168a30>] ? hibernate+0xe0/0x180 [<c01674e0>] ? state_store+0x0/0xd0 [<c016759f>] ? state_store+0xbf/0xd0 [<c01674e0>] ? state_store+0x0/0xd0 [<c03749d4>] ? kobj_attr_store+0x24/0x30 [<c01f8ef2>] ? sysfs_write_file+0xa2/0x100 [<c01ba946>] ? vfs_write+0x96/0x130 [<c01f8e50>] ? sysfs_write_file+0x0/0x100 [<c01bae8d>] ? sys_write+0x3d/0x70 [<c0104f3b>] ? sysenter_do_call+0x12/0x3f ======================= Code: c0 e8 10 0d db ff 8b 13 eb 97 8d b6 00 00 00 00 8d bf 00 00 00 00 55 89 e5 83 ec 24 89 5d f4 89 cb 89 75 f8 89 d6 89 7d fc 89 c7 <8b> 41 04 39 d0 75 1d 8b 06 39 d8 75 41 89 7b 04 89 1f 8b 5d f4 EIP: [<c0389845>] __list_add+0x15/0x90 SS:ESP 0068:f25b16b8 Kernel panic - not syncing: Fatal exception in interrupt ------------[ cut here ]------------ WARNING: at /uio/arkimedes/s29/vegardno/git-working/linux-2.6/kernel/smp.c:328 smp_call_function_mask+0x1b1/0x1c0() Pid: 3257, comm: S99local Tainted: G D W 2.6.27-rc4-00f9b1bc #33 [<c013a97f>] warn_on_slowpath+0x4f/0x80 [<c0397ff4>] ? vgacon_set_cursor_size+0xa4/0x120 [<c015b60b>] ? trace_hardirqs_off+0xb/0x10 [<c018214b>] ? __rcu_read_unlock+0x9b/0xc0 [<c015332c>] ? __atomic_notifier_call_chain+0x3c/0x50 [<c03db7fd>] ? vt_console_print+0x1dd/0x2a0 [<c015335a>] ? atomic_notifier_call_chain+0x1a/0x20 [<c03890b6>] ? _raw_spin_unlock+0x46/0x80 [<c068680c>] ? _spin_unlock+0x3c/0x50 [<c03db7fd>] ? vt_console_print+0x1dd/0x2a0 [<c015b60b>] ? trace_hardirqs_off+0xb/0x10 [<c013af61>] ? release_console_sem+0x1c1/0x1f0 [<c015b60b>] ? trace_hardirqs_off+0xb/0x10 [<c0165f81>] smp_call_function_mask+0x1b1/0x1c0 [<c0119280>] ? stop_this_cpu+0x0/0x50 [<c0684808>] ? mutex_unlock+0x8/0x10 [<c015b60b>] ? trace_hardirqs_off+0xb/0x10 [<c06847f1>] ? __mutex_unlock_slowpath+0x151/0x160 [<c0684808>] ? mutex_unlock+0x8/0x10 [<c016c46d>] ? crash_kexec+0x6d/0xc0 [<c0684808>] ? mutex_unlock+0x8/0x10 [<c016c46d>] ? crash_kexec+0x6d/0xc0 [<c0119280>] ? stop_this_cpu+0x0/0x50 [<c0165fc0>] smp_call_function+0x30/0x60 [<c011937e>] native_smp_send_stop+0x1e/0x70 [<c013a879>] panic+0x69/0x120 [<c01060e5>] oops_end+0xa5/0xc0 [<c01067c0>] die+0x50/0x70 [<c0122b4f>] do_page_fault+0x1ef/0xa20 [<c0122960>] ? do_page_fault+0x0/0xa20 [<c0686cba>] error_code+0x72/0x78 [<c0389845>] ? __list_add+0x15/0x90 [<c06860b3>] ? _spin_lock+0x63/0x70 [<c018b9e4>] rmqueue_bulk+0x54/0x80 [<c018d3a7>] get_page_from_freelist+0x5a7/0x720 [<c018dde0>] __alloc_pages_internal+0xa0/0x450 [<c01b25a2>] kmem_getpages+0x62/0x110 [<c01b2f4c>] cache_grow+0x39c/0x3b0 [<c03890b6>] ? _raw_spin_unlock+0x46/0x80 [<c01b3503>] cache_alloc_refill+0x1f3/0x230 [<c01b36e5>] __kmalloc+0x1a5/0x1e0 [<c03cd9b2>] ? tty_buffer_request_room+0xe2/0x130 [<c03cd9b2>] tty_buffer_request_room+0xe2/0x130 [<c03cfe0d>] tty_insert_flip_string_flags+0x2d/0xa0 [<c03ee7b1>] receive_chars+0x161/0x290 [<c03ef7e4>] serial8250_interrupt+0x134/0x150 [<c017dc08>] handle_IRQ_event+0x28/0x70 [<c017f1cf>] handle_edge_irq+0xaf/0x140 [<c0108138>] do_IRQ+0x48/0xa0 [<c037a264>] ? trace_hardirqs_off_thunk+0xc/0x18 [<c0105a3c>] common_interrupt+0x28/0x30 [<c013b2f1>] ? vprintk+0x151/0x3c0 [<c0143e5f>] ? run_timer_softirq+0x19f/0x1d0 [<c0143e5f>] ? run_timer_softirq+0x19f/0x1d0 [<c013b57b>] printk+0x1b/0x20 [<c016700a>] __print_symbol+0x2a/0x40 [<c037a254>] ? trace_hardirqs_on_thunk+0xc/0x10 [<c010509e>] ? restore_nocheck_notrace+0x0/0xe [<c013b2f1>] ? vprintk+0x151/0x3c0 [<c013b47b>] ? vprintk+0x2db/0x3c0 [<c013f91a>] ? irq_exit+0x3a/0xa0 [<c010509e>] ? restore_nocheck_notrace+0x0/0xe [<c013b2f1>] ? vprintk+0x151/0x3c0 [<c0106871>] ? do_trap+0x91/0xc0 [<c013b57b>] ? printk+0x1b/0x20 [<c0106871>] ? do_trap+0x91/0xc0 [<c01061f0>] print_trace_address+0x40/0x50 [<c0106871>] ? do_trap+0x91/0xc0 [<c01062aa>] dump_trace+0xaa/0x120 [<c0106346>] show_trace_log_lvl+0x26/0x40 [<c0106b8a>] show_trace+0x1a/0x20 [<c01074d2>] dump_stack+0x72/0x80 [<c01312c1>] __might_sleep+0xf1/0x140 [<c0684fe9>] down_read+0x19/0x80 [<c03e007b>] ? tty_audit_add_data+0x1bb/0x2e0 [<c013c33b>] exit_mm+0x2b/0x110 [<c013d7c4>] do_exit+0x184/0x890 [<c013b57b>] ? printk+0x1b/0x20 [<c013a52a>] ? print_oops_end_marker+0x2a/0x30 [<c01060f1>] oops_end+0xb1/0xc0 [<c01067c0>] die+0x50/0x70 [<c0106871>] do_trap+0x91/0xc0 [<c0106940>] ? do_invalid_op+0x0/0xa0 [<c01069c8>] do_invalid_op+0x88/0xa0 [<c01b24c0>] ? cache_free_debugcheck+0x2e0/0x310 [<c015c60a>] ? print_lock_contention_bug+0x1a/0xe0 [<c0182527>] ? rcu_irq_exit+0x17/0x90 [<c0686cba>] error_code+0x72/0x78 [<c018976e>] ? mempool_free_slab+0xe/0x10 [<c068007b>] ? start_secondary+0x19b/0x1c0 [<c06800d8>] ? check_tsc_warp+0x38/0x1e0 [<c01b24c0>] ? cache_free_debugcheck+0x2e0/0x310 [<c0683402>] ? wait_for_completion+0x12/0x20 [<c018976e>] ? mempool_free_slab+0xe/0x10 [<c036a7b0>] ? blk_end_sync_rq+0x0/0x30 [<c01b3c9c>] kmem_cache_free+0x5c/0x200 [<c018976e>] mempool_free_slab+0xe/0x10 [<c018979c>] mempool_free+0x2c/0x90 [<c0367252>] __blk_put_request+0x62/0x90 [<c036764c>] blk_put_request+0x2c/0x50 [<c0470732>] generic_ide_resume+0xa2/0xf0 [<c03f767e>] device_resume+0x32e/0x380 [<c01687d1>] hibernation_snapshot+0xa1/0x220 [<c013b57b>] ? printk+0x1b/0x20 [<c0168a30>] hibernate+0xe0/0x180 [<c01674e0>] ? state_store+0x0/0xd0 [<c016759f>] state_store+0xbf/0xd0 [<c01674e0>] ? state_store+0x0/0xd0 [<c03749d4>] kobj_attr_store+0x24/0x30 [<c01f8ef2>] sysfs_write_file+0xa2/0x100 [<c01ba946>] vfs_write+0x96/0x130 [<c01f8e50>] ? sysfs_write_file+0x0/0x100 [<c01bae8d>] sys_write+0x3d/0x70 [<c0104f3b>] sysenter_do_call+0x12/0x3f ======================= ---[ end trace 958cea1a710a109a ]--- Rebooting in 10 seconds..Initializing cgroup subsys cpuset Vegard -- "The animistic metaphor of the bug that maliciously sneaked in while the programmer was not looking is intellectually dishonest as it disguises that the error is the programmer's own creation." -- E. W. Dijkstra, EWD1036 ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: latest -git: suspend: unable to handle kernel paging request (was Re: no_console_suspend doesn't work?) 2008-08-21 20:08 ` Vegard Nossum @ 2008-08-21 20:22 ` Rafael J. Wysocki 2008-08-21 20:26 ` Andrew Morton 2008-08-21 21:21 ` Pekka Enberg 1 sibling, 1 reply; 21+ messages in thread From: Rafael J. Wysocki @ 2008-08-21 20:22 UTC (permalink / raw) To: Vegard Nossum, Andrew Morton; +Cc: Linux Kernel Mailing List On Thursday, 21 of August 2008, Vegard Nossum wrote: > On Thu, Aug 21, 2008 at 9:37 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > > Can you switch to SLAB and retest? I don't really think SLUB is the issue > > here, but SLAB may give us additional information. > > Got this with SLAB: Well, it looks like this happened while devices were suspended, but I can't say what's going wrong. Andrew? Who's the right person to CC? > BUG: unable to handle kernel NULL pointer dereference at 00000000 > IP: [<c03897ac>] list_del+0xc/0x90 > *pdpt = 0000000031480001 *pde = 0000000000000000 > Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC > Pid: 7, comm: events/0 Not tainted (2.6.27-rc4-00003-gef9b1bc #33) > EIP: 0060:[<c03897ac>] EFLAGS: 00010082 CPU: 0 > EIP is at list_del+0xc/0x90 > EAX: 00000000 EBX: f556afa0 ECX: f54a5380 EDX: f54e3128 > ESI: f54f1ef0 EDI: f556afa0 EBP: f68b7ee0 ESP: f68b7ec8 > DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 > Process events/0 (pid: 7, ti=f68b6000 task=f689c338 task.ti=f68b6000) > Stack: c015b62e f68b7ed4 c015b66d f68b7efc 00000046 f54f1f20 f68b7f1c c01b4054 > 00000000 f689c338 f54e315c 00000001 f54f1f20 f54a5380 00000000 f54e3128 > f55b3348 00000000 f54f1f20 f54f1ef0 00000001 f68b7f3c c01b41ba 00000000 > Call Trace: > [<c015b62e>] ? get_lock_stats+0x1e/0x50 > [<c015b66d>] ? put_lock_stats+0xd/0x30 > [<c01b4054>] ? free_block+0xa4/0x1b0 > [<c01b41ba>] ? drain_array+0x5a/0xb0 > [<c01b4f82>] ? cache_reap+0x72/0x230 > [<c014ada7>] ? run_workqueue+0x107/0x200 > [<c014ae0a>] ? run_workqueue+0x16a/0x200 > [<c014ada7>] ? run_workqueue+0x107/0x200 > [<c01b4f10>] ? cache_reap+0x0/0x230 > [<c014b94d>] ? worker_thread+0x7d/0xe0 > [<c014e4f0>] ? autoremove_wake_function+0x0/0x50 > [<c014b8d0>] ? worker_thread+0x0/0xe0 > [<c014e1e2>] ? kthread+0x42/0x70 > [<c014e1a0>] ? kthread+0x0/0x70 > [<c0105cf3>] ? kernel_thread_helper+0x7/0x14 > ======================= > Code: e8 01 89 44 24 04 e8 d4 1d db ff 8b 55 04 b8 48 2f 7a c0 e8 47 d8 dd ff e8 > c2 dc d7 ff eb a9 55 89 e5 53 89 c3 83 ec 14 8b 40 04 <8b> 00 39 d8 75 24 8b 13 > 8b 42 04 39 d8 75 41 8b 43 04 89 42 04 > EIP: [<c03897ac>] list_del+0xc/0x90 SS:ESP 0068:f68b7ec8 > ---[ end trace 958cea1a710a109a ]--- > note: events/0[7] exited with preempt_count 1 > uhci_hcd 0000:00:1d.2: setting latency timer to 64 > uhci_hcd 0000:00:1d.3: setting latency timer to 64 > ehci_hcd 0000:00:1d.7: setting latency timer to 64 > pci 0000:00:1e.0: setting latency timer to 64 > ------------[ cut here ]------------ > WARNING: at /uio/arkimedes/s29/vegardno/git-working/linux-2.6/lib/list_debug.c:26 > __list_add+0x61/0x90() > list_add corruption. next->prev should be prev (f6c00490), but was > f73ecea8. (next=f73ecea8). > Pid: 201, comm: rcu_torture_rea Tainted: G D > 2.6.27-rc4-00003-gef9b1bc #33 > [<c013a58e>] warn_slowpath+0x5e/0x80 > [<c015b60b>] ? trace_hardirqs_off+0xb/0x10 > [<c010b325>] ? native_sched_clock+0xb5/0x110 > [<c0125c36>] ? kernel_map_pages+0xa6/0x130 > [<c0389891>] __list_add+0x61/0x90 > [<c018c978>] __free_pages_ok+0x368/0x410 > [<c018ccf2>] __free_pages+0x22/0x40 > [<c018cd58>] free_pages+0x48/0x50 > [<c010be69>] free_thread_info+0x19/0x20 > [<c0138929>] free_task+0x19/0x30 > [<c0139e31>] __put_task_struct+0x51/0xa0 > [<c013c0b7>] delayed_put_task_struct+0x27/0x30 > [<c018206c>] rcu_process_callbacks+0x6c/0xb0 > [<c013f5b3>] __do_softirq+0x83/0x100 > [<c013f6d5>] do_softirq+0xa5/0xb0 > [<c013f975>] irq_exit+0x95/0xa0 > [<c010813d>] do_IRQ+0x4d/0xa0 > [<c037a264>] ? trace_hardirqs_off_thunk+0xc/0x18 > [<c0105a3c>] common_interrupt+0x28/0x30 > [<c0683b44>] ? schedule+0x734/0x8f0 > [<c068007b>] ? start_secondary+0x19b/0x1c0 > [<c06800d8>] ? check_tsc_warp+0x38/0x1e0 > [<c068695b>] ? _spin_unlock_irq+0x2b/0x60 > [<c0683b44>] schedule+0x734/0x8f0 > [<c010509e>] ? restore_nocheck_notrace+0x0/ > [<c015e93b>] ? trace_hardirqs_on+0xb/0x10 > [<c0180c88>] rcu_torture_reader+0x148/0x230 > [<c0181460>] ? rcu_torture_timer+0x0/0x100 > [<c0180d19>] ? rcu_torture_reader+0x1d9/0x230 > [<c0180b40>] ? rcu_torture_reader+0x0/0x230 > [<c014e1e2>] kthread+0x42/0x70 > [<c014e1a0>] ? kthread+0x0/0x70 > [<c0105cf3>] kernel_thread_helper+0x7/0x14 > ======================= > ---[ end trace 958cea1a710a109a ]--- > eth0: link up, 100Mbps, full-duplex, lpa 0x45E1 > eth1: link down > hda: host max PIO4 wanted PIO255(auto-tune) selected PIO4 > hda: UDMA/100 mode selected > ------------[ cut here ]------------ > kernel BUG at /uio/arkimedes/s29/vegardno/git-working/linux-2.6/mm/slab.c:590! > invalid opcode: 0000 [#2] PREEMPT SMP DEBUG_PAGEALLOC > Pid: 3257, comm: S99local Tainted: G D W (2.6.27-rc4-00003-gef9b1bc #33) > EIP: 0060:[<c01b24c0>] EFLAGS: 00210002 CPU: 0 > EIP is at cache_free_debugcheck+0x2e0/0x310 > EAX: 00000000 EBX: f4c1f688 ECX: c018976e EDX: f785f744 > ESI: f6864180 EDI: f4c1f688 EBP: f25b1de8 ESP: f25b1d94 > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 > Process S99local (pid: 3257, ti=f25b0000 task=f253f338 task.ti=f25b0000) > Stack: 00200200 f4c1f688 00000000 f25b1e20 f25b1dac c0683402 f25b1e58 c018976e > 00000001 c036a7b0 00000002 00000000 00000000 c088c190 f253f338 f4c1f688 > f253f338 f25b1df8 f6864180 f6960ef0 f4c1f688 f25b1e0c c01b3c9c f253f7f4 > Call Trace: > [<c0683402>] ? wait_for_completion+0x12/0x20 > [<c018976e>] ? mempool_free_slab+0xe/0x10 > [<c036a7b0>] ? blk_end_sync_rq+0x0/0x30 > [<c01b3c9c>] ? kmem_cache_free+0x5c/0x200 > [<c018976e>] ? mempool_free_slab+0xe/0x10 > [<c018979c>] ? mempool_free+0x2c/0x90 > [<c0367252>] ? __blk_put_request+0x62/0x90 > [<c036764c>] ? blk_put_request+0x2c/0x50 > [<c0470732>] ? generic_ide_resume+0xa2/0xf0 > [<c03f767e>] ? device_resume+0x32e/0x380 > [<c01687d1>] ? hibernation_snapshot+0xa1/0x220 > [<c013b57b>] ? printk+0x1b/0x20 > [<c0168a30>] ? hibernate+0xe0/0x180 > [<c01674e0>] ? state_store+0x0/0xd0 > [<c016759f>] ? state_store+0xbf/0xd0 > [<c01674e0>] ? state_store+0x0/0xd0 > [<c03749d4>] ? kobj_attr_store+0x24/0x30 > [<c01f8ef2>] ? sysfs_write_file+0xa2/0x100 > [<c01ba946>] ? vfs_write+0x96/0x130 > [<c01f8e50>] ? sysfs_write_file+0x0/0x100 > [<c01bae8d>] ? sys_write+0x3d/0x70 > [<c0104f3b>] ? sysenter_do_call+0x12/0x3f > ======================= > Code: 00 00 75 cd c7 03 21 43 65 87 8b 5e 28 e9 78 ff ff ff 0f 0b eb fe 90 8d 74 > 26 00 8b 52 0c e9 ca fd ff ff 0f 0b eb fe 8d 74 26 00 <0f> 0b eb fe 0f 0b eb fe > 0f 0b eb fe 8d 74 26 00 8b 52 0c 8b 02 > EIP: [<c01b24c0>] cache_free_debugcheck+0x2e0/0x310 SS:ESP 0068:f25b1d94 > ---[ end trace 958cea1a710a109a ]--- > note: S99local[3257] exited with preempt_count 1 > BUG: sleeping function called from invalid context at > /uio/arkimedes/s29/vegardno/git-working/linux-2.6/kernel/rwsem.c:21 > in_atomic():1, irqs_disabled():0 > INFO: lockdep is turned off. > Pid: 3257, comm: S99local Tainted: G D W 2.6.27-rc4-00003-gef9b1bc #33 > [<c01312c1>] __might_sleep+0xf1/0x140 > [<c0684fe9>] down_read+0x19/0x80 > [<c03e007b>] ? tty_audit_add_data+0x1bb/0x2e0 > [<c013c33b>] exit_mm+0x2b/0x110 > [<c013d7c4>] do_exit+0x184/0x890 > [<c013b57b>] ? printk+0x1b/0x20 > [<c013a52a>] ? print_oops_end_marker+0x2a/0x30 > [<c01060f1>] oops_end+0xb1/0xc0 > [<c01067c0>] die+0x50/0x70 > [<c0106871>] do_trap+0x91/0xc0 > BUG: unable to handle kernel paging request at 00100104 > IP: [<c0389845>] __list_add+0x15/0x90 > *pdpt = 0000000031480001 *pde = 0000000000000000 > Oops: 0000 [#3] PREEMPT SMP DEBUG_PAGEALLOC > Pid: 3257, comm: S99local Tainted: G D W (2.6.27-rc4-00003-gef9b1bc #33) > EIP: 0060:[<c0389845>] EFLAGS: 00210086 CPU: 0 > EIP is at __list_add+0x15/0x90 > EAX: f73ece6c EBX: 00100100 ECX: 00100100 EDX: f73ece6c > ESI: f73ece6c EDI: f73ece6c EBP: f25b16dc ESP: f25b16b8 > DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 > Process S99local (pid: 3257, ti=f25b0000 task=f253f338 task.ti=f25b0000) > Stack: f6814e44 f6c00400 c06860b3 00000000 00000002 00000000 f73ece3c f73ece6c > f73ece6c f25b1704 c018b9e4 0000001f 00000000 f6c00400 f6c00444 00000001 > f6814e44 f6814e38 00200002 f25b1778 c018d3a7 f6814e44 00000000 00000000 > Call Trace: > [<c06860b3>] ? _spin_lock+0x63/0x70 > [<c018b9e4>] ? rmqueue_bulk+0x54/0x80 > [<c018d3a7>] ? get_page_from_freelist+0x5a7/0x720 > [<c018dde0>] ? __alloc_pages_internal+0xa0/0x450 > [<c01b25a2>] ? kmem_getpages+0x62/0x110 > [<c01b2f4c>] ? cache_grow+0x39c/0x3b0 > [<c03890b6>] ? _raw_spin_unlock+0x46/0x80 > [<c01b3503>] ? cache_alloc_refill+0x1f3/0x230 > [<c01b36e5>] ? __kmalloc+0x1a5/0x1e0 > [<c03cd9b2>] ? tty_buffer_request_room+0xe2/0x130 > [<c03cd9b2>] ? tty_buffer_request_room+0xe2/0x130 > [<c03cfe0d>] ? tty_insert_flip_string_flags+0x2d/0xa0 > [<c03ee7b1>] ? receive_chars+0x161/0x290 > [<c03ef7e4>] ? serial8250_interrupt+0x134/0x150 > [<c017dc08>] ? handle_IRQ_event+0x28/0x70 > [<c017f1cf>] ? handle_edge_irq+0xaf/0x140 > [<c0108138>] ? do_IRQ+0x48/0xa0 > [<c037a264>] ? trace_hardirqs_off_thunk+0xc/0x18 > [<c0105a3c>] ? common_interrupt+0x28/0x30 > [<c013b2f1>] ? vprintk+0x151/0x3c0 > [<c0143e5f>] ? run_timer_softirq+0x19f/0x1d0 > [<c0143e5f>] ? run_timer_softirq+0x19f/0x1d0 > [<c013b57b>] ? printk+0x1b/0x20 > [<c016700a>] ? __print_symbol+0x2a/0x40 > [<c037a254>] ? trace_hardirqs_on_thunk+0xc/0x10 > [<c010509e>] ? restore_nocheck_notrace+0x0/0xe > [<c013b2f1>] ? vprintk+0x151/0x3c0 > [<c013b47b>] ? vprintk+0x2db/0x3c0 > [<c013f91a>] ? irq_exit+0x3a/0xa0 > [<c010509e>] ? restore_nocheck_notrace+0x0/0xe > [<c013b2f1>] ? vprintk+0x151/0x3c0 > [<c0106871>] ? do_trap+0x91/0xc0 > [<c013b57b>] ? printk+0x1b/0x20 > [<c0106871>] ? do_trap+0x91/0xc0 > [<c01061f0>] ? print_trace_address+0x40/0x50 > [<c0106871>] ? do_trap+0x91/0xc0 > [<c01062aa>] ? dump_trace+0xaa/0x120 > [<c0106346>] ? show_trace_log_lvl+0x26/0x40 > [<c0106b8a>] ? show_trace+0x1a/0x20 > [<c01074d2>] ? dump_stack+0x72/0x80 > [<c01312c1>] ? __might_sleep+0xf1/0x140 > [<c0684fe9>] ? down_read+0x19/0x80 > [<c03e007b>] ? tty_audit_add_data+0x1bb/0x2e0 > [<c013c33b>] ? exit_mm+0x2b/0x110 > [<c013d7c4>] ? do_exit+0x184/0x890 > [<c013b57b>] ? printk+0x1b/0x20 > [<c013a52a>] ? print_oops_end_marker+0x2a/0x30 > [<c01060f1>] ? oops_end+0xb1/0xc0 > [<c01067c0>] ? die+0x50/0x70 > [<c0106871>] ? do_trap+0x91/0xc0 > [<c0106940>] ? do_invalid_op+0x0/0xa0 > [<c01069c8>] ? do_invalid_op+0x88/0xa0 > [<c01b24c0>] ? cache_free_debugcheck+0x2e0/0x310 > [<c015c60a>] ? print_lock_contention_bug+0x1a/0xe0 > [<c0182527>] ? rcu_irq_exit+0x17/0x90 > [<c0686cba>] ? error_code+0x72/0x78 > [<c018976e>] ? mempool_free_slab+0xe/0x10 > [<c068007b>] ? start_secondary+0x19b/0x1c0 > [<c06800d8>] ? check_tsc_warp+0x38/0x1e0 > [<c01b24c0>] ? cache_free_debugcheck+0x2e0/0x310 > [<c0683402>] ? wait_for_completion+0x12/0x20 > [<c018976e>] ? mempool_free_slab+0xe/0x10 > [<c036a7b0>] ? blk_end_sync_rq+0x0/0x30 > [<c01b3c9c>] ? kmem_cache_free+0x5c/0x200 > [<c018976e>] ? mempool_free_slab+0xe/0x10 > [<c018979c>] ? mempool_free+0x2c/0x90 > [<c0367252>] ? __blk_put_request+0x62/0x90 > [<c036764c>] ? blk_put_request+0x2c/0x50 > [<c0470732>] ? generic_ide_resume+0xa2/0xf0 > [<c03f767e>] ? device_resume+0x32e/0x380 > [<c01687d1>] ? hibernation_snapshot+0xa1/0x220 > [<c013b57b>] ? printk+0x1b/0x20 > [<c0168a30>] ? hibernate+0xe0/0x180 > [<c01674e0>] ? state_store+0x0/0xd0 > [<c016759f>] ? state_store+0xbf/0xd0 > [<c01674e0>] ? state_store+0x0/0xd0 > [<c03749d4>] ? kobj_attr_store+0x24/0x30 > [<c01f8ef2>] ? sysfs_write_file+0xa2/0x100 > [<c01ba946>] ? vfs_write+0x96/0x130 > [<c01f8e50>] ? sysfs_write_file+0x0/0x100 > [<c01bae8d>] ? sys_write+0x3d/0x70 > [<c0104f3b>] ? sysenter_do_call+0x12/0x3f > ======================= > Code: c0 e8 10 0d db ff 8b 13 eb 97 8d b6 00 00 00 00 8d bf 00 00 00 00 55 89 e5 > 83 ec 24 89 5d f4 89 cb 89 75 f8 89 d6 89 7d fc 89 c7 <8b> 41 04 39 d0 75 1d 8b > 06 39 d8 75 41 89 7b 04 89 1f 8b 5d f4 > EIP: [<c0389845>] __list_add+0x15/0x90 SS:ESP 0068:f25b16b8 > Kernel panic - not syncing: Fatal exception in interrupt > ------------[ cut here ]------------ > WARNING: at /uio/arkimedes/s29/vegardno/git-working/linux-2.6/kernel/smp.c:328 > smp_call_function_mask+0x1b1/0x1c0() > Pid: 3257, comm: S99local Tainted: G D W 2.6.27-rc4-00f9b1bc #33 > [<c013a97f>] warn_on_slowpath+0x4f/0x80 > [<c0397ff4>] ? vgacon_set_cursor_size+0xa4/0x120 > [<c015b60b>] ? trace_hardirqs_off+0xb/0x10 > [<c018214b>] ? __rcu_read_unlock+0x9b/0xc0 > [<c015332c>] ? __atomic_notifier_call_chain+0x3c/0x50 > [<c03db7fd>] ? vt_console_print+0x1dd/0x2a0 > [<c015335a>] ? atomic_notifier_call_chain+0x1a/0x20 > [<c03890b6>] ? _raw_spin_unlock+0x46/0x80 > [<c068680c>] ? _spin_unlock+0x3c/0x50 > [<c03db7fd>] ? vt_console_print+0x1dd/0x2a0 > [<c015b60b>] ? trace_hardirqs_off+0xb/0x10 > [<c013af61>] ? release_console_sem+0x1c1/0x1f0 > [<c015b60b>] ? trace_hardirqs_off+0xb/0x10 > [<c0165f81>] smp_call_function_mask+0x1b1/0x1c0 > [<c0119280>] ? stop_this_cpu+0x0/0x50 > [<c0684808>] ? mutex_unlock+0x8/0x10 > [<c015b60b>] ? trace_hardirqs_off+0xb/0x10 > [<c06847f1>] ? __mutex_unlock_slowpath+0x151/0x160 > [<c0684808>] ? mutex_unlock+0x8/0x10 > [<c016c46d>] ? crash_kexec+0x6d/0xc0 > [<c0684808>] ? mutex_unlock+0x8/0x10 > [<c016c46d>] ? crash_kexec+0x6d/0xc0 > [<c0119280>] ? stop_this_cpu+0x0/0x50 > [<c0165fc0>] smp_call_function+0x30/0x60 > [<c011937e>] native_smp_send_stop+0x1e/0x70 > [<c013a879>] panic+0x69/0x120 > [<c01060e5>] oops_end+0xa5/0xc0 > [<c01067c0>] die+0x50/0x70 > [<c0122b4f>] do_page_fault+0x1ef/0xa20 > [<c0122960>] ? do_page_fault+0x0/0xa20 > [<c0686cba>] error_code+0x72/0x78 > [<c0389845>] ? __list_add+0x15/0x90 > [<c06860b3>] ? _spin_lock+0x63/0x70 > [<c018b9e4>] rmqueue_bulk+0x54/0x80 > [<c018d3a7>] get_page_from_freelist+0x5a7/0x720 > [<c018dde0>] __alloc_pages_internal+0xa0/0x450 > [<c01b25a2>] kmem_getpages+0x62/0x110 > [<c01b2f4c>] cache_grow+0x39c/0x3b0 > [<c03890b6>] ? _raw_spin_unlock+0x46/0x80 > [<c01b3503>] cache_alloc_refill+0x1f3/0x230 > [<c01b36e5>] __kmalloc+0x1a5/0x1e0 > [<c03cd9b2>] ? tty_buffer_request_room+0xe2/0x130 > [<c03cd9b2>] tty_buffer_request_room+0xe2/0x130 > [<c03cfe0d>] tty_insert_flip_string_flags+0x2d/0xa0 > [<c03ee7b1>] receive_chars+0x161/0x290 > [<c03ef7e4>] serial8250_interrupt+0x134/0x150 > [<c017dc08>] handle_IRQ_event+0x28/0x70 > [<c017f1cf>] handle_edge_irq+0xaf/0x140 > [<c0108138>] do_IRQ+0x48/0xa0 > [<c037a264>] ? trace_hardirqs_off_thunk+0xc/0x18 > [<c0105a3c>] common_interrupt+0x28/0x30 > [<c013b2f1>] ? vprintk+0x151/0x3c0 > [<c0143e5f>] ? run_timer_softirq+0x19f/0x1d0 > [<c0143e5f>] ? run_timer_softirq+0x19f/0x1d0 > [<c013b57b>] printk+0x1b/0x20 > [<c016700a>] __print_symbol+0x2a/0x40 > [<c037a254>] ? trace_hardirqs_on_thunk+0xc/0x10 > [<c010509e>] ? restore_nocheck_notrace+0x0/0xe > [<c013b2f1>] ? vprintk+0x151/0x3c0 > [<c013b47b>] ? vprintk+0x2db/0x3c0 > [<c013f91a>] ? irq_exit+0x3a/0xa0 > [<c010509e>] ? restore_nocheck_notrace+0x0/0xe > [<c013b2f1>] ? vprintk+0x151/0x3c0 > [<c0106871>] ? do_trap+0x91/0xc0 > [<c013b57b>] ? printk+0x1b/0x20 > [<c0106871>] ? do_trap+0x91/0xc0 > [<c01061f0>] print_trace_address+0x40/0x50 > [<c0106871>] ? do_trap+0x91/0xc0 > [<c01062aa>] dump_trace+0xaa/0x120 > [<c0106346>] show_trace_log_lvl+0x26/0x40 > [<c0106b8a>] show_trace+0x1a/0x20 > [<c01074d2>] dump_stack+0x72/0x80 > [<c01312c1>] __might_sleep+0xf1/0x140 > [<c0684fe9>] down_read+0x19/0x80 > [<c03e007b>] ? tty_audit_add_data+0x1bb/0x2e0 > [<c013c33b>] exit_mm+0x2b/0x110 > [<c013d7c4>] do_exit+0x184/0x890 > [<c013b57b>] ? printk+0x1b/0x20 > [<c013a52a>] ? print_oops_end_marker+0x2a/0x30 > [<c01060f1>] oops_end+0xb1/0xc0 > [<c01067c0>] die+0x50/0x70 > [<c0106871>] do_trap+0x91/0xc0 > [<c0106940>] ? do_invalid_op+0x0/0xa0 > [<c01069c8>] do_invalid_op+0x88/0xa0 > [<c01b24c0>] ? cache_free_debugcheck+0x2e0/0x310 > [<c015c60a>] ? print_lock_contention_bug+0x1a/0xe0 > [<c0182527>] ? rcu_irq_exit+0x17/0x90 > [<c0686cba>] error_code+0x72/0x78 > [<c018976e>] ? mempool_free_slab+0xe/0x10 > [<c068007b>] ? start_secondary+0x19b/0x1c0 > [<c06800d8>] ? check_tsc_warp+0x38/0x1e0 > [<c01b24c0>] ? cache_free_debugcheck+0x2e0/0x310 > [<c0683402>] ? wait_for_completion+0x12/0x20 > [<c018976e>] ? mempool_free_slab+0xe/0x10 > [<c036a7b0>] ? blk_end_sync_rq+0x0/0x30 > [<c01b3c9c>] kmem_cache_free+0x5c/0x200 > [<c018976e>] mempool_free_slab+0xe/0x10 > [<c018979c>] mempool_free+0x2c/0x90 > [<c0367252>] __blk_put_request+0x62/0x90 > [<c036764c>] blk_put_request+0x2c/0x50 > [<c0470732>] generic_ide_resume+0xa2/0xf0 > [<c03f767e>] device_resume+0x32e/0x380 > [<c01687d1>] hibernation_snapshot+0xa1/0x220 > [<c013b57b>] ? printk+0x1b/0x20 > [<c0168a30>] hibernate+0xe0/0x180 > [<c01674e0>] ? state_store+0x0/0xd0 > [<c016759f>] state_store+0xbf/0xd0 > [<c01674e0>] ? state_store+0x0/0xd0 > [<c03749d4>] kobj_attr_store+0x24/0x30 > [<c01f8ef2>] sysfs_write_file+0xa2/0x100 > [<c01ba946>] vfs_write+0x96/0x130 > [<c01f8e50>] ? sysfs_write_file+0x0/0x100 > [<c01bae8d>] sys_write+0x3d/0x70 > [<c0104f3b>] sysenter_do_call+0x12/0x3f > ======================= > ---[ end trace 958cea1a710a109a ]--- > Rebooting in 10 seconds..Initializing cgroup subsys cpuset > > > Vegard > ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: latest -git: suspend: unable to handle kernel paging request (was Re: no_console_suspend doesn't work?) 2008-08-21 20:22 ` Rafael J. Wysocki @ 2008-08-21 20:26 ` Andrew Morton 0 siblings, 0 replies; 21+ messages in thread From: Andrew Morton @ 2008-08-21 20:26 UTC (permalink / raw) To: Rafael J. Wysocki; +Cc: vegard.nossum, linux-kernel, Pekka Enberg On Thu, 21 Aug 2008 22:22:44 +0200 "Rafael J. Wysocki" <rjw@sisk.pl> wrote: > On Thursday, 21 of August 2008, Vegard Nossum wrote: > > On Thu, Aug 21, 2008 at 9:37 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > > > Can you switch to SLAB and retest? I don't really think SLUB is the issue > > > here, but SLAB may give us additional information. > > > > Got this with SLAB: > > Well, it looks like this happened while devices were suspended, but I can't say > what's going wrong. Andrew? Who's the right person to CC? Pekka, I guess. > > BUG: unable to handle kernel NULL pointer dereference at 00000000 > > IP: [<c03897ac>] list_del+0xc/0x90 > > *pdpt = 0000000031480001 *pde = 0000000000000000 > > Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC > > Pid: 7, comm: events/0 Not tainted (2.6.27-rc4-00003-gef9b1bc #33) > > EIP: 0060:[<c03897ac>] EFLAGS: 00010082 CPU: 0 > > EIP is at list_del+0xc/0x90 > > EAX: 00000000 EBX: f556afa0 ECX: f54a5380 EDX: f54e3128 > > ESI: f54f1ef0 EDI: f556afa0 EBP: f68b7ee0 ESP: f68b7ec8 > > DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 > > Process events/0 (pid: 7, ti=f68b6000 task=f689c338 task.ti=f68b6000) > > Stack: c015b62e f68b7ed4 c015b66d f68b7efc 00000046 f54f1f20 f68b7f1c c01b4054 > > 00000000 f689c338 f54e315c 00000001 f54f1f20 f54a5380 00000000 f54e3128 > > f55b3348 00000000 f54f1f20 f54f1ef0 00000001 f68b7f3c c01b41ba 00000000 > > Call Trace: > > [<c015b62e>] ? get_lock_stats+0x1e/0x50 > > [<c015b66d>] ? put_lock_stats+0xd/0x30 > > [<c01b4054>] ? free_block+0xa4/0x1b0 > > [<c01b41ba>] ? drain_array+0x5a/0xb0 > > [<c01b4f82>] ? cache_reap+0x72/0x230 > > [<c014ada7>] ? run_workqueue+0x107/0x200 > > [<c014ae0a>] ? run_workqueue+0x16a/0x200 > > [<c014ada7>] ? run_workqueue+0x107/0x200 > > [<c01b4f10>] ? cache_reap+0x0/0x230 > > [<c014b94d>] ? worker_thread+0x7d/0xe0 > > [<c014e4f0>] ? autoremove_wake_function+0x0/0x50 > > [<c014b8d0>] ? worker_thread+0x0/0xe0 > > [<c014e1e2>] ? kthread+0x42/0x70 > > [<c014e1a0>] ? kthread+0x0/0x70 > > [<c0105cf3>] ? kernel_thread_helper+0x7/0x14 > > ======================= > > Code: e8 01 89 44 24 04 e8 d4 1d db ff 8b 55 04 b8 48 2f 7a c0 e8 47 d8 dd ff e8 > > c2 dc d7 ff eb a9 55 89 e5 53 89 c3 83 ec 14 8b 40 04 <8b> 00 39 d8 75 24 8b 13 > > 8b 42 04 39 d8 75 41 8b 43 04 89 42 04 > > EIP: [<c03897ac>] list_del+0xc/0x90 SS:ESP 0068:f68b7ec8 But slab hasn't changed in quite some time. I expect something else has reached over and scribbled on it. Bisection time? ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: latest -git: suspend: unable to handle kernel paging request (was Re: no_console_suspend doesn't work?) 2008-08-21 20:08 ` Vegard Nossum 2008-08-21 20:22 ` Rafael J. Wysocki @ 2008-08-21 21:21 ` Pekka Enberg 2008-08-21 22:10 ` Rafael J. Wysocki 1 sibling, 1 reply; 21+ messages in thread From: Pekka Enberg @ 2008-08-21 21:21 UTC (permalink / raw) To: Vegard Nossum Cc: Rafael J. Wysocki, Linux Kernel Mailing List, Andrew Morton, Jens Axboe Hi Vegard, On Thu, Aug 21, 2008 at 11:08 PM, Vegard Nossum <vegard.nossum@gmail.com> wrote: > On Thu, Aug 21, 2008 at 9:37 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: >> Can you switch to SLAB and retest? I don't really think SLUB is the issue >> here, but SLAB may give us additional information. > > Got this with SLAB: > > BUG: unable to handle kernel NULL pointer dereference at 00000000 > IP: [<c03897ac>] list_del+0xc/0x90 > *pdpt = 0000000031480001 *pde = 0000000000000000 > Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC > Pid: 7, comm: events/0 Not tainted (2.6.27-rc4-00003-gef9b1bc #33) > EIP: 0060:[<c03897ac>] EFLAGS: 00010082 CPU: 0 > EIP is at list_del+0xc/0x90 > EAX: 00000000 EBX: f556afa0 ECX: f54a5380 EDX: f54e3128 > ESI: f54f1ef0 EDI: f556afa0 EBP: f68b7ee0 ESP: f68b7ec8 > DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 > Process events/0 (pid: 7, ti=f68b6000 task=f689c338 task.ti=f68b6000) > Stack: c015b62e f68b7ed4 c015b66d f68b7efc 00000046 f54f1f20 f68b7f1c c01b4054 > 00000000 f689c338 f54e315c 00000001 f54f1f20 f54a5380 00000000 f54e3128 > f55b3348 00000000 f54f1f20 f54f1ef0 00000001 f68b7f3c c01b41ba 00000000 > Call Trace: > [<c015b62e>] ? get_lock_stats+0x1e/0x50 > [<c015b66d>] ? put_lock_stats+0xd/0x30 Looks like slabp->list is corrupted when we do list_del() in free_block(). Why is the stack trace so unreliable, though? > [<c01b4054>] ? free_block+0xa4/0x1b0 > [<c01b41ba>] ? drain_array+0x5a/0xb0 > [<c01b4f82>] ? cache_reap+0x72/0x230 > [<c014ada7>] ? run_workqueue+0x107/0x200 > [<c014ae0a>] ? run_workqueue+0x16a/0x200 > [<c014ada7>] ? run_workqueue+0x107/0x200 > [<c01b4f10>] ? cache_reap+0x0/0x230 > [<c014b94d>] ? worker_thread+0x7d/0xe0 > [<c014e4f0>] ? autoremove_wake_function+0x0/0x50 > [<c014b8d0>] ? worker_thread+0x0/0xe0 > [<c014e1e2>] ? kthread+0x42/0x70 > [<c014e1a0>] ? kthread+0x0/0x70 > [<c0105cf3>] ? kernel_thread_helper+0x7/0x14 > ======================= > Code: e8 01 89 44 24 04 e8 d4 1d db ff 8b 55 04 b8 48 2f 7a c0 e8 47 d8 dd ff e8 > c2 dc d7 ff eb a9 55 89 e5 53 89 c3 83 ec 14 8b 40 04 <8b> 00 39 d8 75 24 8b 13 > 8b 42 04 39 d8 75 41 8b 43 04 89 42 04 > EIP: [<c03897ac>] list_del+0xc/0x90 SS:ESP 0068:f68b7ec8 > ---[ end trace 958cea1a710a109a ]--- > note: events/0[7] exited with preempt_count 1 > uhci_hcd 0000:00:1d.2: setting latency timer to 64 > uhci_hcd 0000:00:1d.3: setting latency timer to 64 > ehci_hcd 0000:00:1d.7: setting latency timer to 64 > pci 0000:00:1e.0: setting latency timer to 64 > ------------[ cut here ]------------ > WARNING: at /uio/arkimedes/s29/vegardno/git-working/linux-2.6/lib/list_debug.c:26 > __list_add+0x61/0x90() > list_add corruption. next->prev should be prev (f6c00490), but was > f73ecea8. (next=f73ecea8). > Pid: 201, comm: rcu_torture_rea Tainted: G D > 2.6.27-rc4-00003-gef9b1bc #33 > [<c013a58e>] warn_slowpath+0x5e/0x80 > [<c015b60b>] ? trace_hardirqs_off+0xb/0x10 > [<c010b325>] ? native_sched_clock+0xb5/0x110 > [<c0125c36>] ? kernel_map_pages+0xa6/0x130 This looks most interesting. The free_list in zone->free_area is corrupted. > [<c0389891>] __list_add+0x61/0x90 > [<c018c978>] __free_pages_ok+0x368/0x410 > [<c018ccf2>] __free_pages+0x22/0x40 > [<c018cd58>] free_pages+0x48/0x50 > [<c010be69>] free_thread_info+0x19/0x20 > [<c0138929>] free_task+0x19/0x30 > [<c0139e31>] __put_task_struct+0x51/0xa0 > [<c013c0b7>] delayed_put_task_struct+0x27/0x30 > [<c018206c>] rcu_process_callbacks+0x6c/0xb0 > [<c013f5b3>] __do_softirq+0x83/0x100 > [<c013f6d5>] do_softirq+0xa5/0xb0 > [<c013f975>] irq_exit+0x95/0xa0 > [<c010813d>] do_IRQ+0x4d/0xa0 > [<c037a264>] ? trace_hardirqs_off_thunk+0xc/0x18 > [<c0105a3c>] common_interrupt+0x28/0x30 > [<c0683b44>] ? schedule+0x734/0x8f0 > [<c068007b>] ? start_secondary+0x19b/0x1c0 > [<c06800d8>] ? check_tsc_warp+0x38/0x1e0 > [<c068695b>] ? _spin_unlock_irq+0x2b/0x60 > [<c0683b44>] schedule+0x734/0x8f0 > [<c010509e>] ? restore_nocheck_notrace+0x0/ > [<c015e93b>] ? trace_hardirqs_on+0xb/0x10 > [<c0180c88>] rcu_torture_reader+0x148/0x230 > [<c0181460>] ? rcu_torture_timer+0x0/0x100 > [<c0180d19>] ? rcu_torture_reader+0x1d9/0x230 > [<c0180b40>] ? rcu_torture_reader+0x0/0x230 > [<c014e1e2>] kthread+0x42/0x70 > [<c014e1a0>] ? kthread+0x0/0x70 > [<c0105cf3>] kernel_thread_helper+0x7/0x14 > ======================= > ---[ end trace 958cea1a710a109a ]--- > eth0: link up, 100Mbps, full-duplex, lpa 0x45E1 > eth1: link down > hda: host max PIO4 wanted PIO255(auto-tune) selected PIO4 > hda: UDMA/100 mode selected > ------------[ cut here ]------------ > kernel BUG at /uio/arkimedes/s29/vegardno/git-working/linux-2.6/mm/slab.c:590! > invalid opcode: 0000 [#2] PREEMPT SMP DEBUG_PAGEALLOC > Pid: 3257, comm: S99local Tainted: G D W (2.6.27-rc4-00003-gef9b1bc #33) > EIP: 0060:[<c01b24c0>] EFLAGS: 00210002 CPU: 0 > EIP is at cache_free_debugcheck+0x2e0/0x310 > EAX: 00000000 EBX: f4c1f688 ECX: c018976e EDX: f785f744 > ESI: f6864180 EDI: f4c1f688 EBP: f25b1de8 ESP: f25b1d94 > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 > Process S99local (pid: 3257, ti=f25b0000 task=f253f338 task.ti=f25b0000) > Stack: 00200200 f4c1f688 00000000 f25b1e20 f25b1dac c0683402 f25b1e58 c018976e > 00000001 c036a7b0 00000002 00000000 00000000 c088c190 f253f338 f4c1f688 > f253f338 f25b1df8 f6864180 f6960ef0 f4c1f688 f25b1e0c c01b3c9c f253f7f4 > Call Trace: > [<c0683402>] ? wait_for_completion+0x12/0x20 Here the block layer is passing a pointer to a non-slab page to kmem_cache_free(). Assuming we can rely on the stack trace, of course... > [<c018976e>] ? mempool_free_slab+0xe/0x10 > [<c036a7b0>] ? blk_end_sync_rq+0x0/0x30 > [<c01b3c9c>] ? kmem_cache_free+0x5c/0x200 > [<c018976e>] ? mempool_free_slab+0xe/0x10 > [<c018979c>] ? mempool_free+0x2c/0x90 > [<c0367252>] ? __blk_put_request+0x62/0x90 > [<c036764c>] ? blk_put_request+0x2c/0x50 > [<c0470732>] ? generic_ide_resume+0xa2/0xf0 > [<c03f767e>] ? device_resume+0x32e/0x380 > [<c01687d1>] ? hibernation_snapshot+0xa1/0x220 > [<c013b57b>] ? printk+0x1b/0x20 > [<c0168a30>] ? hibernate+0xe0/0x180 > [<c01674e0>] ? state_store+0x0/0xd0 > [<c016759f>] ? state_store+0xbf/0xd0 > [<c01674e0>] ? state_store+0x0/0xd0 > [<c03749d4>] ? kobj_attr_store+0x24/0x30 > [<c01f8ef2>] ? sysfs_write_file+0xa2/0x100 > [<c01ba946>] ? vfs_write+0x96/0x130 > [<c01f8e50>] ? sysfs_write_file+0x0/0x100 > [<c01bae8d>] ? sys_write+0x3d/0x70 > [<c0104f3b>] ? sysenter_do_call+0x12/0x3f > ======================= > Code: 00 00 75 cd c7 03 21 43 65 87 8b 5e 28 e9 78 ff ff ff 0f 0b eb fe 90 8d 74 > 26 00 8b 52 0c e9 ca fd ff ff 0f 0b eb fe 8d 74 26 00 <0f> 0b eb fe 0f 0b eb fe > 0f 0b eb fe 8d 74 26 00 8b 52 0c 8b 02 > EIP: [<c01b24c0>] cache_free_debugcheck+0x2e0/0x310 SS:ESP 0068:f25b1d94 > ---[ end trace 958cea1a710a109a ]--- > note: S99local[3257] exited with preempt_count 1 > BUG: unable to handle kernel paging request at 00100104 > IP: [<c0389845>] __list_add+0x15/0x90 > *pdpt = 0000000031480001 *pde = 0000000000000000 > Oops: 0000 [#3] PREEMPT SMP DEBUG_PAGEALLOC > Pid: 3257, comm: S99local Tainted: G D W (2.6.27-rc4-00003-gef9b1bc #33) > EIP: 0060:[<c0389845>] EFLAGS: 00210086 CPU: 0 > EIP is at __list_add+0x15/0x90 > EAX: f73ece6c EBX: 00100100 ECX: 00100100 EDX: f73ece6c > ESI: f73ece6c EDI: f73ece6c EBP: f25b16dc ESP: f25b16b8 > DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 > Process S99local (pid: 3257, ti=f25b0000 task=f253f338 task.ti=f25b0000) > Stack: f6814e44 f6c00400 c06860b3 00000000 00000002 00000000 f73ece3c f73ece6c > f73ece6c f25b1704 c018b9e4 0000001f 00000000 f6c00400 f6c00444 00000001 > f6814e44 f6814e38 00200002 f25b1778 c018d3a7 f6814e44 00000000 00000000 > Call Trace: > [<c06860b3>] ? _spin_lock+0x63/0x70 And now we have the per-cpu page lists corrupted as well. Note that even though we have tty showing up in the stack traces, the list has already been corrupted by someone else. So I don't think tty is at fault here. > [<c018b9e4>] ? rmqueue_bulk+0x54/0x80 > [<c018d3a7>] ? get_page_from_freelist+0x5a7/0x720 > [<c018dde0>] ? __alloc_pages_internal+0xa0/0x450 > [<c01b25a2>] ? kmem_getpages+0x62/0x110 > [<c01b2f4c>] ? cache_grow+0x39c/0x3b0 > [<c03890b6>] ? _raw_spin_unlock+0x46/0x80 > [<c01b3503>] ? cache_alloc_refill+0x1f3/0x230 > [<c01b36e5>] ? __kmalloc+0x1a5/0x1e0 > [<c03cd9b2>] ? tty_buffer_request_room+0xe2/0x130 > [<c03cd9b2>] ? tty_buffer_request_room+0xe2/0x130 > [<c03cfe0d>] ? tty_insert_flip_string_flags+0x2d/0xa0 > [<c03ee7b1>] ? receive_chars+0x161/0x290 > [<c03ef7e4>] ? serial8250_interrupt+0x134/0x150 > [<c017dc08>] ? handle_IRQ_event+0x28/0x70 > [<c017f1cf>] ? handle_edge_irq+0xaf/0x140 > [<c0108138>] ? do_IRQ+0x48/0xa0 > [<c037a264>] ? trace_hardirqs_off_thunk+0xc/0x18 > [<c0105a3c>] ? common_interrupt+0x28/0x30 > [<c013b2f1>] ? vprintk+0x151/0x3c0 > [<c0143e5f>] ? run_timer_softirq+0x19f/0x1d0 > [<c0143e5f>] ? run_timer_softirq+0x19f/0x1d0 > [<c013b57b>] ? printk+0x1b/0x20 > [<c016700a>] ? __print_symbol+0x2a/0x40 > [<c037a254>] ? trace_hardirqs_on_thunk+0xc/0x10 > [<c010509e>] ? restore_nocheck_notrace+0x0/0xe > [<c013b2f1>] ? vprintk+0x151/0x3c0 > [<c013b47b>] ? vprintk+0x2db/0x3c0 > [<c013f91a>] ? irq_exit+0x3a/0xa0 > [<c010509e>] ? restore_nocheck_notrace+0x0/0xe > [<c013b2f1>] ? vprintk+0x151/0x3c0 > [<c0106871>] ? do_trap+0x91/0xc0 > [<c013b57b>] ? printk+0x1b/0x20 > [<c0106871>] ? do_trap+0x91/0xc0 > [<c01061f0>] ? print_trace_address+0x40/0x50 > [<c0106871>] ? do_trap+0x91/0xc0 > [<c01062aa>] ? dump_trace+0xaa/0x120 > [<c0106346>] ? show_trace_log_lvl+0x26/0x40 > [<c0106b8a>] ? show_trace+0x1a/0x20 > [<c01074d2>] ? dump_stack+0x72/0x80 > [<c01312c1>] ? __might_sleep+0xf1/0x140 > [<c0684fe9>] ? down_read+0x19/0x80 > [<c03e007b>] ? tty_audit_add_data+0x1bb/0x2e0 > [<c013c33b>] ? exit_mm+0x2b/0x110 > [<c013d7c4>] ? do_exit+0x184/0x890 > [<c013b57b>] ? printk+0x1b/0x20 > [<c013a52a>] ? print_oops_end_marker+0x2a/0x30 > [<c01060f1>] ? oops_end+0xb1/0xc0 > [<c01067c0>] ? die+0x50/0x70 > [<c0106871>] ? do_trap+0x91/0xc0 > [<c0106940>] ? do_invalid_op+0x0/0xa0 > [<c01069c8>] ? do_invalid_op+0x88/0xa0 > [<c01b24c0>] ? cache_free_debugcheck+0x2e0/0x310 > [<c015c60a>] ? print_lock_contention_bug+0x1a/0xe0 > [<c0182527>] ? rcu_irq_exit+0x17/0x90 > [<c0686cba>] ? error_code+0x72/0x78 > [<c018976e>] ? mempool_free_slab+0xe/0x10 > [<c068007b>] ? start_secondary+0x19b/0x1c0 > [<c06800d8>] ? check_tsc_warp+0x38/0x1e0 > [<c01b24c0>] ? cache_free_debugcheck+0x2e0/0x310 > [<c0683402>] ? wait_for_completion+0x12/0x20 > [<c018976e>] ? mempool_free_slab+0xe/0x10 > [<c036a7b0>] ? blk_end_sync_rq+0x0/0x30 > [<c01b3c9c>] ? kmem_cache_free+0x5c/0x200 > [<c018976e>] ? mempool_free_slab+0xe/0x10 > [<c018979c>] ? mempool_free+0x2c/0x90 > [<c0367252>] ? __blk_put_request+0x62/0x90 > [<c036764c>] ? blk_put_request+0x2c/0x50 > [<c0470732>] ? generic_ide_resume+0xa2/0xf0 > [<c03f767e>] ? device_resume+0x32e/0x380 > [<c01687d1>] ? hibernation_snapshot+0xa1/0x220 > [<c013b57b>] ? printk+0x1b/0x20 > [<c0168a30>] ? hibernate+0xe0/0x180 > [<c01674e0>] ? state_store+0x0/0xd0 > [<c016759f>] ? state_store+0xbf/0xd0 > [<c01674e0>] ? state_store+0x0/0xd0 > [<c03749d4>] ? kobj_attr_store+0x24/0x30 > [<c01f8ef2>] ? sysfs_write_file+0xa2/0x100 > [<c01ba946>] ? vfs_write+0x96/0x130 > [<c01f8e50>] ? sysfs_write_file+0x0/0x100 > [<c01bae8d>] ? sys_write+0x3d/0x70 > [<c0104f3b>] ? sysenter_do_call+0x12/0x3f > ======================= > Code: c0 e8 10 0d db ff 8b 13 eb 97 8d b6 00 00 00 00 8d bf 00 00 00 00 55 89 e5 > 83 ec 24 89 5d f4 89 cb 89 75 f8 89 d6 89 7d fc 89 c7 <8b> 41 04 39 d0 75 1d 8b > 06 39 d8 75 41 89 7b 04 89 1f 8b 5d f4 > EIP: [<c0389845>] __list_add+0x15/0x90 SS:ESP 0068:f25b16b8 I don't know the suspend code at all but the bogus pointer coming from hibernation_snapshot() seems suspicious. Another possibility is that the block layer is doing something strange here. Dunno. Pekka ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: latest -git: suspend: unable to handle kernel paging request (was Re: no_console_suspend doesn't work?) 2008-08-21 21:21 ` Pekka Enberg @ 2008-08-21 22:10 ` Rafael J. Wysocki 2008-08-22 7:02 ` Pekka Enberg 2008-08-22 9:38 ` Pavel Machek 0 siblings, 2 replies; 21+ messages in thread From: Rafael J. Wysocki @ 2008-08-21 22:10 UTC (permalink / raw) To: Pekka Enberg Cc: Vegard Nossum, Linux Kernel Mailing List, Andrew Morton, Jens Axboe On Thursday, 21 of August 2008, Pekka Enberg wrote: > Hi Vegard, > > On Thu, Aug 21, 2008 at 11:08 PM, Vegard Nossum <vegard.nossum@gmail.com> wrote: > > On Thu, Aug 21, 2008 at 9:37 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > >> Can you switch to SLAB and retest? I don't really think SLUB is the issue > >> here, but SLAB may give us additional information. > > > > Got this with SLAB: > > > > BUG: unable to handle kernel NULL pointer dereference at 00000000 > > IP: [<c03897ac>] list_del+0xc/0x90 > > *pdpt = 0000000031480001 *pde = 0000000000000000 > > Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC > > Pid: 7, comm: events/0 Not tainted (2.6.27-rc4-00003-gef9b1bc #33) > > EIP: 0060:[<c03897ac>] EFLAGS: 00010082 CPU: 0 > > EIP is at list_del+0xc/0x90 > > EAX: 00000000 EBX: f556afa0 ECX: f54a5380 EDX: f54e3128 > > ESI: f54f1ef0 EDI: f556afa0 EBP: f68b7ee0 ESP: f68b7ec8 > > DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 > > Process events/0 (pid: 7, ti=f68b6000 task=f689c338 task.ti=f68b6000) > > Stack: c015b62e f68b7ed4 c015b66d f68b7efc 00000046 f54f1f20 f68b7f1c c01b4054 > > 00000000 f689c338 f54e315c 00000001 f54f1f20 f54a5380 00000000 f54e3128 > > f55b3348 00000000 f54f1f20 f54f1ef0 00000001 f68b7f3c c01b41ba 00000000 > > Call Trace: > > [<c015b62e>] ? get_lock_stats+0x1e/0x50 > > [<c015b66d>] ? put_lock_stats+0xd/0x30 > > Looks like slabp->list is corrupted when we do list_del() in > free_block(). Why is the stack trace so unreliable, though? > > > [<c01b4054>] ? free_block+0xa4/0x1b0 > > [<c01b41ba>] ? drain_array+0x5a/0xb0 > > [<c01b4f82>] ? cache_reap+0x72/0x230 > > [<c014ada7>] ? run_workqueue+0x107/0x200 > > [<c014ae0a>] ? run_workqueue+0x16a/0x200 > > [<c014ada7>] ? run_workqueue+0x107/0x200 > > [<c01b4f10>] ? cache_reap+0x0/0x230 > > [<c014b94d>] ? worker_thread+0x7d/0xe0 > > [<c014e4f0>] ? autoremove_wake_function+0x0/0x50 > > [<c014b8d0>] ? worker_thread+0x0/0xe0 > > [<c014e1e2>] ? kthread+0x42/0x70 > > [<c014e1a0>] ? kthread+0x0/0x70 > > [<c0105cf3>] ? kernel_thread_helper+0x7/0x14 > > ======================= > > Code: e8 01 89 44 24 04 e8 d4 1d db ff 8b 55 04 b8 48 2f 7a c0 e8 47 d8 dd ff e8 > > c2 dc d7 ff eb a9 55 89 e5 53 89 c3 83 ec 14 8b 40 04 <8b> 00 39 d8 75 24 8b 13 > > 8b 42 04 39 d8 75 41 8b 43 04 89 42 04 > > EIP: [<c03897ac>] list_del+0xc/0x90 SS:ESP 0068:f68b7ec8 > > ---[ end trace 958cea1a710a109a ]--- > > note: events/0[7] exited with preempt_count 1 > > uhci_hcd 0000:00:1d.2: setting latency timer to 64 > > uhci_hcd 0000:00:1d.3: setting latency timer to 64 > > ehci_hcd 0000:00:1d.7: setting latency timer to 64 > > pci 0000:00:1e.0: setting latency timer to 64 > > ------------[ cut here ]------------ > > WARNING: at /uio/arkimedes/s29/vegardno/git-working/linux-2.6/lib/list_debug.c:26 > > __list_add+0x61/0x90() > > list_add corruption. next->prev should be prev (f6c00490), but was > > f73ecea8. (next=f73ecea8). > > Pid: 201, comm: rcu_torture_rea Tainted: G D > > 2.6.27-rc4-00003-gef9b1bc #33 > > [<c013a58e>] warn_slowpath+0x5e/0x80 > > [<c015b60b>] ? trace_hardirqs_off+0xb/0x10 > > [<c010b325>] ? native_sched_clock+0xb5/0x110 > > [<c0125c36>] ? kernel_map_pages+0xa6/0x130 > > This looks most interesting. The free_list in zone->free_area is corrupted. > > > [<c0389891>] __list_add+0x61/0x90 > > [<c018c978>] __free_pages_ok+0x368/0x410 > > [<c018ccf2>] __free_pages+0x22/0x40 > > [<c018cd58>] free_pages+0x48/0x50 > > [<c010be69>] free_thread_info+0x19/0x20 > > [<c0138929>] free_task+0x19/0x30 > > [<c0139e31>] __put_task_struct+0x51/0xa0 > > [<c013c0b7>] delayed_put_task_struct+0x27/0x30 > > [<c018206c>] rcu_process_callbacks+0x6c/0xb0 > > [<c013f5b3>] __do_softirq+0x83/0x100 > > [<c013f6d5>] do_softirq+0xa5/0xb0 > > [<c013f975>] irq_exit+0x95/0xa0 > > [<c010813d>] do_IRQ+0x4d/0xa0 > > [<c037a264>] ? trace_hardirqs_off_thunk+0xc/0x18 > > [<c0105a3c>] common_interrupt+0x28/0x30 > > [<c0683b44>] ? schedule+0x734/0x8f0 > > [<c068007b>] ? start_secondary+0x19b/0x1c0 > > [<c06800d8>] ? check_tsc_warp+0x38/0x1e0 > > [<c068695b>] ? _spin_unlock_irq+0x2b/0x60 > > [<c0683b44>] schedule+0x734/0x8f0 > > [<c010509e>] ? restore_nocheck_notrace+0x0/ > > [<c015e93b>] ? trace_hardirqs_on+0xb/0x10 > > [<c0180c88>] rcu_torture_reader+0x148/0x230 > > [<c0181460>] ? rcu_torture_timer+0x0/0x100 > > [<c0180d19>] ? rcu_torture_reader+0x1d9/0x230 > > [<c0180b40>] ? rcu_torture_reader+0x0/0x230 > > [<c014e1e2>] kthread+0x42/0x70 > > [<c014e1a0>] ? kthread+0x0/0x70 > > [<c0105cf3>] kernel_thread_helper+0x7/0x14 > > ======================= > > ---[ end trace 958cea1a710a109a ]--- > > eth0: link up, 100Mbps, full-duplex, lpa 0x45E1 > > eth1: link down > > hda: host max PIO4 wanted PIO255(auto-tune) selected PIO4 > > hda: UDMA/100 mode selected > > ------------[ cut here ]------------ > > kernel BUG at /uio/arkimedes/s29/vegardno/git-working/linux-2.6/mm/slab.c:590! > > invalid opcode: 0000 [#2] PREEMPT SMP DEBUG_PAGEALLOC > > Pid: 3257, comm: S99local Tainted: G D W (2.6.27-rc4-00003-gef9b1bc #33) > > EIP: 0060:[<c01b24c0>] EFLAGS: 00210002 CPU: 0 > > EIP is at cache_free_debugcheck+0x2e0/0x310 > > EAX: 00000000 EBX: f4c1f688 ECX: c018976e EDX: f785f744 > > ESI: f6864180 EDI: f4c1f688 EBP: f25b1de8 ESP: f25b1d94 > > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 > > Process S99local (pid: 3257, ti=f25b0000 task=f253f338 task.ti=f25b0000) > > Stack: 00200200 f4c1f688 00000000 f25b1e20 f25b1dac c0683402 f25b1e58 c018976e > > 00000001 c036a7b0 00000002 00000000 00000000 c088c190 f253f338 f4c1f688 > > f253f338 f25b1df8 f6864180 f6960ef0 f4c1f688 f25b1e0c c01b3c9c f253f7f4 > > Call Trace: > > [<c0683402>] ? wait_for_completion+0x12/0x20 > > Here the block layer is passing a pointer to a non-slab page to > kmem_cache_free(). Assuming we can rely on the stack trace, of > course... > > > [<c018976e>] ? mempool_free_slab+0xe/0x10 > > [<c036a7b0>] ? blk_end_sync_rq+0x0/0x30 > > [<c01b3c9c>] ? kmem_cache_free+0x5c/0x200 > > [<c018976e>] ? mempool_free_slab+0xe/0x10 > > [<c018979c>] ? mempool_free+0x2c/0x90 > > [<c0367252>] ? __blk_put_request+0x62/0x90 > > [<c036764c>] ? blk_put_request+0x2c/0x50 > > [<c0470732>] ? generic_ide_resume+0xa2/0xf0 > > [<c03f767e>] ? device_resume+0x32e/0x380 > > [<c01687d1>] ? hibernation_snapshot+0xa1/0x220 > > [<c013b57b>] ? printk+0x1b/0x20 > > [<c0168a30>] ? hibernate+0xe0/0x180 > > [<c01674e0>] ? state_store+0x0/0xd0 > > [<c016759f>] ? state_store+0xbf/0xd0 > > [<c01674e0>] ? state_store+0x0/0xd0 > > [<c03749d4>] ? kobj_attr_store+0x24/0x30 > > [<c01f8ef2>] ? sysfs_write_file+0xa2/0x100 > > [<c01ba946>] ? vfs_write+0x96/0x130 > > [<c01f8e50>] ? sysfs_write_file+0x0/0x100 > > [<c01bae8d>] ? sys_write+0x3d/0x70 > > [<c0104f3b>] ? sysenter_do_call+0x12/0x3f > > ======================= > > Code: 00 00 75 cd c7 03 21 43 65 87 8b 5e 28 e9 78 ff ff ff 0f 0b eb fe 90 8d 74 > > 26 00 8b 52 0c e9 ca fd ff ff 0f 0b eb fe 8d 74 26 00 <0f> 0b eb fe 0f 0b eb fe > > 0f 0b eb fe 8d 74 26 00 8b 52 0c 8b 02 > > EIP: [<c01b24c0>] cache_free_debugcheck+0x2e0/0x310 SS:ESP 0068:f25b1d94 > > ---[ end trace 958cea1a710a109a ]--- > > note: S99local[3257] exited with preempt_count 1 > > > BUG: unable to handle kernel paging request at 00100104 > > IP: [<c0389845>] __list_add+0x15/0x90 > > *pdpt = 0000000031480001 *pde = 0000000000000000 > > Oops: 0000 [#3] PREEMPT SMP DEBUG_PAGEALLOC > > Pid: 3257, comm: S99local Tainted: G D W (2.6.27-rc4-00003-gef9b1bc #33) > > EIP: 0060:[<c0389845>] EFLAGS: 00210086 CPU: 0 > > EIP is at __list_add+0x15/0x90 > > EAX: f73ece6c EBX: 00100100 ECX: 00100100 EDX: f73ece6c > > ESI: f73ece6c EDI: f73ece6c EBP: f25b16dc ESP: f25b16b8 > > DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 > > Process S99local (pid: 3257, ti=f25b0000 task=f253f338 task.ti=f25b0000) > > Stack: f6814e44 f6c00400 c06860b3 00000000 00000002 00000000 f73ece3c f73ece6c > > f73ece6c f25b1704 c018b9e4 0000001f 00000000 f6c00400 f6c00444 00000001 > > f6814e44 f6814e38 00200002 f25b1778 c018d3a7 f6814e44 00000000 00000000 > > Call Trace: > > [<c06860b3>] ? _spin_lock+0x63/0x70 > > And now we have the per-cpu page lists corrupted as well. Note that > even though we have tty showing up in the stack traces, the list has > already been corrupted by someone else. So I don't think tty is at > fault here. > > > [<c018b9e4>] ? rmqueue_bulk+0x54/0x80 > > [<c018d3a7>] ? get_page_from_freelist+0x5a7/0x720 > > [<c018dde0>] ? __alloc_pages_internal+0xa0/0x450 > > [<c01b25a2>] ? kmem_getpages+0x62/0x110 > > [<c01b2f4c>] ? cache_grow+0x39c/0x3b0 > > [<c03890b6>] ? _raw_spin_unlock+0x46/0x80 > > [<c01b3503>] ? cache_alloc_refill+0x1f3/0x230 > > [<c01b36e5>] ? __kmalloc+0x1a5/0x1e0 > > [<c03cd9b2>] ? tty_buffer_request_room+0xe2/0x130 > > [<c03cd9b2>] ? tty_buffer_request_room+0xe2/0x130 > > [<c03cfe0d>] ? tty_insert_flip_string_flags+0x2d/0xa0 > > [<c03ee7b1>] ? receive_chars+0x161/0x290 > > [<c03ef7e4>] ? serial8250_interrupt+0x134/0x150 > > [<c017dc08>] ? handle_IRQ_event+0x28/0x70 > > [<c017f1cf>] ? handle_edge_irq+0xaf/0x140 > > [<c0108138>] ? do_IRQ+0x48/0xa0 > > [<c037a264>] ? trace_hardirqs_off_thunk+0xc/0x18 > > [<c0105a3c>] ? common_interrupt+0x28/0x30 > > [<c013b2f1>] ? vprintk+0x151/0x3c0 > > [<c0143e5f>] ? run_timer_softirq+0x19f/0x1d0 > > [<c0143e5f>] ? run_timer_softirq+0x19f/0x1d0 > > [<c013b57b>] ? printk+0x1b/0x20 > > [<c016700a>] ? __print_symbol+0x2a/0x40 > > [<c037a254>] ? trace_hardirqs_on_thunk+0xc/0x10 > > [<c010509e>] ? restore_nocheck_notrace+0x0/0xe > > [<c013b2f1>] ? vprintk+0x151/0x3c0 > > [<c013b47b>] ? vprintk+0x2db/0x3c0 > > [<c013f91a>] ? irq_exit+0x3a/0xa0 > > [<c010509e>] ? restore_nocheck_notrace+0x0/0xe > > [<c013b2f1>] ? vprintk+0x151/0x3c0 > > [<c0106871>] ? do_trap+0x91/0xc0 > > [<c013b57b>] ? printk+0x1b/0x20 > > [<c0106871>] ? do_trap+0x91/0xc0 > > [<c01061f0>] ? print_trace_address+0x40/0x50 > > [<c0106871>] ? do_trap+0x91/0xc0 > > [<c01062aa>] ? dump_trace+0xaa/0x120 > > [<c0106346>] ? show_trace_log_lvl+0x26/0x40 > > [<c0106b8a>] ? show_trace+0x1a/0x20 > > [<c01074d2>] ? dump_stack+0x72/0x80 > > [<c01312c1>] ? __might_sleep+0xf1/0x140 > > [<c0684fe9>] ? down_read+0x19/0x80 > > [<c03e007b>] ? tty_audit_add_data+0x1bb/0x2e0 > > [<c013c33b>] ? exit_mm+0x2b/0x110 > > [<c013d7c4>] ? do_exit+0x184/0x890 > > [<c013b57b>] ? printk+0x1b/0x20 > > [<c013a52a>] ? print_oops_end_marker+0x2a/0x30 > > [<c01060f1>] ? oops_end+0xb1/0xc0 > > [<c01067c0>] ? die+0x50/0x70 > > [<c0106871>] ? do_trap+0x91/0xc0 > > [<c0106940>] ? do_invalid_op+0x0/0xa0 > > [<c01069c8>] ? do_invalid_op+0x88/0xa0 > > [<c01b24c0>] ? cache_free_debugcheck+0x2e0/0x310 > > [<c015c60a>] ? print_lock_contention_bug+0x1a/0xe0 > > [<c0182527>] ? rcu_irq_exit+0x17/0x90 > > [<c0686cba>] ? error_code+0x72/0x78 > > [<c018976e>] ? mempool_free_slab+0xe/0x10 > > [<c068007b>] ? start_secondary+0x19b/0x1c0 > > [<c06800d8>] ? check_tsc_warp+0x38/0x1e0 > > [<c01b24c0>] ? cache_free_debugcheck+0x2e0/0x310 > > [<c0683402>] ? wait_for_completion+0x12/0x20 > > [<c018976e>] ? mempool_free_slab+0xe/0x10 > > [<c036a7b0>] ? blk_end_sync_rq+0x0/0x30 > > [<c01b3c9c>] ? kmem_cache_free+0x5c/0x200 > > [<c018976e>] ? mempool_free_slab+0xe/0x10 > > [<c018979c>] ? mempool_free+0x2c/0x90 > > [<c0367252>] ? __blk_put_request+0x62/0x90 > > [<c036764c>] ? blk_put_request+0x2c/0x50 > > [<c0470732>] ? generic_ide_resume+0xa2/0xf0 > > [<c03f767e>] ? device_resume+0x32e/0x380 > > [<c01687d1>] ? hibernation_snapshot+0xa1/0x220 > > [<c013b57b>] ? printk+0x1b/0x20 > > [<c0168a30>] ? hibernate+0xe0/0x180 > > [<c01674e0>] ? state_store+0x0/0xd0 > > [<c016759f>] ? state_store+0xbf/0xd0 > > [<c01674e0>] ? state_store+0x0/0xd0 > > [<c03749d4>] ? kobj_attr_store+0x24/0x30 > > [<c01f8ef2>] ? sysfs_write_file+0xa2/0x100 > > [<c01ba946>] ? vfs_write+0x96/0x130 > > [<c01f8e50>] ? sysfs_write_file+0x0/0x100 > > [<c01bae8d>] ? sys_write+0x3d/0x70 > > [<c0104f3b>] ? sysenter_do_call+0x12/0x3f > > ======================= > > Code: c0 e8 10 0d db ff 8b 13 eb 97 8d b6 00 00 00 00 8d bf 00 00 00 00 55 89 e5 > > 83 ec 24 89 5d f4 89 cb 89 75 f8 89 d6 89 7d fc 89 c7 <8b> 41 04 39 d0 75 1d 8b > > 06 39 d8 75 41 89 7b 04 89 1f 8b 5d f4 > > EIP: [<c0389845>] __list_add+0x15/0x90 SS:ESP 0068:f25b16b8 > > I don't know the suspend code at all but the bogus pointer coming from > hibernation_snapshot() seems suspicious. Which one? > Another possibility is that the block layer is doing something strange here. > Dunno. I bet on the IDE driver. Thanks, Rafael ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: latest -git: suspend: unable to handle kernel paging request (was Re: no_console_suspend doesn't work?) 2008-08-21 22:10 ` Rafael J. Wysocki @ 2008-08-22 7:02 ` Pekka Enberg 2008-08-22 9:38 ` Pavel Machek 1 sibling, 0 replies; 21+ messages in thread From: Pekka Enberg @ 2008-08-22 7:02 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Vegard Nossum, Linux Kernel Mailing List, Andrew Morton, Jens Axboe Hi Rafael, On Fri, Aug 22, 2008 at 1:10 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote: >> I don't know the suspend code at all but the bogus pointer coming from >> hibernation_snapshot() seems suspicious. > > Which one? This trace here (which has IDE too): > > Call Trace: > > [<c0683402>] ? wait_for_completion+0x12/0x20 > > Here the block layer is passing a pointer to a non-slab page to > kmem_cache_free(). Assuming we can rely on the stack trace, of > course... > > > [<c018976e>] ? mempool_free_slab+0xe/0x10 > > [<c036a7b0>] ? blk_end_sync_rq+0x0/0x30 > > [<c01b3c9c>] ? kmem_cache_free+0x5c/0x200 > > [<c018976e>] ? mempool_free_slab+0xe/0x10 > > [<c018979c>] ? mempool_free+0x2c/0x90 > > [<c0367252>] ? __blk_put_request+0x62/0x90 > > [<c036764c>] ? blk_put_request+0x2c/0x50 > > [<c0470732>] ? generic_ide_resume+0xa2/0xf0 <-- IDE > > [<c03f767e>] ? device_resume+0x32e/0x380 > > [<c01687d1>] ? hibernation_snapshot+0xa1/0x220 <--- Suspend > > [<c013b57b>] ? printk+0x1b/0x20 > > [<c0168a30>] ? hibernate+0xe0/0x180 > > [<c01674e0>] ? state_store+0x0/0xd0 > > [<c016759f>] ? state_store+0xbf/0xd0 > > [<c01674e0>] ? state_store+0x0/0xd0 > > [<c03749d4>] ? kobj_attr_store+0x24/0x30 > > [<c01f8ef2>] ? sysfs_write_file+0xa2/0x100 > > [<c01ba946>] ? vfs_write+0x96/0x130 > > [<c01f8e50>] ? sysfs_write_file+0x0/0x100 > > [<c01bae8d>] ? sys_write+0x3d/0x70 > > [<c0104f3b>] ? sysenter_do_call+0x12/0x3f But like I said, the stack traces seem more unreliable than usual so I could be totally wrong here. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: latest -git: suspend: unable to handle kernel paging request (was Re: no_console_suspend doesn't work?) 2008-08-21 22:10 ` Rafael J. Wysocki 2008-08-22 7:02 ` Pekka Enberg @ 2008-08-22 9:38 ` Pavel Machek 1 sibling, 0 replies; 21+ messages in thread From: Pavel Machek @ 2008-08-22 9:38 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Pekka Enberg, Vegard Nossum, Linux Kernel Mailing List, Andrew Morton, Jens Axboe Hi! > > Another possibility is that the block layer is doing something strange here. > > Dunno. > > I bet on the IDE driver. Should be easy enough to switch to libata to verify this, right? Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: latest -git: suspend: unable to handle kernel paging request (was Re: no_console_suspend doesn't work?) 2008-08-21 18:21 ` Vegard Nossum 2008-08-21 18:45 ` Vegard Nossum @ 2008-08-21 21:27 ` Pekka Enberg 2008-08-21 22:16 ` Rafael J. Wysocki 1 sibling, 1 reply; 21+ messages in thread From: Pekka Enberg @ 2008-08-21 21:27 UTC (permalink / raw) To: Vegard Nossum Cc: Rafael J. Wysocki, Linux Kernel Mailing List, Andrew Morton, Jens Axboe On Thu, Aug 21, 2008 at 9:21 PM, Vegard Nossum <vegard.nossum@gmail.com> wrote: > I got a "proper" oops this time (brace yourself, it's long ;-)), and > it does not have serial on the stack: > > BUG: unable to handle kernel paging request at 00100104 > IP: [<c038ad65>] __list_add+0x15/0x90 > *pdpt = 00000000325bb001 *pde = 0000000000000000 > Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC > Pid: 3597, comm: bash Not tainted (2.6.27-rc4-00003-ga798564-dirty #30) > EIP: 0060:[<c038ad65>] EFLAGS: 00210082 CPU: 0 > EIP is at __list_add+0x15/0x90 > EAX: f7455764 EBX: 00100100 ECX: 00100100 EDX: f7455764 > ESI: f7455764 EDI: f7455764 EBP: f25e7c18 ESP: f25e7bf4 > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 > Process bash (pid: 3597, ti=f25e6000 task=f25aa700 task.ti=f25e6000) > Stack: f68120bc f6c00400 c0687133 00000000 00000002 00000000 f7455734 f7455764 > f7455764 f25e7c40 c018b954 0000001f 00000000 f6c00400 f6c00444 00000001 > f68120bc f68120b0 00200002 f25e7cb4 c018d317 f68120bc 00000000 f25aac0c > Call Trace: > [<c0687133>] ? _spin_lock+0x63/0x70 Corruption in the page allocator freelists. > [<c018b954>] ? rmqueue_bulk+0x54/0x80 > [<c018d317>] ? get_page_from_freelist+0x5a7/0x720 > [<c01600ea>] ? __lock_acquire+0x27a/0xa00 > [<c018dd50>] ? __alloc_pages_internal+0xa0/0x450 > [<c01acd4b>] ? alloc_pages_current+0x7b/0xc0 > [<c01b37fb>] ? new_slab+0x1bb/0x2d0 > [<c0687877>] ? _spin_unlock+0x27/0x50 > [<c01b40ca>] ? __slab_alloc+0x32a/0x4e0 > [<c010b335>] ? native_sched_clock+0xb5/0x110 > [<c01b4424>] ? kmem_cache_alloc+0xb4/0xe0 > [<c018969e>] ? mempool_alloc_slab+0xe/0x10 > [<c018969e>] ? mempool_alloc_slab+0xe/0x10 > [<c018969e>] ? mempool_alloc_slab+0xe/0x10 > [<c01897a1>] ? mempool_alloc+0x31/0xf0 > [<c015e884>] ? trace_hardirqs_on_caller+0xd4/0x160 > [<c015e91b>] ? trace_hardirqs_on+0xb/0x10 > [<c0368c7e>] ? get_request+0xae/0x2c0 > [<c036935c>] ? get_request_wait+0x1c/0xd0 > [<c0687462>] ? _spin_lock_irq+0x72/0x80 > [<c0369442>] ? blk_get_request+0x32/0x70 > [<c0471c1c>] ? generic_ide_resume+0x5c/0xf0 > [<c03f8bde>] ? device_resume+0x32e/0x380 > [<c0168791>] ? hibernation_snapshot+0xa1/0x220 > [<c013b55b>] ? printk+0x1b/0x20 > [<c01689f0>] ? hibernate+0xe0/0x180 > [<c01674a0>] ? state_store+0x0/0xd0 > [<c016755f>] ? state_store+0xbf/0xd0 > [<c01674a0>] ? state_store+0x0/0xd0 > [<c0375ef4>] ? kobj_attr_store+0x24/0x30 > [<c01fa432>] ? sysfs_write_file+0xa2/0x100 > [<c01bbf06>] ? vfs_write+0x96/0x130 > [<c01fa390>] ? sysfs_write_file+0x0/0x100 > [<c01bc44d>] ? sys_write+0x3d/0x70 > [<c0104f3b>] ? sysenter_do_call+0x12/0x3f > ======================= > Code: c0 e8 d0 f7 da ff 8b 13 eb 97 8d b6 00 00 00 00 8d bf 00 00 00 > 00 55 89 e5 83 ec 24 89 5d f4 89 cb 89 75 f8 89 d6 89 7d fc 89 c7 <8b> > 41 04 39 d0 75 1d 8b 06 39 d8 75 41 89 7b 04 89 1f 8b 5d f4 > EIP: [<c038ad65>] __list_add+0x15/0x90 SS:ESP 0068:f25e7bf4 > ---[ end trace e3ed674f2f20c5d3 ]--- > note: bash[3597] exited with preempt_count 2 > Eeek! page_mapcount(page) went negative! (-1) > page pfn = 3e74f > page->flags = 210007c > page->count = 1 > page->mapping = f6439028 > vma->vm_ops = generic_file_vm_ops+0x0/0x20 > vma->vm_ops->fault = filemap_fault+0x0/0x480 > vma->vm_file->f_op->mmap = generic_file_mmap+0x0/0x50 > ------------[ cut here ]------------ > kernel BUG at /uio/arkimedes/s29/vegardno/git-working/linux-2.6/mm/rmap.c:662! > invalid opcode: 0000 [#2] PREEMPT SMP DEBUG_PAGEALLOC > Pid: 3597, comm: bash Tainted: G D (2.6.27-rc4-00003-ga798564-dirty #30) > EIP: 0060:[<c01a0f39>] EFLAGS: 00210286 CPU: 0 > EIP is at page_remove_rmap+0x109/0x120 > EAX: 0000003b EBX: f7aa5684 ECX: f25e6000 EDX: 00000005 > ESI: f25b94c8 EDI: 3e74f025 EBP: f25e7990 ESP: f25e7980 > DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 > Process bash (pid: 3597, ti=f25e6000 task=f25aa700 task.ti=f25e6000) > Stack: c077ed7c f6439028 00000000 00000025 f25e7a38 c0198721 3e74f025 00000000 > f25e79c0 c015c5ea 00000025 00000000 325a3067 00000000 00000000 f25b94c8 > f25e7a50 00000000 00000000 3e74f025 00000000 00007000 0097a000 00000000 > Call Trace: No idea what happened here but it looks bad :-) > [<c0198721>] ? unmap_vmas+0x4b1/0x8b0 > [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 > [<c019d504>] ? exit_mmap+0x84/0x120 > [<c0138538>] ? mmput+0x48/0xa0 > [<c013c3d7>] ? exit_mm+0xe7/0x110 > [<c013d7a4>] ? do_exit+0x184/0x890 > [<c013b55b>] ? printk+0x1b/0x20 > [<c013a50a>] ? print_oops_end_marker+0x2a/0x30 > [<c01060f1>] ? oops_end+0xb1/0xc0 > [<c01067c0>] ? die+0x50/0x70 > [<c0122b4f>] ? do_page_fault+0x1ef/0xa20 > [<c010b335>] ? native_sched_clock+0xb5/0x110 > [<c01600ea>] ? __lock_acquire+0x27a/0xa00 > [<c0122960>] ? do_page_fault+0x0/0xa20 > [<c0687d3a>] ? error_code+0x72/0x78 > [<c038ad65>] ? __list_add+0x15/0x90 > [<c0687133>] ? _spin_lock+0x63/0x70 > [<c018b954>] ? rmqueue_bulk+0x54/0x80 > [<c018d317>] ? get_page_from_freelist+0x5a7/0x720 > [<c01600ea>] ? __lock_acquire+0x27a/0xa00 > [<c018dd50>] ? __alloc_pages_internal+0xa0/0x450 > [<c01acd4b>] ? alloc_pages_current+0x7b/0xc0 > [<c01b37fb>] ? new_slab+0x1bb/0x2d0 > [<c0687877>] ? _spin_unlock+0x27/0x50 > [<c01b40ca>] ? __slab_alloc+0x32a/0x4e0 > [<c010b335>] ? native_sched_clock+0xb5/0x110 > [<c01b4424>] ? kmem_cache_alloc+0xb4/0xe0 > [<c018969e>] ? mempool_alloc_slab+0xe/0x10 > [<c018969e>] ? mempool_alloc_slab+0xe/0x10 > [<c018969e>] ? mempool_alloc_slab+0xe/0x10 > [<c01897a1>] ? mempool_alloc+0x31/0xf0 > [<c015e884>] ? trace_hardirqs_on_caller+0xd4/0x160 > [<c015e91b>] ? trace_hardirqs_on+0xb/0x10 > [<c0368c7e>] ? get_request+0xae/0x2c0 > [<c036935c>] ? get_request_wait+0x1c/0xd0 > [<c0687462>] ? _spin_lock_irq+0x72/0x80 > [<c0369442>] ? blk_get_request+0x32/0x70 > [<c0471c1c>] ? generic_ide_resume+0x5c/0xf0 > [<c03f8bde>] ? device_resume+0x32e/0x380 > [<c0168791>] ? hibernation_snapshot+0xa1/0x220 > [<c013b55b>] ? printk+0x1b/0x20 > [<c01689f0>] ? hibernate+0xe0/0x180 > [<c01674a0>] ? state_store+0x0/0xd0 > [<c016755f>] ? state_store+0xbf/0xd0 > [<c01674a0>] ? state_store+0x0/0xd0 > [<c0375ef4>] ? kobj_attr_store+0x24/0x30 > [<c01fa432>] ? sysfs_write_file+0xa2/0x100 > [<c01bbf06>] ? vfs_write+0x96/0x130 > [<c01fa390>] ? sysfs_write_file+0x0/0x100 > [<c01bc44d>] ? sys_write+0x3d/0x70 > [<c0104f3b>] ? sysenter_do_call+0x12/0x3f > ======================= > Code: c0 74 0d 8b 50 08 b8 ac ed 77 c0 e8 82 60 fc ff 8b 46 4c 85 c0 > 74 14 8b 40 10 85 c0 74 0d 8b 50 2c b8 d8 d1 77 c0 e8 67 60 fc ff <0f> > 0b eb fe 8b 53 0c eb 95 8d b4 26 00 00 00 00 8d bc 27 00 00 > EIP: [<c01a0f39>] page_remove_rmap+0x109/0x120 SS:ESP 0068:f25e7980 > ---[ end trace e3ed674f2f20c5d3 ]--- > Fixing recursive fault but reboot is needed! > ============================================================================= > BUG blkdev_ioc: Invalid object pointer 0xf5cdaca8 > ----------------------------------------------------------------------------- Ok, here we have the block layer passing a bad pointer to SLUB this time. And it's also from the suspend code (although it's the resume path this time). As we never see an oops from the block layer first, it's possible that someone else corrupted everything and it just shows up in the block layer. Maybe something worth investigating, though. > INFO: Slab 0xf789e318 objects=14 used=14 fp=0x00000000 flags=0x2082083 > Pid: 3597, comm: bash Tainted: G D 2.6.27-rc4-00003-ga798564-dirty #30 > [<c01b2576>] slab_err+0x46/0x50 > [<c01b2766>] ? check_slab+0xd6/0xf0 > [<c0181aef>] ? call_rcu+0x6f/0x80 > [<c015e91b>] ? trace_hardirqs_on+0xb/0x10 > [<c01b3c78>] __slab_free+0x238/0x360 > [<c01b4749>] kmem_cache_free+0xa9/0x120 > [<c036b773>] ? put_io_context+0x53/0x70 > [<c036b773>] ? put_io_context+0x53/0x70 > [<c036b773>] put_io_context+0x53/0x70 > [<c036b82e>] exit_io_context+0x6e/0x80 > [<c013de6e>] do_exit+0x84e/0x890 > [<c037b794>] ? trace_hardirqs_on_thunk+0xc/0x10 > [<c013b55b>] ? printk+0x1b/0x20 > [<c013a50a>] ? print_oops_end_marker+0x2a/0x30 > [<c01060f1>] oops_end+0xb1/0xc0 > [<c01067c0>] die+0x50/0x70 > [<c0106871>] do_trap+0x91/0xc0 > [<c0106940>] ? do_invalid_op+0x0/0xa0 > [<c01069c8>] do_invalid_op+0x88/0xa0 > [<c01a0f39>] ? page_remove_rmap+0x109/0x120 > [<c013b2d1>] ? vprintk+0x151/0x3c0 > [<c013b45b>] ? vprintk+0x2db/0x3c0 > [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 > [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 > [<c0687d3a>] error_code+0x72/0x78 > [<c013007b>] ? sched_rt_period_timer+0x21b/0x270 > [<c01a0f39>] ? page_remove_rmap+0x109/0x120 > [<c0198721>] unmap_vmas+0x4b1/0x8b0 > [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 > [<c019d504>] exit_mmap+0x84/0x120 > [<c0138538>] mmput+0x48/0xa0 > [<c013c3d7>] exit_mm+0xe7/0x110 > [<c013d7a4>] do_exit+0x184/0x890 > [<c013b55b>] ? printk+0x1b/0x20 > [<c013a50a>] ? print_oops_end_marker+0x2a/0x30 > [<c01060f1>] oops_end+0xb1/0xc0 > [<c01067c0>] die+0x50/0x70 > [<c0122b4f>] do_page_fault+0x1ef/0xa20 > [<c010b335>] ? native_sched_clock+0xb5/0x110 > [<c01600ea>] ? __lock_acquire+0x27a/0xa00 > [<c0122960>] ? do_page_fault+0x0/0xa20 > [<c0687d3a>] error_code+0x72/0x78 > [<c038ad65>] ? __list_add+0x15/0x90 > [<c0687133>] ? _spin_lock+0x63/0x70 > [<c018b954>] rmqueue_bulk+0x54/0x80 > [<c018d317>] get_page_from_freelist+0x5a7/0x720 > [<c01600ea>] ? __lock_acquire+0x27a/0xa00 > [<c018dd50>] __alloc_pages_internal+0xa0/0x450 > [<c01acd4b>] alloc_pages_current+0x7b/0xc0 > [<c01b37fb>] new_slab+0x1bb/0x2d0 > [<c0687877>] ? _spin_unlock+0x27/0x50 > [<c01b40ca>] __slab_alloc+0x32a/0x4e0 > [<c010b335>] ? native_sched_clock+0xb5/0x110 > [<c01b4424>] kmem_cache_alloc+0xb4/0xe0 > [<c018969e>] ? mempool_alloc_slab+0xe/0x10 > [<c018969e>] ? mempool_alloc_slab+0xe/0x10 > [<c018969e>] mempool_alloc_slab+0xe/0x10 > [<c01897a1>] mempool_alloc+0x31/0xf0 > [<c015e884>] ? trace_hardirqs_on_caller+0xd4/0x160 > [<c015e91b>] ? trace_hardirqs_on+0xb/0x10 > [<c0368c7e>] get_request+0xae/0x2c0 > [<c036935c>] get_request_wait+0x1c/0xd0 > [<c0687462>] ? _spin_lock_irq+0x72/0x80 > [<c0369442>] blk_get_request+0x32/0x70 > [<c0471c1c>] generic_ide_resume+0x5c/0xf0 > [<c03f8bde>] device_resume+0x32e/0x380 > [<c0168791>] hibernation_snapshot+0xa1/0x220 > [<c013b55b>] ? printk+0x1b/0x20 > [<c01689f0>] hibernate+0xe0/0x180 > [<c01674a0>] ? state_store+0x0/0xd0 > [<c016755f>] state_store+0xbf/0xd0 > [<c01674a0>] ? state_store+0x0/0xd0 > [<c0375ef4>] kobj_attr_store+0x24/0x30 > [<c01fa432>] sysfs_write_file+0xa2/0x100 > [<c01bbf06>] vfs_write+0x96/0x130 > [<c01fa390>] ? sysfs_write_file+0x0/0x100 > [<c01bc44d>] sys_write+0x3d/0x70 > [<c0104f3b>] sysenter_do_call+0x12/0x3f > ======================= > FIX blkdev_ioc: Object at 0xf5cdaca8 not freed > BUG: scheduling while atomic: bash/3597/0x00000006 > INFO: lockdep is turned off. > Pid: 3597, comm: bash Tainted: G D 2.6.27-rc4-00003-ga798564-dirty #30 > [<c0135467>] __schedule_bug+0x77/0x80 > [<c0684ce2>] schedule+0x852/0x8f0 > [<c010509e>] ? restore_nocheck_notrace+0x0/0xe > [<c01b4779>] ? kmem_cache_free+0xd9/0x120 > [<c036b773>] ? put_io_context+0x53/0x70 > [<c036b773>] ? put_io_context+0x53/0x70 > [<c013de81>] do_exit+0x861/0x890 > [<c037b794>] ? trace_hardirqs_on_thunk+0xc/0x10 > [<c013b55b>] ? printk+0x1b/0x20 > [<c013a50a>] ? print_oops_end_marker+0x2a/0x30 > [<c01060f1>] oops_end+0xb1/0xc0 > [<c01067c0>] die+0x50/0x70 > [<c0106871>] do_trap+0x91/0xc0 > [<c0106940>] ? do_invalid_op+0x0/0xa0 > [<c01069c8>] do_invalid_op+0x88/0xa0 > [<c01a0f39>] ? page_remove_rmap+0x109/0x120 > [<c013b2d1>] ? vprintk+0x151/0x3c0 > [<c013b45b>] ? vprintk+0x2db/0x3c0 > [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 > [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 > [<c0687d3a>] error_code+0x72/0x78 > [<c013007b>] ? sched_rt_period_timer+0x21b/0x270 > [<c01a0f39>] ? page_remove_rmap+0x109/0x120 > [<c0198721>] unmap_vmas+0x4b1/0x8b0 > [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 > [<c019d504>] exit_mmap+0x84/0x120 > [<c0138538>] mmput+0x48/0xa0 > [<c013c3d7>] exit_mm+0xe7/0x110 > [<c013d7a4>] do_exit+0x184/0x890 > [<c013b55b>] ? printk+0x1b/0x20 > [<c013a50a>] ? print_oops_end_marker+0x2a/0x30 > [<c01060f1>] oops_end+0xb1/0xc0 > [<c01067c0>] die+0x50/0x70 > [<c0122b4f>] do_page_fault+0x1ef/0xa20 > [<c010b335>] ? native_sched_clock+0xb5/0x110 > [<c01600ea>] ? __lock_acquire+0x27a/0xa00 > [<c0122960>] ? do_page_fault+0x0/0xa20 > [<c0687d3a>] error_code+0x72/0x78 > [<c038ad65>] ? __list_add+0x15/0x90 > [<c0687133>] ? _spin_lock+0x63/0x70 > [<c018b954>] rmqueue_bulk+0x54/0x80 > [<c018d317>] get_page_from_freelist+0x5a7/0x720 > [<c01600ea>] ? __lock_acquire+0x27a/0xa00 > [<c018dd50>] __alloc_pages_internal+0xa0/0x450 > [<c01acd4b>] alloc_pages_current+0x7b/0xc0 > [<c01b37fb>] new_slab+0x1bb/0x2d0 > [<c0687877>] ? _spin_unlock+0x27/0x50 > [<c01b40ca>] __slab_alloc+0x32a/0x4e0 > [<c010b335>] ? native_sched_clock+0xb5/0x110 > [<c01b4424>] kmem_cache_alloc+0xb4/0xe0 > [<c018969e>] ? mempool_alloc_slab+0xe/0x10 > [<c018969e>] ? mempool_alloc_slab+0xe/0x10 > [<c018969e>] mempool_alloc_slab+0xe/0x10 > [<c01897a1>] mempool_alloc+0x31/0xf0 > [<c015e884>] ? trace_hardirqs_on_caller+0xd4/0x160 > [<c015e91b>] ? trace_hardirqs_on+0xb/0x10 > [<c0368c7e>] get_request+0xae/0x2c0 > [<c036935c>] get_request_wait+0x1c/0xd0 > [<c0687462>] ? _spin_lock_irq+0x72/0x80 > [<c0369442>] blk_get_request+0x32/0x70 > [<c0471c1c>] generic_ide_resume+0x5c/0xf0 > [<c03f8bde>] device_resume+0x32e/0x380 > [<c0168791>] hibernation_snapshot+0xa1/0x220 > [<c013b55b>] ? printk+0x1b/0x20 > [<c01689f0>] hibernate+0xe0/0x180 > [<c01674a0>] ? state_store+0x0/0xd0 > [<c016755f>] state_store+0xbf/0xd0 > [<c01674a0>] ? state_store+0x0/0xd0 > [<c0375ef4>] kobj_attr_store+0x24/0x30 > [<c01fa432>] sysfs_write_file+0xa2/0x100 > > I can look up addresses in the vmlinux for accurate line numbers if needed. > > > Vegard > > -- > "The animistic metaphor of the bug that maliciously sneaked in while > the programmer was not looking is intellectually dishonest as it > disguises that the error is the programmer's own creation." > -- E. W. Dijkstra, EWD1036 > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: latest -git: suspend: unable to handle kernel paging request (was Re: no_console_suspend doesn't work?) 2008-08-21 21:27 ` Pekka Enberg @ 2008-08-21 22:16 ` Rafael J. Wysocki 2008-08-22 4:45 ` Vegard Nossum 0 siblings, 1 reply; 21+ messages in thread From: Rafael J. Wysocki @ 2008-08-21 22:16 UTC (permalink / raw) To: Pekka Enberg, Vegard Nossum Cc: Linux Kernel Mailing List, Andrew Morton, Jens Axboe, Bartlomiej Zolnierkiewicz On Thursday, 21 of August 2008, Pekka Enberg wrote: > On Thu, Aug 21, 2008 at 9:21 PM, Vegard Nossum <vegard.nossum@gmail.com> wrote: > > I got a "proper" oops this time (brace yourself, it's long ;-)), and > > it does not have serial on the stack: > > > > BUG: unable to handle kernel paging request at 00100104 > > IP: [<c038ad65>] __list_add+0x15/0x90 > > *pdpt = 00000000325bb001 *pde = 0000000000000000 > > Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC > > Pid: 3597, comm: bash Not tainted (2.6.27-rc4-00003-ga798564-dirty #30) > > EIP: 0060:[<c038ad65>] EFLAGS: 00210082 CPU: 0 > > EIP is at __list_add+0x15/0x90 > > EAX: f7455764 EBX: 00100100 ECX: 00100100 EDX: f7455764 > > ESI: f7455764 EDI: f7455764 EBP: f25e7c18 ESP: f25e7bf4 > > DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 > > Process bash (pid: 3597, ti=f25e6000 task=f25aa700 task.ti=f25e6000) > > Stack: f68120bc f6c00400 c0687133 00000000 00000002 00000000 f7455734 f7455764 > > f7455764 f25e7c40 c018b954 0000001f 00000000 f6c00400 f6c00444 00000001 > > f68120bc f68120b0 00200002 f25e7cb4 c018d317 f68120bc 00000000 f25aac0c > > Call Trace: > > [<c0687133>] ? _spin_lock+0x63/0x70 > > Corruption in the page allocator freelists. > > > [<c018b954>] ? rmqueue_bulk+0x54/0x80 > > [<c018d317>] ? get_page_from_freelist+0x5a7/0x720 > > [<c01600ea>] ? __lock_acquire+0x27a/0xa00 > > [<c018dd50>] ? __alloc_pages_internal+0xa0/0x450 > > [<c01acd4b>] ? alloc_pages_current+0x7b/0xc0 > > [<c01b37fb>] ? new_slab+0x1bb/0x2d0 > > [<c0687877>] ? _spin_unlock+0x27/0x50 > > [<c01b40ca>] ? __slab_alloc+0x32a/0x4e0 > > [<c010b335>] ? native_sched_clock+0xb5/0x110 > > [<c01b4424>] ? kmem_cache_alloc+0xb4/0xe0 > > [<c018969e>] ? mempool_alloc_slab+0xe/0x10 > > [<c018969e>] ? mempool_alloc_slab+0xe/0x10 > > [<c018969e>] ? mempool_alloc_slab+0xe/0x10 > > [<c01897a1>] ? mempool_alloc+0x31/0xf0 > > [<c015e884>] ? trace_hardirqs_on_caller+0xd4/0x160 > > [<c015e91b>] ? trace_hardirqs_on+0xb/0x10 > > [<c0368c7e>] ? get_request+0xae/0x2c0 > > [<c036935c>] ? get_request_wait+0x1c/0xd0 > > [<c0687462>] ? _spin_lock_irq+0x72/0x80 > > [<c0369442>] ? blk_get_request+0x32/0x70 > > [<c0471c1c>] ? generic_ide_resume+0x5c/0xf0 > > [<c03f8bde>] ? device_resume+0x32e/0x380 > > [<c0168791>] ? hibernation_snapshot+0xa1/0x220 > > [<c013b55b>] ? printk+0x1b/0x20 > > [<c01689f0>] ? hibernate+0xe0/0x180 > > [<c01674a0>] ? state_store+0x0/0xd0 > > [<c016755f>] ? state_store+0xbf/0xd0 > > [<c01674a0>] ? state_store+0x0/0xd0 > > [<c0375ef4>] ? kobj_attr_store+0x24/0x30 > > [<c01fa432>] ? sysfs_write_file+0xa2/0x100 > > [<c01bbf06>] ? vfs_write+0x96/0x130 > > [<c01fa390>] ? sysfs_write_file+0x0/0x100 > > [<c01bc44d>] ? sys_write+0x3d/0x70 > > [<c0104f3b>] ? sysenter_do_call+0x12/0x3f > > ======================= > > Code: c0 e8 d0 f7 da ff 8b 13 eb 97 8d b6 00 00 00 00 8d bf 00 00 00 > > 00 55 89 e5 83 ec 24 89 5d f4 89 cb 89 75 f8 89 d6 89 7d fc 89 c7 <8b> > > 41 04 39 d0 75 1d 8b 06 39 d8 75 41 89 7b 04 89 1f 8b 5d f4 > > EIP: [<c038ad65>] __list_add+0x15/0x90 SS:ESP 0068:f25e7bf4 > > ---[ end trace e3ed674f2f20c5d3 ]--- > > note: bash[3597] exited with preempt_count 2 > > Eeek! page_mapcount(page) went negative! (-1) > > page pfn = 3e74f > > page->flags = 210007c > > page->count = 1 > > page->mapping = f6439028 > > vma->vm_ops = generic_file_vm_ops+0x0/0x20 > > vma->vm_ops->fault = filemap_fault+0x0/0x480 > > vma->vm_file->f_op->mmap = generic_file_mmap+0x0/0x50 > > ------------[ cut here ]------------ > > kernel BUG at /uio/arkimedes/s29/vegardno/git-working/linux-2.6/mm/rmap.c:662! > > invalid opcode: 0000 [#2] PREEMPT SMP DEBUG_PAGEALLOC > > Pid: 3597, comm: bash Tainted: G D (2.6.27-rc4-00003-ga798564-dirty #30) > > EIP: 0060:[<c01a0f39>] EFLAGS: 00210286 CPU: 0 > > EIP is at page_remove_rmap+0x109/0x120 > > EAX: 0000003b EBX: f7aa5684 ECX: f25e6000 EDX: 00000005 > > ESI: f25b94c8 EDI: 3e74f025 EBP: f25e7990 ESP: f25e7980 > > DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 > > Process bash (pid: 3597, ti=f25e6000 task=f25aa700 task.ti=f25e6000) > > Stack: c077ed7c f6439028 00000000 00000025 f25e7a38 c0198721 3e74f025 00000000 > > f25e79c0 c015c5ea 00000025 00000000 325a3067 00000000 00000000 f25b94c8 > > f25e7a50 00000000 00000000 3e74f025 00000000 00007000 0097a000 00000000 > > Call Trace: > > No idea what happened here but it looks bad :-) > > > [<c0198721>] ? unmap_vmas+0x4b1/0x8b0 > > [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 > > [<c019d504>] ? exit_mmap+0x84/0x120 > > [<c0138538>] ? mmput+0x48/0xa0 > > [<c013c3d7>] ? exit_mm+0xe7/0x110 > > [<c013d7a4>] ? do_exit+0x184/0x890 > > [<c013b55b>] ? printk+0x1b/0x20 > > [<c013a50a>] ? print_oops_end_marker+0x2a/0x30 > > [<c01060f1>] ? oops_end+0xb1/0xc0 > > [<c01067c0>] ? die+0x50/0x70 > > [<c0122b4f>] ? do_page_fault+0x1ef/0xa20 > > [<c010b335>] ? native_sched_clock+0xb5/0x110 > > [<c01600ea>] ? __lock_acquire+0x27a/0xa00 > > [<c0122960>] ? do_page_fault+0x0/0xa20 > > [<c0687d3a>] ? error_code+0x72/0x78 > > [<c038ad65>] ? __list_add+0x15/0x90 > > [<c0687133>] ? _spin_lock+0x63/0x70 > > [<c018b954>] ? rmqueue_bulk+0x54/0x80 > > [<c018d317>] ? get_page_from_freelist+0x5a7/0x720 > > [<c01600ea>] ? __lock_acquire+0x27a/0xa00 > > [<c018dd50>] ? __alloc_pages_internal+0xa0/0x450 > > [<c01acd4b>] ? alloc_pages_current+0x7b/0xc0 > > [<c01b37fb>] ? new_slab+0x1bb/0x2d0 > > [<c0687877>] ? _spin_unlock+0x27/0x50 > > [<c01b40ca>] ? __slab_alloc+0x32a/0x4e0 > > [<c010b335>] ? native_sched_clock+0xb5/0x110 > > [<c01b4424>] ? kmem_cache_alloc+0xb4/0xe0 > > [<c018969e>] ? mempool_alloc_slab+0xe/0x10 > > [<c018969e>] ? mempool_alloc_slab+0xe/0x10 > > [<c018969e>] ? mempool_alloc_slab+0xe/0x10 > > [<c01897a1>] ? mempool_alloc+0x31/0xf0 > > [<c015e884>] ? trace_hardirqs_on_caller+0xd4/0x160 > > [<c015e91b>] ? trace_hardirqs_on+0xb/0x10 > > [<c0368c7e>] ? get_request+0xae/0x2c0 > > [<c036935c>] ? get_request_wait+0x1c/0xd0 > > [<c0687462>] ? _spin_lock_irq+0x72/0x80 > > [<c0369442>] ? blk_get_request+0x32/0x70 > > [<c0471c1c>] ? generic_ide_resume+0x5c/0xf0 > > [<c03f8bde>] ? device_resume+0x32e/0x380 > > [<c0168791>] ? hibernation_snapshot+0xa1/0x220 > > [<c013b55b>] ? printk+0x1b/0x20 > > [<c01689f0>] ? hibernate+0xe0/0x180 > > [<c01674a0>] ? state_store+0x0/0xd0 > > [<c016755f>] ? state_store+0xbf/0xd0 > > [<c01674a0>] ? state_store+0x0/0xd0 > > [<c0375ef4>] ? kobj_attr_store+0x24/0x30 > > [<c01fa432>] ? sysfs_write_file+0xa2/0x100 > > [<c01bbf06>] ? vfs_write+0x96/0x130 > > [<c01fa390>] ? sysfs_write_file+0x0/0x100 > > [<c01bc44d>] ? sys_write+0x3d/0x70 > > [<c0104f3b>] ? sysenter_do_call+0x12/0x3f > > ======================= > > Code: c0 74 0d 8b 50 08 b8 ac ed 77 c0 e8 82 60 fc ff 8b 46 4c 85 c0 > > 74 14 8b 40 10 85 c0 74 0d 8b 50 2c b8 d8 d1 77 c0 e8 67 60 fc ff <0f> > > 0b eb fe 8b 53 0c eb 95 8d b4 26 00 00 00 00 8d bc 27 00 00 > > EIP: [<c01a0f39>] page_remove_rmap+0x109/0x120 SS:ESP 0068:f25e7980 > > ---[ end trace e3ed674f2f20c5d3 ]--- > > Fixing recursive fault but reboot is needed! > > ============================================================================= > > BUG blkdev_ioc: Invalid object pointer 0xf5cdaca8 > > ----------------------------------------------------------------------------- > > Ok, here we have the block layer passing a bad pointer to SLUB this > time. And it's also from the suspend code (although it's the resume > path this time). As we never see an oops from the block layer first, > it's possible that someone else corrupted everything and it just shows > up in the block layer. Maybe something worth investigating, though. > > > INFO: Slab 0xf789e318 objects=14 used=14 fp=0x00000000 flags=0x2082083 > > Pid: 3597, comm: bash Tainted: G D 2.6.27-rc4-00003-ga798564-dirty #30 > > [<c01b2576>] slab_err+0x46/0x50 > > [<c01b2766>] ? check_slab+0xd6/0xf0 > > [<c0181aef>] ? call_rcu+0x6f/0x80 > > [<c015e91b>] ? trace_hardirqs_on+0xb/0x10 > > [<c01b3c78>] __slab_free+0x238/0x360 > > [<c01b4749>] kmem_cache_free+0xa9/0x120 > > [<c036b773>] ? put_io_context+0x53/0x70 > > [<c036b773>] ? put_io_context+0x53/0x70 > > [<c036b773>] put_io_context+0x53/0x70 > > [<c036b82e>] exit_io_context+0x6e/0x80 > > [<c013de6e>] do_exit+0x84e/0x890 > > [<c037b794>] ? trace_hardirqs_on_thunk+0xc/0x10 > > [<c013b55b>] ? printk+0x1b/0x20 > > [<c013a50a>] ? print_oops_end_marker+0x2a/0x30 > > [<c01060f1>] oops_end+0xb1/0xc0 > > [<c01067c0>] die+0x50/0x70 > > [<c0106871>] do_trap+0x91/0xc0 > > [<c0106940>] ? do_invalid_op+0x0/0xa0 > > [<c01069c8>] do_invalid_op+0x88/0xa0 > > [<c01a0f39>] ? page_remove_rmap+0x109/0x120 > > [<c013b2d1>] ? vprintk+0x151/0x3c0 > > [<c013b45b>] ? vprintk+0x2db/0x3c0 > > [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 > > [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 > > [<c0687d3a>] error_code+0x72/0x78 > > [<c013007b>] ? sched_rt_period_timer+0x21b/0x270 > > [<c01a0f39>] ? page_remove_rmap+0x109/0x120 > > [<c0198721>] unmap_vmas+0x4b1/0x8b0 > > [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 > > [<c019d504>] exit_mmap+0x84/0x120 > > [<c0138538>] mmput+0x48/0xa0 > > [<c013c3d7>] exit_mm+0xe7/0x110 > > [<c013d7a4>] do_exit+0x184/0x890 > > [<c013b55b>] ? printk+0x1b/0x20 > > [<c013a50a>] ? print_oops_end_marker+0x2a/0x30 > > [<c01060f1>] oops_end+0xb1/0xc0 > > [<c01067c0>] die+0x50/0x70 > > [<c0122b4f>] do_page_fault+0x1ef/0xa20 > > [<c010b335>] ? native_sched_clock+0xb5/0x110 > > [<c01600ea>] ? __lock_acquire+0x27a/0xa00 > > [<c0122960>] ? do_page_fault+0x0/0xa20 > > [<c0687d3a>] error_code+0x72/0x78 > > [<c038ad65>] ? __list_add+0x15/0x90 > > [<c0687133>] ? _spin_lock+0x63/0x70 > > [<c018b954>] rmqueue_bulk+0x54/0x80 > > [<c018d317>] get_page_from_freelist+0x5a7/0x720 > > [<c01600ea>] ? __lock_acquire+0x27a/0xa00 > > [<c018dd50>] __alloc_pages_internal+0xa0/0x450 > > [<c01acd4b>] alloc_pages_current+0x7b/0xc0 > > [<c01b37fb>] new_slab+0x1bb/0x2d0 > > [<c0687877>] ? _spin_unlock+0x27/0x50 > > [<c01b40ca>] __slab_alloc+0x32a/0x4e0 > > [<c010b335>] ? native_sched_clock+0xb5/0x110 > > [<c01b4424>] kmem_cache_alloc+0xb4/0xe0 > > [<c018969e>] ? mempool_alloc_slab+0xe/0x10 > > [<c018969e>] ? mempool_alloc_slab+0xe/0x10 > > [<c018969e>] mempool_alloc_slab+0xe/0x10 > > [<c01897a1>] mempool_alloc+0x31/0xf0 > > [<c015e884>] ? trace_hardirqs_on_caller+0xd4/0x160 > > [<c015e91b>] ? trace_hardirqs_on+0xb/0x10 > > [<c0368c7e>] get_request+0xae/0x2c0 > > [<c036935c>] get_request_wait+0x1c/0xd0 > > [<c0687462>] ? _spin_lock_irq+0x72/0x80 > > [<c0369442>] blk_get_request+0x32/0x70 > > [<c0471c1c>] generic_ide_resume+0x5c/0xf0 IDE again? Vegard, this is piix, isn't it? > > [<c03f8bde>] device_resume+0x32e/0x380 > > [<c0168791>] hibernation_snapshot+0xa1/0x220 > > [<c013b55b>] ? printk+0x1b/0x20 > > [<c01689f0>] hibernate+0xe0/0x180 > > [<c01674a0>] ? state_store+0x0/0xd0 > > [<c016755f>] state_store+0xbf/0xd0 > > [<c01674a0>] ? state_store+0x0/0xd0 > > [<c0375ef4>] kobj_attr_store+0x24/0x30 > > [<c01fa432>] sysfs_write_file+0xa2/0x100 > > [<c01bbf06>] vfs_write+0x96/0x130 > > [<c01fa390>] ? sysfs_write_file+0x0/0x100 > > [<c01bc44d>] sys_write+0x3d/0x70 > > [<c0104f3b>] sysenter_do_call+0x12/0x3f > > ======================= > > FIX blkdev_ioc: Object at 0xf5cdaca8 not freed > > BUG: scheduling while atomic: bash/3597/0x00000006 > > INFO: lockdep is turned off. > > Pid: 3597, comm: bash Tainted: G D 2.6.27-rc4-00003-ga798564-dirty #30 > > [<c0135467>] __schedule_bug+0x77/0x80 > > [<c0684ce2>] schedule+0x852/0x8f0 > > [<c010509e>] ? restore_nocheck_notrace+0x0/0xe > > [<c01b4779>] ? kmem_cache_free+0xd9/0x120 > > [<c036b773>] ? put_io_context+0x53/0x70 > > [<c036b773>] ? put_io_context+0x53/0x70 > > [<c013de81>] do_exit+0x861/0x890 > > [<c037b794>] ? trace_hardirqs_on_thunk+0xc/0x10 > > [<c013b55b>] ? printk+0x1b/0x20 > > [<c013a50a>] ? print_oops_end_marker+0x2a/0x30 > > [<c01060f1>] oops_end+0xb1/0xc0 > > [<c01067c0>] die+0x50/0x70 > > [<c0106871>] do_trap+0x91/0xc0 > > [<c0106940>] ? do_invalid_op+0x0/0xa0 > > [<c01069c8>] do_invalid_op+0x88/0xa0 > > [<c01a0f39>] ? page_remove_rmap+0x109/0x120 > > [<c013b2d1>] ? vprintk+0x151/0x3c0 > > [<c013b45b>] ? vprintk+0x2db/0x3c0 > > [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 > > [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 > > [<c0687d3a>] error_code+0x72/0x78 > > [<c013007b>] ? sched_rt_period_timer+0x21b/0x270 > > [<c01a0f39>] ? page_remove_rmap+0x109/0x120 > > [<c0198721>] unmap_vmas+0x4b1/0x8b0 > > [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 > > [<c019d504>] exit_mmap+0x84/0x120 > > [<c0138538>] mmput+0x48/0xa0 > > [<c013c3d7>] exit_mm+0xe7/0x110 > > [<c013d7a4>] do_exit+0x184/0x890 > > [<c013b55b>] ? printk+0x1b/0x20 > > [<c013a50a>] ? print_oops_end_marker+0x2a/0x30 > > [<c01060f1>] oops_end+0xb1/0xc0 > > [<c01067c0>] die+0x50/0x70 > > [<c0122b4f>] do_page_fault+0x1ef/0xa20 > > [<c010b335>] ? native_sched_clock+0xb5/0x110 > > [<c01600ea>] ? __lock_acquire+0x27a/0xa00 > > [<c0122960>] ? do_page_fault+0x0/0xa20 > > [<c0687d3a>] error_code+0x72/0x78 > > [<c038ad65>] ? __list_add+0x15/0x90 > > [<c0687133>] ? _spin_lock+0x63/0x70 > > [<c018b954>] rmqueue_bulk+0x54/0x80 > > [<c018d317>] get_page_from_freelist+0x5a7/0x720 > > [<c01600ea>] ? __lock_acquire+0x27a/0xa00 > > [<c018dd50>] __alloc_pages_internal+0xa0/0x450 > > [<c01acd4b>] alloc_pages_current+0x7b/0xc0 > > [<c01b37fb>] new_slab+0x1bb/0x2d0 > > [<c0687877>] ? _spin_unlock+0x27/0x50 > > [<c01b40ca>] __slab_alloc+0x32a/0x4e0 > > [<c010b335>] ? native_sched_clock+0xb5/0x110 > > [<c01b4424>] kmem_cache_alloc+0xb4/0xe0 > > [<c018969e>] ? mempool_alloc_slab+0xe/0x10 > > [<c018969e>] ? mempool_alloc_slab+0xe/0x10 > > [<c018969e>] mempool_alloc_slab+0xe/0x10 > > [<c01897a1>] mempool_alloc+0x31/0xf0 > > [<c015e884>] ? trace_hardirqs_on_caller+0xd4/0x160 > > [<c015e91b>] ? trace_hardirqs_on+0xb/0x10 > > [<c0368c7e>] get_request+0xae/0x2c0 > > [<c036935c>] get_request_wait+0x1c/0xd0 > > [<c0687462>] ? _spin_lock_irq+0x72/0x80 > > [<c0369442>] blk_get_request+0x32/0x70 > > [<c0471c1c>] generic_ide_resume+0x5c/0xf0 > > [<c03f8bde>] device_resume+0x32e/0x380 > > [<c0168791>] hibernation_snapshot+0xa1/0x220 > > [<c013b55b>] ? printk+0x1b/0x20 > > [<c01689f0>] hibernate+0xe0/0x180 > > [<c01674a0>] ? state_store+0x0/0xd0 > > [<c016755f>] state_store+0xbf/0xd0 > > [<c01674a0>] ? state_store+0x0/0xd0 > > [<c0375ef4>] kobj_attr_store+0x24/0x30 > > [<c01fa432>] sysfs_write_file+0xa2/0x100 > > > > I can look up addresses in the vmlinux for accurate line numbers if needed. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: latest -git: suspend: unable to handle kernel paging request (was Re: no_console_suspend doesn't work?) 2008-08-21 22:16 ` Rafael J. Wysocki @ 2008-08-22 4:45 ` Vegard Nossum 2008-08-22 9:28 ` Pavel Machek 2008-08-22 9:34 ` Rafael J. Wysocki 0 siblings, 2 replies; 21+ messages in thread From: Vegard Nossum @ 2008-08-22 4:45 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Pekka Enberg, Linux Kernel Mailing List, Andrew Morton, Jens Axboe, Bartlomiej Zolnierkiewicz On Fri, Aug 22, 2008 at 12:16 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > On Thursday, 21 of August 2008, Pekka Enberg wrote: >> > ============================================================================= >> > BUG blkdev_ioc: Invalid object pointer 0xf5cdaca8 >> > ----------------------------------------------------------------------------- >> >> Ok, here we have the block layer passing a bad pointer to SLUB this >> time. And it's also from the suspend code (although it's the resume >> path this time). As we never see an oops from the block layer first, >> it's possible that someone else corrupted everything and it just shows >> up in the block layer. Maybe something worth investigating, though. >> >> > INFO: Slab 0xf789e318 objects=14 used=14 fp=0x00000000 flags=0x2082083 >> > Pid: 3597, comm: bash Tainted: G D 2.6.27-rc4-00003-ga798564-dirty #30 >> > [<c01b2576>] slab_err+0x46/0x50 >> > [<c01b2766>] ? check_slab+0xd6/0xf0 >> > [<c0181aef>] ? call_rcu+0x6f/0x80 >> > [<c015e91b>] ? trace_hardirqs_on+0xb/0x10 >> > [<c01b3c78>] __slab_free+0x238/0x360 >> > [<c01b4749>] kmem_cache_free+0xa9/0x120 >> > [<c036b773>] ? put_io_context+0x53/0x70 >> > [<c036b773>] ? put_io_context+0x53/0x70 >> > [<c036b773>] put_io_context+0x53/0x70 >> > [<c036b82e>] exit_io_context+0x6e/0x80 >> > [<c013de6e>] do_exit+0x84e/0x890 >> > [<c037b794>] ? trace_hardirqs_on_thunk+0xc/0x10 >> > [<c013b55b>] ? printk+0x1b/0x20 >> > [<c013a50a>] ? print_oops_end_marker+0x2a/0x30 >> > [<c01060f1>] oops_end+0xb1/0xc0 >> > [<c01067c0>] die+0x50/0x70 >> > [<c0106871>] do_trap+0x91/0xc0 >> > [<c0106940>] ? do_invalid_op+0x0/0xa0 >> > [<c01069c8>] do_invalid_op+0x88/0xa0 >> > [<c01a0f39>] ? page_remove_rmap+0x109/0x120 >> > [<c013b2d1>] ? vprintk+0x151/0x3c0 >> > [<c013b45b>] ? vprintk+0x2db/0x3c0 >> > [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 >> > [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 >> > [<c0687d3a>] error_code+0x72/0x78 >> > [<c013007b>] ? sched_rt_period_timer+0x21b/0x270 >> > [<c01a0f39>] ? page_remove_rmap+0x109/0x120 >> > [<c0198721>] unmap_vmas+0x4b1/0x8b0 >> > [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 >> > [<c019d504>] exit_mmap+0x84/0x120 >> > [<c0138538>] mmput+0x48/0xa0 >> > [<c013c3d7>] exit_mm+0xe7/0x110 >> > [<c013d7a4>] do_exit+0x184/0x890 >> > [<c013b55b>] ? printk+0x1b/0x20 >> > [<c013a50a>] ? print_oops_end_marker+0x2a/0x30 >> > [<c01060f1>] oops_end+0xb1/0xc0 >> > [<c01067c0>] die+0x50/0x70 >> > [<c0122b4f>] do_page_fault+0x1ef/0xa20 >> > [<c010b335>] ? native_sched_clock+0xb5/0x110 >> > [<c01600ea>] ? __lock_acquire+0x27a/0xa00 >> > [<c0122960>] ? do_page_fault+0x0/0xa20 >> > [<c0687d3a>] error_code+0x72/0x78 >> > [<c038ad65>] ? __list_add+0x15/0x90 >> > [<c0687133>] ? _spin_lock+0x63/0x70 >> > [<c018b954>] rmqueue_bulk+0x54/0x80 >> > [<c018d317>] get_page_from_freelist+0x5a7/0x720 >> > [<c01600ea>] ? __lock_acquire+0x27a/0xa00 >> > [<c018dd50>] __alloc_pages_internal+0xa0/0x450 >> > [<c01acd4b>] alloc_pages_current+0x7b/0xc0 >> > [<c01b37fb>] new_slab+0x1bb/0x2d0 >> > [<c0687877>] ? _spin_unlock+0x27/0x50 >> > [<c01b40ca>] __slab_alloc+0x32a/0x4e0 >> > [<c010b335>] ? native_sched_clock+0xb5/0x110 >> > [<c01b4424>] kmem_cache_alloc+0xb4/0xe0 >> > [<c018969e>] ? mempool_alloc_slab+0xe/0x10 >> > [<c018969e>] ? mempool_alloc_slab+0xe/0x10 >> > [<c018969e>] mempool_alloc_slab+0xe/0x10 >> > [<c01897a1>] mempool_alloc+0x31/0xf0 >> > [<c015e884>] ? trace_hardirqs_on_caller+0xd4/0x160 >> > [<c015e91b>] ? trace_hardirqs_on+0xb/0x10 >> > [<c0368c7e>] get_request+0xae/0x2c0 >> > [<c036935c>] get_request_wait+0x1c/0xd0 >> > [<c0687462>] ? _spin_lock_irq+0x72/0x80 >> > [<c0369442>] blk_get_request+0x32/0x70 >> > [<c0471c1c>] generic_ide_resume+0x5c/0xf0 > > IDE again? > > Vegard, this is piix, isn't it? If this makes it so, then yes: calling piix_ide_init+0x0/0xb0 initcall piix_ide_init+0x0/0xb0 returned 0 after 0 msecs calling ide_scan_pcibus+0x0/0xf0 piix 0000:00:1f.1: IDE controller (0x8086:0x27df rev 0x01) piix 0000:00:1f.1: IDE port disabled piix 0000:00:1f.1: not 100% native mode: will probe irqs later ide0: BM-DMA at 0xffa0-0xffa7 Probing IDE interface ide0... hda: WDC WD1600BB-00DAA3, ATA DISK drive hda: host max PIO4 wanted PIO255(auto-tune) selected PIO4 hda: UDMA/100 mode selected ide0 at 0x1f0-0x1f7,0x3f6 on irq 14 It is also interesting that you mention that; this is from an earlier run (before serial console was working properly): > In my last run, I managed to get a lot of ascii art on the screen, but > also one line which gave me the EIP of the oops: > $ addr2line -e vmlinux -i c03724f3 > block/cfq-iosched.c:1190 > /* > * Must always be called with the rcu_read_lock() held > */ > static void > __call_for_each_cic(struct io_context *ioc, > void (*func)(struct io_context *, struct cfq_io_context *)) > { > struct cfq_io_context *cic; > struct hlist_node *n; > > hlist_for_each_entry_rcu(cic, n, &ioc->cic_list, cic_list) <-- here > func(ioc, cic); > } Vegard -- "The animistic metaphor of the bug that maliciously sneaked in while the programmer was not looking is intellectually dishonest as it disguises that the error is the programmer's own creation." -- E. W. Dijkstra, EWD1036 ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: latest -git: suspend: unable to handle kernel paging request (was Re: no_console_suspend doesn't work?) 2008-08-22 4:45 ` Vegard Nossum @ 2008-08-22 9:28 ` Pavel Machek 2008-08-22 9:34 ` Rafael J. Wysocki 1 sibling, 0 replies; 21+ messages in thread From: Pavel Machek @ 2008-08-22 9:28 UTC (permalink / raw) To: Vegard Nossum Cc: Rafael J. Wysocki, Pekka Enberg, Linux Kernel Mailing List, Andrew Morton, Jens Axboe, Bartlomiej Zolnierkiewicz Hi! > >> Ok, here we have the block layer passing a bad pointer to SLUB this > >> time. And it's also from the suspend code (although it's the resume > >> path this time). As we never see an oops from the block layer first, > >> it's possible that someone else corrupted everything and it just shows > >> up in the block layer. Maybe something worth investigating, > >> though. Do you use any unusual drivers? WHat kind of machine is that? Can you try with minimal drivers? Can you try with minimum config? Like smp disabled...? Does echo test > /sys/power/state work? Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: latest -git: suspend: unable to handle kernel paging request (was Re: no_console_suspend doesn't work?) 2008-08-22 4:45 ` Vegard Nossum 2008-08-22 9:28 ` Pavel Machek @ 2008-08-22 9:34 ` Rafael J. Wysocki 2008-08-22 10:01 ` Bartlomiej Zolnierkiewicz 1 sibling, 1 reply; 21+ messages in thread From: Rafael J. Wysocki @ 2008-08-22 9:34 UTC (permalink / raw) To: Vegard Nossum Cc: Pekka Enberg, Linux Kernel Mailing List, Andrew Morton, Jens Axboe, Bartlomiej Zolnierkiewicz On Friday, 22 of August 2008, Vegard Nossum wrote: > On Fri, Aug 22, 2008 at 12:16 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > > On Thursday, 21 of August 2008, Pekka Enberg wrote: > >> > ============================================================================= > >> > BUG blkdev_ioc: Invalid object pointer 0xf5cdaca8 > >> > ----------------------------------------------------------------------------- > >> > >> Ok, here we have the block layer passing a bad pointer to SLUB this > >> time. And it's also from the suspend code (although it's the resume > >> path this time). As we never see an oops from the block layer first, > >> it's possible that someone else corrupted everything and it just shows > >> up in the block layer. Maybe something worth investigating, though. > >> > >> > INFO: Slab 0xf789e318 objects=14 used=14 fp=0x00000000 flags=0x2082083 > >> > Pid: 3597, comm: bash Tainted: G D 2.6.27-rc4-00003-ga798564-dirty #30 > >> > [<c01b2576>] slab_err+0x46/0x50 > >> > [<c01b2766>] ? check_slab+0xd6/0xf0 > >> > [<c0181aef>] ? call_rcu+0x6f/0x80 > >> > [<c015e91b>] ? trace_hardirqs_on+0xb/0x10 > >> > [<c01b3c78>] __slab_free+0x238/0x360 > >> > [<c01b4749>] kmem_cache_free+0xa9/0x120 > >> > [<c036b773>] ? put_io_context+0x53/0x70 > >> > [<c036b773>] ? put_io_context+0x53/0x70 > >> > [<c036b773>] put_io_context+0x53/0x70 > >> > [<c036b82e>] exit_io_context+0x6e/0x80 > >> > [<c013de6e>] do_exit+0x84e/0x890 > >> > [<c037b794>] ? trace_hardirqs_on_thunk+0xc/0x10 > >> > [<c013b55b>] ? printk+0x1b/0x20 > >> > [<c013a50a>] ? print_oops_end_marker+0x2a/0x30 > >> > [<c01060f1>] oops_end+0xb1/0xc0 > >> > [<c01067c0>] die+0x50/0x70 > >> > [<c0106871>] do_trap+0x91/0xc0 > >> > [<c0106940>] ? do_invalid_op+0x0/0xa0 > >> > [<c01069c8>] do_invalid_op+0x88/0xa0 > >> > [<c01a0f39>] ? page_remove_rmap+0x109/0x120 > >> > [<c013b2d1>] ? vprintk+0x151/0x3c0 > >> > [<c013b45b>] ? vprintk+0x2db/0x3c0 > >> > [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 > >> > [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 > >> > [<c0687d3a>] error_code+0x72/0x78 > >> > [<c013007b>] ? sched_rt_period_timer+0x21b/0x270 > >> > [<c01a0f39>] ? page_remove_rmap+0x109/0x120 > >> > [<c0198721>] unmap_vmas+0x4b1/0x8b0 > >> > [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 > >> > [<c019d504>] exit_mmap+0x84/0x120 > >> > [<c0138538>] mmput+0x48/0xa0 > >> > [<c013c3d7>] exit_mm+0xe7/0x110 > >> > [<c013d7a4>] do_exit+0x184/0x890 > >> > [<c013b55b>] ? printk+0x1b/0x20 > >> > [<c013a50a>] ? print_oops_end_marker+0x2a/0x30 > >> > [<c01060f1>] oops_end+0xb1/0xc0 > >> > [<c01067c0>] die+0x50/0x70 > >> > [<c0122b4f>] do_page_fault+0x1ef/0xa20 > >> > [<c010b335>] ? native_sched_clock+0xb5/0x110 > >> > [<c01600ea>] ? __lock_acquire+0x27a/0xa00 > >> > [<c0122960>] ? do_page_fault+0x0/0xa20 > >> > [<c0687d3a>] error_code+0x72/0x78 > >> > [<c038ad65>] ? __list_add+0x15/0x90 > >> > [<c0687133>] ? _spin_lock+0x63/0x70 > >> > [<c018b954>] rmqueue_bulk+0x54/0x80 > >> > [<c018d317>] get_page_from_freelist+0x5a7/0x720 > >> > [<c01600ea>] ? __lock_acquire+0x27a/0xa00 > >> > [<c018dd50>] __alloc_pages_internal+0xa0/0x450 > >> > [<c01acd4b>] alloc_pages_current+0x7b/0xc0 > >> > [<c01b37fb>] new_slab+0x1bb/0x2d0 > >> > [<c0687877>] ? _spin_unlock+0x27/0x50 > >> > [<c01b40ca>] __slab_alloc+0x32a/0x4e0 > >> > [<c010b335>] ? native_sched_clock+0xb5/0x110 > >> > [<c01b4424>] kmem_cache_alloc+0xb4/0xe0 > >> > [<c018969e>] ? mempool_alloc_slab+0xe/0x10 > >> > [<c018969e>] ? mempool_alloc_slab+0xe/0x10 > >> > [<c018969e>] mempool_alloc_slab+0xe/0x10 > >> > [<c01897a1>] mempool_alloc+0x31/0xf0 > >> > [<c015e884>] ? trace_hardirqs_on_caller+0xd4/0x160 > >> > [<c015e91b>] ? trace_hardirqs_on+0xb/0x10 > >> > [<c0368c7e>] get_request+0xae/0x2c0 > >> > [<c036935c>] get_request_wait+0x1c/0xd0 > >> > [<c0687462>] ? _spin_lock_irq+0x72/0x80 > >> > [<c0369442>] blk_get_request+0x32/0x70 > >> > [<c0471c1c>] generic_ide_resume+0x5c/0xf0 > > > > IDE again? > > > > Vegard, this is piix, isn't it? > > If this makes it so, then yes: > > calling piix_ide_init+0x0/0xb0 > initcall piix_ide_init+0x0/0xb0 returned 0 after 0 msecs > calling ide_scan_pcibus+0x0/0xf0 > piix 0000:00:1f.1: IDE controller (0x8086:0x27df rev 0x01) > piix 0000:00:1f.1: IDE port disabled > piix 0000:00:1f.1: not 100% native mode: will probe irqs later > ide0: BM-DMA at 0xffa0-0xffa7 > Probing IDE interface ide0... > hda: WDC WD1600BB-00DAA3, ATA DISK drive > hda: host max PIO4 wanted PIO255(auto-tune) selected PIO4 > hda: UDMA/100 mode selected > ide0 at 0x1f0-0x1f7,0x3f6 on irq 14 > > It is also interesting that you mention that; this is from an earlier > run (before serial console was working properly): > > > In my last run, I managed to get a lot of ascii art on the screen, but > > also one line which gave me the EIP of the oops: > > > $ addr2line -e vmlinux -i c03724f3 > > block/cfq-iosched.c:1190 > > > /* > > * Must always be called with the rcu_read_lock() held > > */ > > static void > > __call_for_each_cic(struct io_context *ioc, > > void (*func)(struct io_context *, struct cfq_io_context *)) > > { > > struct cfq_io_context *cic; > > struct hlist_node *n; > > > > hlist_for_each_entry_rcu(cic, n, &ioc->cic_list, cic_list) <-- here > > func(ioc, cic); > > } Hmm. Would that be possible to switch temporarily to PATA/libata and see if the problem goes away? Then, we'd get a strong indication that it really is related to IDE. Thanks, Rafael ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: latest -git: suspend: unable to handle kernel paging request (was Re: no_console_suspend doesn't work?) 2008-08-22 9:34 ` Rafael J. Wysocki @ 2008-08-22 10:01 ` Bartlomiej Zolnierkiewicz 2008-08-22 10:15 ` Rafael J. Wysocki 0 siblings, 1 reply; 21+ messages in thread From: Bartlomiej Zolnierkiewicz @ 2008-08-22 10:01 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Vegard Nossum, Pekka Enberg, Linux Kernel Mailing List, Andrew Morton, Jens Axboe Hi, On Friday 22 August 2008, Rafael J. Wysocki wrote: > On Friday, 22 of August 2008, Vegard Nossum wrote: > > On Fri, Aug 22, 2008 at 12:16 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > > > On Thursday, 21 of August 2008, Pekka Enberg wrote: > > >> > ============================================================================= > > >> > BUG blkdev_ioc: Invalid object pointer 0xf5cdaca8 > > >> > ----------------------------------------------------------------------------- > > >> > > >> Ok, here we have the block layer passing a bad pointer to SLUB this > > >> time. And it's also from the suspend code (although it's the resume > > >> path this time). As we never see an oops from the block layer first, > > >> it's possible that someone else corrupted everything and it just shows > > >> up in the block layer. Maybe something worth investigating, though. > > >> > > >> > INFO: Slab 0xf789e318 objects=14 used=14 fp=0x00000000 flags=0x2082083 > > >> > Pid: 3597, comm: bash Tainted: G D 2.6.27-rc4-00003-ga798564-dirty #30 > > >> > [<c01b2576>] slab_err+0x46/0x50 > > >> > [<c01b2766>] ? check_slab+0xd6/0xf0 > > >> > [<c0181aef>] ? call_rcu+0x6f/0x80 > > >> > [<c015e91b>] ? trace_hardirqs_on+0xb/0x10 > > >> > [<c01b3c78>] __slab_free+0x238/0x360 > > >> > [<c01b4749>] kmem_cache_free+0xa9/0x120 > > >> > [<c036b773>] ? put_io_context+0x53/0x70 > > >> > [<c036b773>] ? put_io_context+0x53/0x70 > > >> > [<c036b773>] put_io_context+0x53/0x70 > > >> > [<c036b82e>] exit_io_context+0x6e/0x80 > > >> > [<c013de6e>] do_exit+0x84e/0x890 > > >> > [<c037b794>] ? trace_hardirqs_on_thunk+0xc/0x10 > > >> > [<c013b55b>] ? printk+0x1b/0x20 > > >> > [<c013a50a>] ? print_oops_end_marker+0x2a/0x30 > > >> > [<c01060f1>] oops_end+0xb1/0xc0 > > >> > [<c01067c0>] die+0x50/0x70 > > >> > [<c0106871>] do_trap+0x91/0xc0 > > >> > [<c0106940>] ? do_invalid_op+0x0/0xa0 > > >> > [<c01069c8>] do_invalid_op+0x88/0xa0 > > >> > [<c01a0f39>] ? page_remove_rmap+0x109/0x120 > > >> > [<c013b2d1>] ? vprintk+0x151/0x3c0 > > >> > [<c013b45b>] ? vprintk+0x2db/0x3c0 > > >> > [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 > > >> > [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 > > >> > [<c0687d3a>] error_code+0x72/0x78 > > >> > [<c013007b>] ? sched_rt_period_timer+0x21b/0x270 > > >> > [<c01a0f39>] ? page_remove_rmap+0x109/0x120 > > >> > [<c0198721>] unmap_vmas+0x4b1/0x8b0 > > >> > [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 > > >> > [<c019d504>] exit_mmap+0x84/0x120 > > >> > [<c0138538>] mmput+0x48/0xa0 > > >> > [<c013c3d7>] exit_mm+0xe7/0x110 > > >> > [<c013d7a4>] do_exit+0x184/0x890 > > >> > [<c013b55b>] ? printk+0x1b/0x20 > > >> > [<c013a50a>] ? print_oops_end_marker+0x2a/0x30 > > >> > [<c01060f1>] oops_end+0xb1/0xc0 > > >> > [<c01067c0>] die+0x50/0x70 > > >> > [<c0122b4f>] do_page_fault+0x1ef/0xa20 > > >> > [<c010b335>] ? native_sched_clock+0xb5/0x110 > > >> > [<c01600ea>] ? __lock_acquire+0x27a/0xa00 > > >> > [<c0122960>] ? do_page_fault+0x0/0xa20 > > >> > [<c0687d3a>] error_code+0x72/0x78 > > >> > [<c038ad65>] ? __list_add+0x15/0x90 > > >> > [<c0687133>] ? _spin_lock+0x63/0x70 > > >> > [<c018b954>] rmqueue_bulk+0x54/0x80 > > >> > [<c018d317>] get_page_from_freelist+0x5a7/0x720 > > >> > [<c01600ea>] ? __lock_acquire+0x27a/0xa00 > > >> > [<c018dd50>] __alloc_pages_internal+0xa0/0x450 > > >> > [<c01acd4b>] alloc_pages_current+0x7b/0xc0 > > >> > [<c01b37fb>] new_slab+0x1bb/0x2d0 > > >> > [<c0687877>] ? _spin_unlock+0x27/0x50 > > >> > [<c01b40ca>] __slab_alloc+0x32a/0x4e0 > > >> > [<c010b335>] ? native_sched_clock+0xb5/0x110 > > >> > [<c01b4424>] kmem_cache_alloc+0xb4/0xe0 > > >> > [<c018969e>] ? mempool_alloc_slab+0xe/0x10 > > >> > [<c018969e>] ? mempool_alloc_slab+0xe/0x10 > > >> > [<c018969e>] mempool_alloc_slab+0xe/0x10 > > >> > [<c01897a1>] mempool_alloc+0x31/0xf0 > > >> > [<c015e884>] ? trace_hardirqs_on_caller+0xd4/0x160 > > >> > [<c015e91b>] ? trace_hardirqs_on+0xb/0x10 > > >> > [<c0368c7e>] get_request+0xae/0x2c0 > > >> > [<c036935c>] get_request_wait+0x1c/0xd0 > > >> > [<c0687462>] ? _spin_lock_irq+0x72/0x80 > > >> > [<c0369442>] blk_get_request+0x32/0x70 > > >> > [<c0471c1c>] generic_ide_resume+0x5c/0xf0 > > > > > > IDE again? "again"? generic_ide_resume() just tries to get new request from the block layer which seems to end up being problematic. > > > > > > Vegard, this is piix, isn't it? > > > > If this makes it so, then yes: > > > > calling piix_ide_init+0x0/0xb0 > > initcall piix_ide_init+0x0/0xb0 returned 0 after 0 msecs > > calling ide_scan_pcibus+0x0/0xf0 > > piix 0000:00:1f.1: IDE controller (0x8086:0x27df rev 0x01) > > piix 0000:00:1f.1: IDE port disabled > > piix 0000:00:1f.1: not 100% native mode: will probe irqs later > > ide0: BM-DMA at 0xffa0-0xffa7 > > Probing IDE interface ide0... > > hda: WDC WD1600BB-00DAA3, ATA DISK drive > > hda: host max PIO4 wanted PIO255(auto-tune) selected PIO4 > > hda: UDMA/100 mode selected > > ide0 at 0x1f0-0x1f7,0x3f6 on irq 14 > > > > It is also interesting that you mention that; this is from an earlier > > run (before serial console was working properly): > > > > > In my last run, I managed to get a lot of ascii art on the screen, but > > > also one line which gave me the EIP of the oops: > > > > > $ addr2line -e vmlinux -i c03724f3 > > > block/cfq-iosched.c:1190 > > > > > /* > > > * Must always be called with the rcu_read_lock() held > > > */ > > > static void > > > __call_for_each_cic(struct io_context *ioc, > > > void (*func)(struct io_context *, struct cfq_io_context *)) > > > { > > > struct cfq_io_context *cic; > > > struct hlist_node *n; > > > > > > hlist_for_each_entry_rcu(cic, n, &ioc->cic_list, cic_list) <-- here > > > func(ioc, cic); > > > } > > Hmm. > > Would that be possible to switch temporarily to PATA/libata and see if the > problem goes away? Then, we'd get a strong indication that it really is > related to IDE. IIRC besides conversion to blk_{get,put}_request() conversion from 15th July (commit 5b114715ed63f3a4fdf790f5df61364fc4adadf1) there weren't any PM related IDE changes recently. Vegard, it would be worth to try if the kernels at the above commit and at commit 9a2d43b7566caeeeb414aa628bc2759028897dbb (one commit before the above one) are OK. If it doesn't give the definitive answer then doing git-bisect run would be of great help in identifying and fixing the source of the problem quickly. Thanks, Bart ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: latest -git: suspend: unable to handle kernel paging request (was Re: no_console_suspend doesn't work?) 2008-08-22 10:01 ` Bartlomiej Zolnierkiewicz @ 2008-08-22 10:15 ` Rafael J. Wysocki 0 siblings, 0 replies; 21+ messages in thread From: Rafael J. Wysocki @ 2008-08-22 10:15 UTC (permalink / raw) To: Bartlomiej Zolnierkiewicz Cc: Vegard Nossum, Pekka Enberg, Linux Kernel Mailing List, Andrew Morton, Jens Axboe On Friday, 22 of August 2008, Bartlomiej Zolnierkiewicz wrote: > > Hi, > > On Friday 22 August 2008, Rafael J. Wysocki wrote: > > On Friday, 22 of August 2008, Vegard Nossum wrote: > > > On Fri, Aug 22, 2008 at 12:16 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote: > > > > On Thursday, 21 of August 2008, Pekka Enberg wrote: > > > >> > ============================================================================= > > > >> > BUG blkdev_ioc: Invalid object pointer 0xf5cdaca8 > > > >> > ----------------------------------------------------------------------------- > > > >> > > > >> Ok, here we have the block layer passing a bad pointer to SLUB this > > > >> time. And it's also from the suspend code (although it's the resume > > > >> path this time). As we never see an oops from the block layer first, > > > >> it's possible that someone else corrupted everything and it just shows > > > >> up in the block layer. Maybe something worth investigating, though. > > > >> > > > >> > INFO: Slab 0xf789e318 objects=14 used=14 fp=0x00000000 flags=0x2082083 > > > >> > Pid: 3597, comm: bash Tainted: G D 2.6.27-rc4-00003-ga798564-dirty #30 > > > >> > [<c01b2576>] slab_err+0x46/0x50 > > > >> > [<c01b2766>] ? check_slab+0xd6/0xf0 > > > >> > [<c0181aef>] ? call_rcu+0x6f/0x80 > > > >> > [<c015e91b>] ? trace_hardirqs_on+0xb/0x10 > > > >> > [<c01b3c78>] __slab_free+0x238/0x360 > > > >> > [<c01b4749>] kmem_cache_free+0xa9/0x120 > > > >> > [<c036b773>] ? put_io_context+0x53/0x70 > > > >> > [<c036b773>] ? put_io_context+0x53/0x70 > > > >> > [<c036b773>] put_io_context+0x53/0x70 > > > >> > [<c036b82e>] exit_io_context+0x6e/0x80 > > > >> > [<c013de6e>] do_exit+0x84e/0x890 > > > >> > [<c037b794>] ? trace_hardirqs_on_thunk+0xc/0x10 > > > >> > [<c013b55b>] ? printk+0x1b/0x20 > > > >> > [<c013a50a>] ? print_oops_end_marker+0x2a/0x30 > > > >> > [<c01060f1>] oops_end+0xb1/0xc0 > > > >> > [<c01067c0>] die+0x50/0x70 > > > >> > [<c0106871>] do_trap+0x91/0xc0 > > > >> > [<c0106940>] ? do_invalid_op+0x0/0xa0 > > > >> > [<c01069c8>] do_invalid_op+0x88/0xa0 > > > >> > [<c01a0f39>] ? page_remove_rmap+0x109/0x120 > > > >> > [<c013b2d1>] ? vprintk+0x151/0x3c0 > > > >> > [<c013b45b>] ? vprintk+0x2db/0x3c0 > > > >> > [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 > > > >> > [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 > > > >> > [<c0687d3a>] error_code+0x72/0x78 > > > >> > [<c013007b>] ? sched_rt_period_timer+0x21b/0x270 > > > >> > [<c01a0f39>] ? page_remove_rmap+0x109/0x120 > > > >> > [<c0198721>] unmap_vmas+0x4b1/0x8b0 > > > >> > [<c015c5ea>] ? print_lock_contention_bug+0x1a/0xe0 > > > >> > [<c019d504>] exit_mmap+0x84/0x120 > > > >> > [<c0138538>] mmput+0x48/0xa0 > > > >> > [<c013c3d7>] exit_mm+0xe7/0x110 > > > >> > [<c013d7a4>] do_exit+0x184/0x890 > > > >> > [<c013b55b>] ? printk+0x1b/0x20 > > > >> > [<c013a50a>] ? print_oops_end_marker+0x2a/0x30 > > > >> > [<c01060f1>] oops_end+0xb1/0xc0 > > > >> > [<c01067c0>] die+0x50/0x70 > > > >> > [<c0122b4f>] do_page_fault+0x1ef/0xa20 > > > >> > [<c010b335>] ? native_sched_clock+0xb5/0x110 > > > >> > [<c01600ea>] ? __lock_acquire+0x27a/0xa00 > > > >> > [<c0122960>] ? do_page_fault+0x0/0xa20 > > > >> > [<c0687d3a>] error_code+0x72/0x78 > > > >> > [<c038ad65>] ? __list_add+0x15/0x90 > > > >> > [<c0687133>] ? _spin_lock+0x63/0x70 > > > >> > [<c018b954>] rmqueue_bulk+0x54/0x80 > > > >> > [<c018d317>] get_page_from_freelist+0x5a7/0x720 > > > >> > [<c01600ea>] ? __lock_acquire+0x27a/0xa00 > > > >> > [<c018dd50>] __alloc_pages_internal+0xa0/0x450 > > > >> > [<c01acd4b>] alloc_pages_current+0x7b/0xc0 > > > >> > [<c01b37fb>] new_slab+0x1bb/0x2d0 > > > >> > [<c0687877>] ? _spin_unlock+0x27/0x50 > > > >> > [<c01b40ca>] __slab_alloc+0x32a/0x4e0 > > > >> > [<c010b335>] ? native_sched_clock+0xb5/0x110 > > > >> > [<c01b4424>] kmem_cache_alloc+0xb4/0xe0 > > > >> > [<c018969e>] ? mempool_alloc_slab+0xe/0x10 > > > >> > [<c018969e>] ? mempool_alloc_slab+0xe/0x10 > > > >> > [<c018969e>] mempool_alloc_slab+0xe/0x10 > > > >> > [<c01897a1>] mempool_alloc+0x31/0xf0 > > > >> > [<c015e884>] ? trace_hardirqs_on_caller+0xd4/0x160 > > > >> > [<c015e91b>] ? trace_hardirqs_on+0xb/0x10 > > > >> > [<c0368c7e>] get_request+0xae/0x2c0 > > > >> > [<c036935c>] get_request_wait+0x1c/0xd0 > > > >> > [<c0687462>] ? _spin_lock_irq+0x72/0x80 > > > >> > [<c0369442>] blk_get_request+0x32/0x70 > > > >> > [<c0471c1c>] generic_ide_resume+0x5c/0xf0 > > > > > > > > IDE again? > > "again"? There was another trace in which IDE was also involved. Sorry for the confusion. Rafael ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2008-08-22 10:11 UTC | newest] Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2008-08-21 17:28 latest -git: suspend: unable to handle kernel paging request (was Re: no_console_suspend doesn't work?) Vegard Nossum 2008-08-21 18:10 ` Rafael J. Wysocki 2008-08-21 18:21 ` Vegard Nossum 2008-08-21 18:45 ` Vegard Nossum 2008-08-21 19:13 ` Rafael J. Wysocki 2008-08-21 19:16 ` Vegard Nossum 2008-08-21 19:37 ` Rafael J. Wysocki 2008-08-21 20:08 ` Vegard Nossum 2008-08-21 20:22 ` Rafael J. Wysocki 2008-08-21 20:26 ` Andrew Morton 2008-08-21 21:21 ` Pekka Enberg 2008-08-21 22:10 ` Rafael J. Wysocki 2008-08-22 7:02 ` Pekka Enberg 2008-08-22 9:38 ` Pavel Machek 2008-08-21 21:27 ` Pekka Enberg 2008-08-21 22:16 ` Rafael J. Wysocki 2008-08-22 4:45 ` Vegard Nossum 2008-08-22 9:28 ` Pavel Machek 2008-08-22 9:34 ` Rafael J. Wysocki 2008-08-22 10:01 ` Bartlomiej Zolnierkiewicz 2008-08-22 10:15 ` Rafael J. Wysocki
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).