kernel BUG at mm/swapfile.c:2527!

From: Shaun Reitan @ 2011-09-15 18:56 UTC
  To: xen-devel

We've been hitting the following bugs.  This is happening with kernel
versions 2.6.39 and 3.0.1.

So far we've only seen this problem on Ubuntu servers, and it always
seems to be the Apache process that triggers it.  This time we were also
running a PCI compliance scan on the server, and we think that may have
triggered it.

2.6.39 Dump
------------[ cut here ]------------
kernel BUG at mm/swapfile.c:2527!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/vbd-51712/block/xvda/uevent
Modules linked in:

Pid: 30706, comm: apache2 Not tainted 2.6.39-2 #3
EIP: 0061:[<c01ab016>] EFLAGS: 00210246 CPU: 0
EIP is at swap_count_continued+0x176/0x190
EAX: 00000000 EBX: ebba0800 ECX: 80000001 EDX: f57ba95f
ESI: 00000080 EDI: ebbd7d40 EBP: 0000095f ESP: df4dbe38
  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
Process apache2 (pid: 30706, ti=df4da000 task=e9259bd0 task.ti=df4da000)
Stack:
  ea298d40 0000495f ee11a000 00000000 c01ab157 0000495f 00092be0 ea298d40
  b8f33000 c01ac277 00000000 00092be0 e91ed998 c019dba7 6afaa065 80000001
  00000000 00000000 c01065b3 c01036cd b9531fff 00000000 e8fdb348 df4dbf0c
Call Trace:
  [<c01ab157>] ? swap_entry_free+0x127/0x150
  [<c01ac277>] ? free_swap_and_cache+0x27/0xd0
  [<c019dba7>] ? unmap_vmas+0x587/0x7f0
  [<c01065b3>] ? xen_restore_fl_direct_reloc+0x4/0x4
  [<c01036cd>] ? xen_mc_flush+0xdd/0x190
  [<c01a1e0a>] ? exit_mmap+0x8a/0x140
  [<c0132aa1>] ? mmput+0x41/0xd0
  [<c0136afd>] ? exit_mm+0xed/0x110
  [<c0652710>] ? _raw_spin_lock_irq+0x10/0x20
  [<c01380d7>] ? do_exit+0x197/0x760
  [<c04417a7>] ? __xen_evtchn_do_upcall+0x1e7/0x240
  [<c0105d97>] ? xen_force_evtchn_callback+0x17/0x30
  [<c01386cf>] ? do_group_exit+0x2f/0x90
  [<c013873d>] ? sys_exit_group+0xd/0x10
  [<c0652a41>] ? syscall_call+0x7/0xb
  [<c0650000>] ? cpuup_callback+0x100/0x260
Code: d7 fe ff ff 89 d8 e8 7a 9f f7 ff 8d 54 05 00 c6 02 00 eb b0 0f 0b eb fe 0f 0b eb fe 89 f2 31 c0 80 fa 80 0f 94 c0 e9 b2 fe ff ff <0f> 0b eb fe 0f 0b eb fe 0f 0b eb fe 8d b4 26 00 00 00 00 8d bc
EIP: [<c01ab016>] swap_count_continued+0x176/0x190 SS:ESP 0069:df4dbe38
---[ end trace 9fa17c616c267728 ]---
Fixing recursive fault but reboot is needed!
BUG: scheduling while atomic: apache2/30706/0x00000001
Modules linked in:
Pid: 30706, comm: apache2 Tainted: G      D     2.6.39-2 #3
Call Trace:
  [<c065104f>] ? schedule+0x76f/0x840
  [<c01358ff>] ? vprintk+0x19f/0x3a0
  [<c01065bc>] ? check_events+0x8/0xc
  [<c0652731>] ? _raw_spin_unlock_irqrestore+0x11/0x20
  [<c01358ff>] ? vprintk+0x19f/0x3a0
  [<c01385ea>] ? do_exit+0x6aa/0x760
  [<c06526e7>] ? _raw_spin_lock_irqsave+0x27/0x40
  [<c0652731>] ? _raw_spin_unlock_irqrestore+0x11/0x20
  [<c0135016>] ? kmsg_dump+0x36/0xd0
  [<c0109b90>] ? do_bounds+0x80/0x80
  [<c0135b1b>] ? printk+0x1b/0x20
  [<c0109b90>] ? do_bounds+0x80/0x80
  [<c010b98f>] ? oops_end+0x9f/0xa0
  [<c0109c0f>] ? do_invalid_op+0x7f/0x90
  [<c01ab016>] ? swap_count_continued+0x176/0x190
  [<c018a939>] ? free_pcppages_bulk+0x2c9/0x2f0
  [<c0105d97>] ? xen_force_evtchn_callback+0x17/0x30
  [<c01065bc>] ? check_events+0x8/0xc
  [<c01065b3>] ? xen_restore_fl_direct_reloc+0x4/0x4
  [<c018b4f6>] ? free_hot_cold_page+0xd6/0x160
  [<c0103ff5>] ? pte_pfn_to_mfn+0xb5/0xd0
  [<c0104071>] ? xen_make_pte+0x41/0x110
  [<c0652fb6>] ? error_code+0x5a/0x60
  [<c0109b90>] ? do_bounds+0x80/0x80
  [<c01ab016>] ? swap_count_continued+0x176/0x190
  [<c01ab157>] ? swap_entry_free+0x127/0x150
  [<c01ac277>] ? free_swap_and_cache+0x27/0xd0
  [<c019dba7>] ? unmap_vmas+0x587/0x7f0
  [<c01065b3>] ? xen_restore_fl_direct_reloc+0x4/0x4
  [<c01036cd>] ? xen_mc_flush+0xdd/0x190
  [<c01a1e0a>] ? exit_mmap+0x8a/0x140
  [<c0132aa1>] ? mmput+0x41/0xd0
  [<c0136afd>] ? exit_mm+0xed/0x110
  [<c0652710>] ? _raw_spin_lock_irq+0x10/0x20
  [<c01380d7>] ? do_exit+0x197/0x760
  [<c04417a7>] ? __xen_evtchn_do_upcall+0x1e7/0x240
  [<c0105d97>] ? xen_force_evtchn_callback+0x17/0x30
  [<c01386cf>] ? do_group_exit+0x2f/0x90
  [<c013873d>] ? sys_exit_group+0xd/0x10
  [<c0652a41>] ? syscall_call+0x7/0xb
  [<c0650000>] ? cpuup_callback+0x100/0x260
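
For anyone decoding the 2.6.39 dump above: the byte in angle brackets on the Code: line marks the instruction at EIP. Here is a minimal Python sketch that pulls it out. The helper name `faulting_bytes` is made up for illustration and is not part of any kernel tooling. On x86, `0f 0b` is `ud2`, the instruction BUG() emits, which fits the "invalid opcode" line.

```python
import re

# Code: line from the 2.6.39 dump, rejoined into one string.
OOPS_CODE = (
    "Code: d7 fe ff ff 89 d8 e8 7a 9f f7 ff 8d 54 05 00 c6 02 00 eb b0 0f 0b "
    "eb fe 0f 0b eb fe 89 f2 31 c0 80 fa 80 0f 94 c0 e9 b2 fe ff ff <0f> 0b "
    "eb fe 0f 0b eb fe 0f 0b eb fe 8d b4 26 00 00 00 00 8d bc"
)


def faulting_bytes(code_line, count=2):
    """Return `count` bytes starting at the <..> marker (hypothetical helper)."""
    tokens = code_line.replace("Code:", "").split()
    for i, tok in enumerate(tokens):
        m = re.fullmatch(r"<([0-9a-f]{2})>", tok)
        if m:
            # The marked byte itself, then the bytes that follow it.
            return [m.group(1)] + tokens[i + 1 : i + count]
    return []


if __name__ == "__main__":
    print(" ".join(faulting_bytes(OOPS_CODE)))
```

For a full disassembly, the kernel tree's scripts/decodecode is the usual tool; the sketch only shows where to look.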

Here's the 3.0.1 dump; unfortunately I didn't catch a full dump.

  BUG: unable to handle kernel paging request at f57ba13c
IP: [<c01ae845>] swap_count_continued+0x85/0x190
*pdpt = 0000000000959027 *pde = 00000000008f5067 *pte = 0000000000000000
Oops: 0000 [#1] SMP
Modules linked in:

Pid: 3666, comm: apache2 Not tainted 3.0.1-1 #1
EIP: 0061:[<c01ae845>] EFLAGS: 00010246 CPU: 0
EIP is at swap_count_continued+0x85/0x190
EAX: 00000080 EBX: ed302400 ECX: ecb870a0 EDX: f57ba13c
ESI: 00000080 EDI: ed3d7760 EBP: 0000013c ESP: ea479dec
  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
Process apache2 (pid: 3666, ti=ea478000 task=ebe91bd0 task.ti=ea478000)
Stack:
  ec6915c0 0001913c ee129000 00000040 c01aea77 0001913c 00322780 ec6915c0
  b9275000 c01b0927 00000000 00322780 ea6533a8 c01a2d41 6e484067 80000001
  c01059ef 80000000 00000000 ebad13c0 eae13e48 ec6ebb1c ea479ee8 00000000
Call Trace:
  [<c01aea77>] ? swap_entry_free+0x127/0x150
  [<c01b0927>] ? free_swap_and_cache+0x27/0xd0
  [<c01a2d41>] ? zap_pte_range+0x321/0x420
  [<c01059ef>] ? xen_make_pte+0x3f/0xc0
  [<c01a2f98>] ? unmap_page_range+0x158/0x1a0
  [<c01a3058>] ? unmap_vmas+0x78/0xb0
  [<c01a524e>] ? exit_mmap+0x6e/0xf0
  [<c0136421>] ? mmput+0x41/0xd0
  [<c0139fcd>] ? exit_mm+0xed/0x110
  [<c06c76e0>] ? _raw_spin_lock_irq+0x10/0x20
  [<c013b7e7>] ? do_exit+0x197/0x340
  [<c01a5309>] ? remove_vma_list+0x39/0x50
  [<c013b9bf>] ? do_group_exit+0x2f/0x90
  [<c013ba2d>] ? sys_exit_group+0xd/0x10
  [<c06c7a11>] ? syscall_call+0x7/0xb
Code: 2a 90 8d 74 26 00 e9 15 01 00 00 89 d0 e8 c4 7e f7 ff 8b 5b 18 83 eb 18 39 df 0f 84 e5 00 00 00 89 d8 e8 3f 81 f7 ff 8d 54 05 00 <0f> b6 02 3c 80 74 d9 84 c0 0f 84 e2 00 00 00 83 e8 01 84 c0 88
EIP: [<c01ae845>] swap_count_continued+0x85/0x190 SS:ESP 0069:ea479dec
CR2: 00000000f57ba13c
---[ end trace 36a533bb83dd2812 ]---
Fixing recursive fault but reboot is needed!
BUG: scheduling while atomic: apache2/3666/0x00000001
Modules linked in:
Pid: 3666, comm: apache2 Tainted: G      D     3.0.1-1 #1
Call Trace:
  [<c06c60ed>] ? schedule+0x50d/0x520
  [<c0106a23>] ? xen_restore_fl_direct_reloc+0x4/0x4
  [<c01061d7>] ? xen_force_evtchn_callback+0x17/0x30
  [<c013b92f>] ? do_exit+0x2df/0x340
  [<c0138c3b>] ? printk+0x1b/0x20
  [<c010bf6f>] ? oops_end+0x9f/0xa0
  [<c0120f4f>] ? bad_area_nosemaphore+0xf/0x20
  [<c012149b>] ? do_page_fault+0x1bb/0x420
  [<c0177e85>] ? irq_get_irq_data+0x5/0x10
  [<c047da45>] ? info_for_irq+0x5/0x20
  [<c047e270>] ? evtchn_from_irq+0x10/0x40
  [<c01061d7>] ? xen_force_evtchn_callback+0x17/0x30
  [<c0106a2c>] ? check_events+0x8/0xc
  [<c0106a23>] ? xen_restore_fl_direct_reloc+0x4/0x4
  [<c0104bab>] ? xen_batched_set_pte+0xab/0xf0
  [<c01212e0>] ? vmalloc_fault+0x2c0/0x2c0
  [<c06c7f86>] ? error_code+0x5a/0x60
  [<c01212e0>] ? vmalloc_fault+0x2c0/0x2c0
  [<c01ae845>] ? swap_count_continued+0x85/0x190
  [<c01aea77>] ? swap_entry_free+0x127/0x150
  [<c01b0927>] ? free_swap_and_cache+0x27/0xd0
  [<c01a2d41>] ? zap_pte_range+0x321/0x420
  [<c01059ef>] ? xen_make_pte+0x3f/0xc0
  [<c01a2f98>] ? unmap_page_range+0x158/0x1a0
  [<c01a3058>] ? unmap_vmas+0x78/0xb0
  [<c01a524e>] ? exit_mmap+0x6e/0xf0
  [<c0136421>] ? mmput+0x41/0xd0
  [<c0139fcd>] ? exit_mm+0xed/0x110
  [<c06c76e0>] ? _raw_spin_lock_irq+0x10/0x20
  [<c013b7e7>] ? do_exit+0x197/0x340
  [<c01a5309>] ? remove_vma_list+0x39/0x50
  [<c013b9bf>] ? do_group_exit+0x2f/0x90
  [<c013ba2d>] ? sys_exit_group+0xd/0x10
  [<c06c7a11>] ? syscall_call+0x7/0xb
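
A quick consistency check on the registers in both dumps, as a sketch only. It assumes 4 KiB pages, and it assumes the second stack word (0000495f and 0001913c) is the swap offset handed to swap_count_continued; that is an inference from the stack contents, not something confirmed. Under those assumptions the low 12 bits of the offset match EBP and the low 12 bits of EDX, and in the 3.0.1 dump EDX is the same address as CR2.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages on 32-bit x86

# Values copied from the two dumps; "stack_word" is the second word on the
# stack, assumed (not confirmed) to be the swap offset.
DUMPS = {
    "2.6.39": {"stack_word": 0x0000495F, "ebp": 0x0000095F, "edx": 0xF57BA95F},
    "3.0.1": {"stack_word": 0x0001913C, "ebp": 0x0000013C, "edx": 0xF57BA13C},
}
CR2_3_0_1 = 0xF57BA13C  # faulting address reported by the 3.0.1 dump


def page_offset(value):
    """Offset of `value` within its page."""
    return value & (PAGE_SIZE - 1)


def page_base(value):
    """Address of the page containing `value`."""
    return value & ~(PAGE_SIZE - 1)


def consistent(regs):
    """True if the stack word, EBP and EDX agree on the in-page offset."""
    return page_offset(regs["stack_word"]) == regs["ebp"] == page_offset(regs["edx"])


if __name__ == "__main__":
    for name, regs in DUMPS.items():
        print(name, consistent(regs), hex(page_base(regs["edx"])))
```

Both dumps touch the same page, which would fit a single fixed temporary mapping slot being used in each case, though the dumps alone do not prove that.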

-- 
Shaun Retian
Chief Technical Officer
Network Data Center Host, Inc.
http://www.ndchost.com
