* kernel BUG at mm/swapfile.c:2527!
@ 2011-09-15 18:56 Shaun Reitan
  2011-09-15 19:52 ` Shaun Reitan
  0 siblings, 1 reply; 7+ messages in thread
From: Shaun Reitan @ 2011-09-15 18:56 UTC (permalink / raw)
  To: xen-devel

We've been hitting the following bugs.  This is happening with kernel
versions 2.6.39 and 3.0.1.

So far we've only seen this problem on Ubuntu servers, and it
always seems to be the apache process that triggers it.  Also, this time
we were running a PCI compliance scan on the server.  We think
that may have triggered it.


2.6.39 Dump
------------[ cut here ]------------
kernel BUG at mm/swapfile.c:2527!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/vbd-51712/block/xvda/uevent
Modules linked in:

Pid: 30706, comm: apache2 Not tainted 2.6.39-2 #3
EIP: 0061:[<c01ab016>] EFLAGS: 00210246 CPU: 0
EIP is at swap_count_continued+0x176/0x190
EAX: 00000000 EBX: ebba0800 ECX: 80000001 EDX: f57ba95f
ESI: 00000080 EDI: ebbd7d40 EBP: 0000095f ESP: df4dbe38
  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
Process apache2 (pid: 30706, ti=df4da000 task=e9259bd0 task.ti=df4da000)
Stack:
  ea298d40 0000495f ee11a000 00000000 c01ab157 0000495f 00092be0 ea298d40
  b8f33000 c01ac277 00000000 00092be0 e91ed998 c019dba7 6afaa065 80000001
  00000000 00000000 c01065b3 c01036cd b9531fff 00000000 e8fdb348 df4dbf0c
Call Trace:
  [<c01ab157>] ? swap_entry_free+0x127/0x150
  [<c01ac277>] ? free_swap_and_cache+0x27/0xd0
  [<c019dba7>] ? unmap_vmas+0x587/0x7f0
  [<c01065b3>] ? xen_restore_fl_direct_reloc+0x4/0x4
  [<c01036cd>] ? xen_mc_flush+0xdd/0x190
  [<c01a1e0a>] ? exit_mmap+0x8a/0x140
  [<c0132aa1>] ? mmput+0x41/0xd0
  [<c0136afd>] ? exit_mm+0xed/0x110
  [<c0652710>] ? _raw_spin_lock_irq+0x10/0x20
  [<c01380d7>] ? do_exit+0x197/0x760
  [<c04417a7>] ? __xen_evtchn_do_upcall+0x1e7/0x240
  [<c0105d97>] ? xen_force_evtchn_callback+0x17/0x30
  [<c01386cf>] ? do_group_exit+0x2f/0x90
  [<c013873d>] ? sys_exit_group+0xd/0x10
  [<c0652a41>] ? syscall_call+0x7/0xb
  [<c0650000>] ? cpuup_callback+0x100/0x260
Code: d7 fe ff ff 89 d8 e8 7a 9f f7 ff 8d 54 05 00 c6 02 00 eb b0 0f 0b eb fe 0f 0b eb fe 89 f2 31 c0 80 fa 80 0f 94 c0 e9 b2 fe ff ff <0f> 0b eb fe 0f 0b eb fe 0f 0b eb fe 8d b4 26 00 00 00 00 8d bc
EIP: [<c01ab016>] swap_count_continued+0x176/0x190 SS:ESP 0069:df4dbe38
---[ end trace 9fa17c616c267728 ]---
Fixing recursive fault but reboot is needed!
BUG: scheduling while atomic: apache2/30706/0x00000001
Modules linked in:
Pid: 30706, comm: apache2 Tainted: G      D     2.6.39-2 #3
Call Trace:
  [<c065104f>] ? schedule+0x76f/0x840
  [<c01358ff>] ? vprintk+0x19f/0x3a0
  [<c01065bc>] ? check_events+0x8/0xc
  [<c0652731>] ? _raw_spin_unlock_irqrestore+0x11/0x20
  [<c01358ff>] ? vprintk+0x19f/0x3a0
  [<c01385ea>] ? do_exit+0x6aa/0x760
  [<c06526e7>] ? _raw_spin_lock_irqsave+0x27/0x40
  [<c0652731>] ? _raw_spin_unlock_irqrestore+0x11/0x20
  [<c0135016>] ? kmsg_dump+0x36/0xd0
  [<c0109b90>] ? do_bounds+0x80/0x80
  [<c0135b1b>] ? printk+0x1b/0x20
  [<c0109b90>] ? do_bounds+0x80/0x80
  [<c010b98f>] ? oops_end+0x9f/0xa0
  [<c0109c0f>] ? do_invalid_op+0x7f/0x90
  [<c01ab016>] ? swap_count_continued+0x176/0x190
  [<c018a939>] ? free_pcppages_bulk+0x2c9/0x2f0
  [<c0105d97>] ? xen_force_evtchn_callback+0x17/0x30
  [<c01065bc>] ? check_events+0x8/0xc
  [<c01065b3>] ? xen_restore_fl_direct_reloc+0x4/0x4
  [<c018b4f6>] ? free_hot_cold_page+0xd6/0x160
  [<c0103ff5>] ? pte_pfn_to_mfn+0xb5/0xd0
  [<c0104071>] ? xen_make_pte+0x41/0x110
  [<c0652fb6>] ? error_code+0x5a/0x60
  [<c0109b90>] ? do_bounds+0x80/0x80
  [<c01ab016>] ? swap_count_continued+0x176/0x190
  [<c01ab157>] ? swap_entry_free+0x127/0x150
  [<c01ac277>] ? free_swap_and_cache+0x27/0xd0
  [<c019dba7>] ? unmap_vmas+0x587/0x7f0
  [<c01065b3>] ? xen_restore_fl_direct_reloc+0x4/0x4
  [<c01036cd>] ? xen_mc_flush+0xdd/0x190
  [<c01a1e0a>] ? exit_mmap+0x8a/0x140
  [<c0132aa1>] ? mmput+0x41/0xd0
  [<c0136afd>] ? exit_mm+0xed/0x110
  [<c0652710>] ? _raw_spin_lock_irq+0x10/0x20
  [<c01380d7>] ? do_exit+0x197/0x760
  [<c04417a7>] ? __xen_evtchn_do_upcall+0x1e7/0x240
  [<c0105d97>] ? xen_force_evtchn_callback+0x17/0x30
  [<c01386cf>] ? do_group_exit+0x2f/0x90
  [<c013873d>] ? sys_exit_group+0xd/0x10
  [<c0652a41>] ? syscall_call+0x7/0xb
  [<c0650000>] ? cpuup_callback+0x100/0x260




Here's the 3.0.1 dump; unfortunately I didn't catch a full dump.

  BUG: unable to handle kernel paging request at f57ba13c
IP: [<c01ae845>] swap_count_continued+0x85/0x190
*pdpt = 0000000000959027 *pde = 00000000008f5067 *pte = 0000000000000000
Oops: 0000 [#1] SMP
Modules linked in:

Pid: 3666, comm: apache2 Not tainted 3.0.1-1 #1
EIP: 0061:[<c01ae845>] EFLAGS: 00010246 CPU: 0
EIP is at swap_count_continued+0x85/0x190
EAX: 00000080 EBX: ed302400 ECX: ecb870a0 EDX: f57ba13c
ESI: 00000080 EDI: ed3d7760 EBP: 0000013c ESP: ea479dec
  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
Process apache2 (pid: 3666, ti=ea478000 task=ebe91bd0 task.ti=ea478000)
Stack:
  ec6915c0 0001913c ee129000 00000040 c01aea77 0001913c 00322780 ec6915c0
  b9275000 c01b0927 00000000 00322780 ea6533a8 c01a2d41 6e484067 80000001
  c01059ef 80000000 00000000 ebad13c0 eae13e48 ec6ebb1c ea479ee8 00000000
Call Trace:
  [<c01aea77>] ? swap_entry_free+0x127/0x150
  [<c01b0927>] ? free_swap_and_cache+0x27/0xd0
  [<c01a2d41>] ? zap_pte_range+0x321/0x420
  [<c01059ef>] ? xen_make_pte+0x3f/0xc0
  [<c01a2f98>] ? unmap_page_range+0x158/0x1a0
  [<c01a3058>] ? unmap_vmas+0x78/0xb0
  [<c01a524e>] ? exit_mmap+0x6e/0xf0
  [<c0136421>] ? mmput+0x41/0xd0
  [<c0139fcd>] ? exit_mm+0xed/0x110
  [<c06c76e0>] ? _raw_spin_lock_irq+0x10/0x20
  [<c013b7e7>] ? do_exit+0x197/0x340
  [<c01a5309>] ? remove_vma_list+0x39/0x50
  [<c013b9bf>] ? do_group_exit+0x2f/0x90
  [<c013ba2d>] ? sys_exit_group+0xd/0x10
  [<c06c7a11>] ? syscall_call+0x7/0xb
Code: 2a 90 8d 74 26 00 e9 15 01 00 00 89 d0 e8 c4 7e f7 ff 8b 5b 18 83 eb 18 39 df 0f 84 e5 00 00 00 89 d8 e8 3f 81 f7 ff 8d 54 05 00 <0f> b6 02 3c 80 74 d9 84 c0 0f 84 e2 00 00 00 83 e8 01 84 c0 88
EIP: [<c01ae845>] swap_count_continued+0x85/0x190 SS:ESP 0069:ea479dec
CR2: 00000000f57ba13c
---[ end trace 36a533bb83dd2812 ]---
Fixing recursive fault but reboot is needed!
BUG: scheduling while atomic: apache2/3666/0x00000001
Modules linked in:
Pid: 3666, comm: apache2 Tainted: G      D     3.0.1-1 #1
Call Trace:
  [<c06c60ed>] ? schedule+0x50d/0x520
  [<c0106a23>] ? xen_restore_fl_direct_reloc+0x4/0x4
  [<c01061d7>] ? xen_force_evtchn_callback+0x17/0x30
  [<c013b92f>] ? do_exit+0x2df/0x340
  [<c0138c3b>] ? printk+0x1b/0x20
  [<c010bf6f>] ? oops_end+0x9f/0xa0
  [<c0120f4f>] ? bad_area_nosemaphore+0xf/0x20
  [<c012149b>] ? do_page_fault+0x1bb/0x420
  [<c0177e85>] ? irq_get_irq_data+0x5/0x10
  [<c047da45>] ? info_for_irq+0x5/0x20
  [<c047e270>] ? evtchn_from_irq+0x10/0x40
  [<c01061d7>] ? xen_force_evtchn_callback+0x17/0x30
  [<c0106a2c>] ? check_events+0x8/0xc
  [<c0106a23>] ? xen_restore_fl_direct_reloc+0x4/0x4
  [<c0104bab>] ? xen_batched_set_pte+0xab/0xf0
  [<c01212e0>] ? vmalloc_fault+0x2c0/0x2c0
  [<c06c7f86>] ? error_code+0x5a/0x60
  [<c01212e0>] ? vmalloc_fault+0x2c0/0x2c0
  [<c01ae845>] ? swap_count_continued+0x85/0x190
  [<c01aea77>] ? swap_entry_free+0x127/0x150
  [<c01b0927>] ? free_swap_and_cache+0x27/0xd0
  [<c01a2d41>] ? zap_pte_range+0x321/0x420
  [<c01059ef>] ? xen_make_pte+0x3f/0xc0
  [<c01a2f98>] ? unmap_page_range+0x158/0x1a0
  [<c01a3058>] ? unmap_vmas+0x78/0xb0
  [<c01a524e>] ? exit_mmap+0x6e/0xf0
  [<c0136421>] ? mmput+0x41/0xd0
  [<c0139fcd>] ? exit_mm+0xed/0x110
  [<c06c76e0>] ? _raw_spin_lock_irq+0x10/0x20
  [<c013b7e7>] ? do_exit+0x197/0x340
  [<c01a5309>] ? remove_vma_list+0x39/0x50
  [<c013b9bf>] ? do_group_exit+0x2f/0x90
  [<c013ba2d>] ? sys_exit_group+0xd/0x10
  [<c06c7a11>] ? syscall_call+0x7/0xb




-- 
Shaun Reitan
Chief Technical Officer
Network Data Center Host, Inc.
http://www.ndchost.com


* Re: kernel BUG at mm/swapfile.c:2527!
  2011-09-15 18:56 kernel BUG at mm/swapfile.c:2527! Shaun Reitan
@ 2011-09-15 19:52 ` Shaun Reitan
  2011-09-16  8:24   ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 7+ messages in thread
From: Shaun Reitan @ 2011-09-15 19:52 UTC (permalink / raw)
  To: xen-devel

I can just about reproduce this bug on the fly; a PCI compliance scan
seems to be triggering it every time.  Let me know what you guys need!

~Shaun


* Re: Re: kernel BUG at mm/swapfile.c:2527!
  2011-09-15 19:52 ` Shaun Reitan
@ 2011-09-16  8:24   ` Konrad Rzeszutek Wilk
  2011-09-16 16:52     ` Shaun Reitan
  2011-09-20  4:33     ` Shaun Reitan
  0 siblings, 2 replies; 7+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-09-16  8:24 UTC (permalink / raw)
  To: Shaun Reitan; +Cc: xen-devel

On Thu, Sep 15, 2011 at 12:52:42PM -0700, Shaun Reitan wrote:
> I can just about reproduce this bug on the fly; a PCI compliance
> scan seems to be triggering it every time.  Let me know what you
> guys need!

How do I reproduce it? Is the PCI compliance easily available? Is
there any chance we can get access to the physical box to figure
out what is happening?

> 
> ~Shaun


* Re: kernel BUG at mm/swapfile.c:2527!
  2011-09-16  8:24   ` Konrad Rzeszutek Wilk
@ 2011-09-16 16:52     ` Shaun Reitan
  2011-09-20  4:33     ` Shaun Reitan
  1 sibling, 0 replies; 7+ messages in thread
From: Shaun Reitan @ 2011-09-16 16:52 UTC (permalink / raw)
  Cc: xen-devel

> How do I reproduce it? Is the PCI compliance easily available? Is
> there any chance we can get access to the physical box to figure
> out what is happening?

At this point I'm not able to reproduce the problem on the fly.  We had
thought it was a PCI compliance scan that was triggering the error, but
now this customer is seeing the error constantly and the scans are not
running.  I'm thrashing a test server that I attempted to set up exactly
like this customer's server, and so far no crash.

The customer's server is crashing like crazy.  I'm attempting to figure
out the trigger, but it's proving difficult.  What do you need to see to
figure out why it's crashing?  I'm willing to do whatever it takes.  I
cannot give you access to the host, but the customer is willing to give
you access to their virtual instance as a last resort.

~Shaun


* Re: kernel BUG at mm/swapfile.c:2527!
  2011-09-16  8:24   ` Konrad Rzeszutek Wilk
  2011-09-16 16:52     ` Shaun Reitan
@ 2011-09-20  4:33     ` Shaun Reitan
  2011-09-22 11:06       ` Konrad Rzeszutek Wilk
  1 sibling, 1 reply; 7+ messages in thread
From: Shaun Reitan @ 2011-09-20  4:33 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel

On 9/16/2011 1:24 AM, Konrad Rzeszutek Wilk wrote:
> How do I reproduce it? Is the PCI compliance easily available? Is
> there any chance we can get access to the physical box to figure
> out what is happening?

Konrad,

Did you get my email with the server I set up for you and the logins?

-- 
Shaun


* Re: Re: kernel BUG at mm/swapfile.c:2527!
  2011-09-20  4:33     ` Shaun Reitan
@ 2011-09-22 11:06       ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 7+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-09-22 11:06 UTC (permalink / raw)
  To: Shaun Reitan; +Cc: xen-devel

On Mon, Sep 19, 2011 at 09:33:40PM -0700, Shaun Reitan wrote:
> On 9/16/2011 1:24 AM, Konrad Rzeszutek Wilk wrote:
> >How do I reproduce it? Is the PCI compliance easily available? Is
> >there any chance we can get access to the physical box to figure
> >out what is happening?
> 
> Konrad,
> 
> Did you get my email with the server I set up for you and the logins?

Yup. I just came back from a conference, so I'm getting back into the groove.
> 
> -- 
> Shaun


* Re: Re: kernel BUG at mm/swapfile.c:2527!
@ 2011-09-19 16:30 Kent Hoxsey
  0 siblings, 0 replies; 7+ messages in thread
From: Kent Hoxsey @ 2011-09-19 16:30 UTC (permalink / raw)
  To: xen-devel

Joining this thread late as a follow-on from a similar problem that is happening on Amazon AWS instances. There is a thread on the AWS forums where an instance owner has figured out how to trigger this bug on demand using Apache and PHP:

https://forums.aws.amazon.com/thread.jspa?messageID=269851

In case those forums require a login, the PHP script to hit is:

<?php
$data = array();
for($x = 0; $x< 10000; $x++)
{
        for($y = 0; $y<1000; $y++){
                $data[][]=rand(1,100000);
        }
}
echo count($data);


I am not a PHP programmer, so I'm unsure whether that php tag needs to be closed, but that is what is posted on the forum. Run ApacheBench (ab) against your test URL with 200 concurrent connections.
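
For anyone who wants to try this, the load run described above could look roughly like the sketch below. Only the 200 concurrent connections come from the forum post. The URL (http://localhost/test.php) and the total of 10000 requests are placeholders I made up, so adjust both for your setup. The script only prints the command unless RUN=1 is set and ab is installed.

```shell
#!/bin/sh
# Sketch of the ApacheBench run described above.
# ASSUMPTIONS (not from the thread): the URL and the total request count.
URL="http://localhost/test.php"   # hypothetical location of the PHP script
CONCURRENCY=200                   # from the forum post
REQUESTS=10000                    # hypothetical total number of requests

# Print the ab command line: -n is total requests, -c is concurrency.
ab_cmd() {
    printf 'ab -n %s -c %s %s\n' "$REQUESTS" "$CONCURRENCY" "$URL"
}

# Always show what would be run.
ab_cmd

# Actually generate load only when asked to and when ab is available.
if [ "${RUN:-0}" = "1" ] && command -v ab >/dev/null 2>&1; then
    ab -n "$REQUESTS" -c "$CONCURRENCY" "$URL"
fi
```

Run it as `RUN=1 sh ./ab-repro.sh` (the file name is arbitrary) on a test guest only, since a successful reproduction takes the kernel down.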

My Amazon instance isn't running PHP but encounters a similar problem once a day (1:48pm Pacific). I cannot allow people onto the instance but am willing to run diagnostics and post them here.

Kent
