* [pv_ops domU] 2.6.30.5 - unable to handle kernel paging request / refresh_cpu_vm_stats / vmstat_update
@ 2009-08-18  0:26 Christopher S. Aker
  2009-08-18  4:30 ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 16+ messages in thread
From: Christopher S. Aker @ 2009-08-18  0:26 UTC (permalink / raw)
  To: xen devel; +Cc: Jeremy Fitzhardinge

Xen 3.3.4 64 bit
2.6.18 dom0 PAE
2.6.30.5 PAE domU

BUG: unable to handle kernel paging request at 96443ad8
IP: [<c0172b50>] refresh_cpu_vm_stats+0x70/0xc0
*pdpt = 000000047b6e8027
Oops: 0002 [#1] SMP
last sysfs file:
Modules linked in:

Pid: 14, comm: events/3 Not tainted (2.6.30.5-linode20 #1)
EIP: 0061:[<c0172b50>] EFLAGS: 00010246 CPU: 3
EIP is at refresh_cpu_vm_stats+0x70/0xc0
EAX: 96443a80 EBX: ffffffff ECX: 00000200 EDX: f57520c0
ESI: c06fb300 EDI: c06fb100 EBP: d6073f2c ESP: d6073f28
  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
Process events/3 (pid: 14, ti=d6072000 task=d605fc00 task.ti=d6072000)
Stack:
  00000200 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 d7033584 d7036b80 d7033580 c0172ba0
Call Trace:
  [<c0172ba0>] ? vmstat_update+0x0/0x30
  [<c0172bab>] ? vmstat_update+0xb/0x30
  [<c013afac>] ? worker_thread+0x12c/0x1c0
  [<c011ee9b>] ? __wake_up_common+0x4b/0x80
  [<c013ddb0>] ? autoremove_wake_function+0x0/0x40
  [<c013ae80>] ? worker_thread+0x0/0x1c0
  [<c013dab9>] ? kthread+0x49/0x80
  [<c013da70>] ? kthread+0x0/0x80
  [<c0108757>] ? kernel_thread_helper+0x7/0x10
Code: 15 00 74 2f ff 15 1c 2c 6d c0 89 c1 ff 15 24 2c 6d c0 0f be 5c 16 
15 89 c8 c6 44 16 15 00 ff 15 20 2c 6d c0 8d 84 97 80 06 00 00 <f0> 01 
58 58 01 5c 95 00 83 c2 01 83 fa 13 75 c2 e8 4b 7c 42 00
EIP: [<c0172b50>] refresh_cpu_vm_stats+0x70/0xc0 SS:ESP 0069:d6073f28
CR2: 0000000096443ad8
---[ end trace c608c08376e3b403 ]---

Full log: http://p.linode.com/2866

-Chris
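
For readers following the oops above, the faulting address can be cross-checked against the register dump. This is a minimal sketch and was not posted in the thread: the instruction decoding in the comments is a hand disassembly of the "Code:" bytes and should be treated as an assumption, and the helper names are made up for illustration. It checks that EAX matches the preceding lea, that EAX + 0x58 equals CR2, and that error code 0002 describes a kernel-mode write to a non-present page.

```python
MASK32 = 0xFFFFFFFF  # 32-bit PAE guest: address arithmetic wraps at 2**32

# Register values copied from the oops above.
regs = {"eax": 0x96443A80, "ebx": 0xFFFFFFFF,
        "edx": 0xF57520C0, "edi": 0xC06FB100}
cr2 = 0x96443AD8


def lea_target(base, index, scale, disp):
    """Effective address of disp(base,index,scale), truncated to 32 bits."""
    return (base + index * scale + disp) & MASK32


def decode_error_code(code):
    """Split the low three bits of an x86 page-fault error code."""
    return {
        "page_present": bool(code & 1),  # 0 means the page was not present
        "write": bool(code & 2),         # 1 means the access was a write
        "user_mode": bool(code & 4),     # 0 means the fault came from kernel mode
    }


# Assumed decoding of "8d 84 97 80 06 00 00": lea 0x680(%edi,%edx,4),%eax
computed_eax = lea_target(regs["edi"], regs["edx"], 4, 0x680)

# Assumed decoding of "<f0> 01 58 58": lock add %ebx,0x58(%eax)
fault_addr = (regs["eax"] + 0x58) & MASK32

print(f"lea result {computed_eax:08x}, EAX in dump {regs['eax']:08x}")
print(f"EAX + 0x58 {fault_addr:08x}, CR2 in dump {cr2:08x}")
print(decode_error_code(0x0002))
```

If the hand decoding is right, EDX is the loop index that the following bytes compare against 0x13, so a value of f57520c0 would be far out of range. That is an observation from the dump, not a diagnosis made by anyone in the thread.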


* Re: [pv_ops domU] 2.6.30.5 - unable to handle kernel paging request / refresh_cpu_vm_stats / vmstat_update
  2009-08-18  0:26 [pv_ops domU] 2.6.30.5 - unable to handle kernel paging request / refresh_cpu_vm_stats / vmstat_update Christopher S. Aker
@ 2009-08-18  4:30 ` Jeremy Fitzhardinge
  2009-08-18 14:16   ` Christopher S. Aker
  0 siblings, 1 reply; 16+ messages in thread
From: Jeremy Fitzhardinge @ 2009-08-18  4:30 UTC (permalink / raw)
  To: Christopher S. Aker; +Cc: xen devel

On 08/17/09 17:26, Christopher S. Aker wrote:
> Xen 3.3.4 64 bit
> 2.6.18 dom0 PAE
> 2.6.30.5 PAE domU
>
> BUG: unable to handle kernel paging request at 96443ad8
> IP: [<c0172b50>] refresh_cpu_vm_stats+0x70/0xc0
> *pdpt = 000000047b6e8027
> Oops: 0002 [#1] SMP
> last sysfs file:
> Modules linked in:
>
> Pid: 14, comm: events/3 Not tainted (2.6.30.5-linode20 #1)
> EIP: 0061:[<c0172b50>] EFLAGS: 00010246 CPU: 3
> EIP is at refresh_cpu_vm_stats+0x70/0xc0
> EAX: 96443a80 EBX: ffffffff ECX: 00000200 EDX: f57520c0
> ESI: c06fb300 EDI: c06fb100 EBP: d6073f2c ESP: d6073f28
>  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
> Process events/3 (pid: 14, ti=d6072000 task=d605fc00 task.ti=d6072000)
> Stack:
>  00000200 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
>  00000000 00000000 00000000 00000000 d7033584 d7036b80 d7033580 c0172ba0
> Call Trace:
>  [<c0172ba0>] ? vmstat_update+0x0/0x30
>  [<c0172bab>] ? vmstat_update+0xb/0x30
>  [<c013afac>] ? worker_thread+0x12c/0x1c0
>  [<c011ee9b>] ? __wake_up_common+0x4b/0x80
>  [<c013ddb0>] ? autoremove_wake_function+0x0/0x40
>  [<c013ae80>] ? worker_thread+0x0/0x1c0
>  [<c013dab9>] ? kthread+0x49/0x80
>  [<c013da70>] ? kthread+0x0/0x80
>  [<c0108757>] ? kernel_thread_helper+0x7/0x10
> Code: 15 00 74 2f ff 15 1c 2c 6d c0 89 c1 ff 15 24 2c 6d c0 0f be 5c
> 16 15 89 c8 c6 44 16 15 00 ff 15 20 2c 6d c0 8d 84 97 80 06 00 00 <f0>
> 01 58 58 01 5c 95 00 83 c2 01 83 fa 13 75 c2 e8 4b 7c 42 00
> EIP: [<c0172b50>] refresh_cpu_vm_stats+0x70/0xc0 SS:ESP 0069:d6073f28
> CR2: 0000000096443ad8
> ---[ end trace c608c08376e3b403 ]---
>
> Full log: http://p.linode.com/2866

Does this happen every time, sometimes or just once?  Is it new with
2.6.30.5?  Are you using CONFIG_HIGHPTE?  What's the kernel config?

The top of the log appears to be missing.

Thanks,
    J


* Re: [pv_ops domU] 2.6.30.5 - unable to handle kernel paging request / refresh_cpu_vm_stats / vmstat_update
  2009-08-18  4:30 ` Jeremy Fitzhardinge
@ 2009-08-18 14:16   ` Christopher S. Aker
  2009-08-18 20:15     ` Jeremy Fitzhardinge
  2009-08-21 16:16     ` Jeremy Fitzhardinge
  0 siblings, 2 replies; 16+ messages in thread
From: Christopher S. Aker @ 2009-08-18 14:16 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: xen devel

Jeremy Fitzhardinge wrote:
> Does this happen every time, sometimes or just once?

Completely reproducible, however I was wrong about the Xen version - it 
is 3.2.1-rc5.  FWIW, I'm unable to reproduce this on a dev box running 
3.4.1.  I haven't tried any other Xen versions (we have many running).

> Is it new with 2.6.30.5?

Yes.  The last pv_ops kernel we ran was 2.6.29.

> Are you using CONFIG_HIGHPTE?

# CONFIG_HIGHPTE is not set

> What's the kernel config?

http://p.linode.com/2869

> The top of the log appears to be missing.

Full log from another boot: http://p.linode.com/2868

Thanks!
-Chris


* Re: [pv_ops domU] 2.6.30.5 - unable to handle kernel paging request / refresh_cpu_vm_stats / vmstat_update
  2009-08-18 14:16   ` Christopher S. Aker
@ 2009-08-18 20:15     ` Jeremy Fitzhardinge
  2009-08-18 20:56       ` Jeremy Fitzhardinge
  2009-08-20 19:11       ` Jed Smith
  2009-08-21 16:16     ` Jeremy Fitzhardinge
  1 sibling, 2 replies; 16+ messages in thread
From: Jeremy Fitzhardinge @ 2009-08-18 20:15 UTC (permalink / raw)
  To: Christopher S. Aker; +Cc: xen devel, Keir Fraser

On 08/18/09 07:16, Christopher S. Aker wrote:
> Jeremy Fitzhardinge wrote:
>> Does this happen every time, sometimes or just once?
>
> Completely reproducible, however I was wrong about the Xen version -
> it is 3.2.1-rc5.  FWIW, I'm unable to reproduce this on a dev box
> running 3.4.1.

That's curious.  Keir, do you know of any bugs which could cause domU
misbehaviour like this?

> I haven't tried any other Xen versions (we have many running).
>
>> Is it new with 2.6.30.5?
>
> Yes.  The last pv_ops kernel we ran was 2.6.29.

Have you tried any other distros?  I'll try to repro with a current Xen
and my Fedora system.

It would be helpful if you could try 2.6.30/.[1-4] as well to see if the
problem arises.  I'm trying a bisection myself, but it sounds like I
shouldn't be able to repro it.

    J


* Re: Re: [pv_ops domU] 2.6.30.5 - unable to handle kernel paging request / refresh_cpu_vm_stats / vmstat_update
  2009-08-18 20:15     ` Jeremy Fitzhardinge
@ 2009-08-18 20:56       ` Jeremy Fitzhardinge
  2009-08-20 19:11       ` Jed Smith
  1 sibling, 0 replies; 16+ messages in thread
From: Jeremy Fitzhardinge @ 2009-08-18 20:56 UTC (permalink / raw)
  To: Christopher S. Aker; +Cc: xen devel, Keir Fraser

On 08/18/09 13:15, Jeremy Fitzhardinge wrote:
> It would be helpful if you could try 2.6.30/.[1-4] as well to see if the
> problem arises.  I'm trying a bisection myself, but it sounds like I
> shouldn't be able to repro it.
>   

Yeah, I had no problem with v2.6.30.5 running on Xen 3.3.1 (the
XenServer product version).  I haven't tried with your specific config yet.

    J


* Re: Re: [pv_ops domU] 2.6.30.5 - unable to handle kernel paging request / refresh_cpu_vm_stats / vmstat_update
  2009-08-18 20:15     ` Jeremy Fitzhardinge
  2009-08-18 20:56       ` Jeremy Fitzhardinge
@ 2009-08-20 19:11       ` Jed Smith
  2009-08-20 23:31         ` Jeremy Fitzhardinge
  2009-08-21 16:16         ` Jeremy Fitzhardinge
  1 sibling, 2 replies; 16+ messages in thread
From: Jed Smith @ 2009-08-20 19:11 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: xen devel, Keir Fraser

Jeremy Fitzhardinge wrote:
>>> Is it new with 2.6.30.5?

Perhaps earlier, and we're just now running into it.  I am able to
reproduce on the v2.6.30 release.  My initial bisect leads me here (from
bad=v2.6.30 and good=v2.6.29 in linux-2.6.git):

commit 9049a11de73d3ecc623f1903100d099f82ede56c
Merge: c47c1b1 e4d0407
Author: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Date:   Wed Feb 11 11:52:22 2009 -0800

    Merge commit 'remotes/tip/x86/paravirt' into x86/untangle2

I note astutely, however, that's a pretty large merge commit.

> Have you tried any other distros?  I'll try to repro with a current Xen
> and my Fedora system.

I used an Arch domU to test, as this happens a few steps into init's run
there.  The process that bugs varies widely, but it's always a few
scripts in.  We can reproduce this on two versions of our software
stack, which both run Xen 3.2.1-rc5 (xm info from one):

release                : 2.6.18.8-524-1
version                : #1 SMP Tue Apr 22 16:31:28 EDT 2008
machine                : i686

xen_major              : 3
xen_minor              : 2
xen_extra              : .1-rc5
xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_scheduler          : credit
xen_pagesize           : 4096

cc_compiler            : gcc version 4.0.3 (Ubuntu 4.0.3-1ubuntu5)
cc_compile_date        : Fri Apr 11 11:24:13 EDT 2008

Newer hypervisors starting with v3.3.0 do not exhibit this behavior.

Now then, the bisection --

I ended up at 9049a11 in linux-2.6.git as noted above, and I tried to
identify those patches in xen.git.  I'm not entirely sure my bisection
from that point was accurate (I could not reproduce a stack trace), and
I'll let you bisect it given your familiarity with xen.git.

I have a feeling the hypervisor version is important here as, again,
v3.3.0 and up do not BUG.

What's interesting is that they all stack trace, but the location
changes.  Here is an example from my bisection at f402a65:

-------

kernel BUG at kernel/sched.c:1184!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/block/ram11/removable
Modules linked in:

Pid: 1196, comm: sed Not tainted (2.6.29-rc4-bisect-00246-gf402a65 #12)
EIP: 0061:[<c011f1d3>] EFLAGS: 00010046 CPU: 2
EIP is at resched_task+0x63/0x70
EAX: 00000000 EBX: c05b3a80 ECX: 00000000 EDX: 00000000
ESI: d60d37f0 EDI: c12db200 EBP: 00000001 ESP: d4de9e20
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
Process sed (pid: 1196, ti=d4de8000 task=d5ae7030 task.ti=d4de8000)
Stack:
 c05b3a80 d6252030 c01260c7 00000001 00000003 00000000 d4f09e44 c12d1068
 00000001 00000001 c013f78b d4de9ea0 d4f09e44 c12d1068 c0120513 d4de9ea0
 00000003 c12d1074 c12d1070 d4de9ea0 00000001 00000200 c0120c4e 00000000
Call Trace:
 [<c01260c7>] try_to_wake_up+0xb7/0x1f0
 [<c013f78b>] autoremove_wake_function+0x1b/0x50
 [<c0120513>] __wake_up_common+0x43/0x70
 [<c0120c4e>] __wake_up+0x3e/0x60
 [<c013f6de>] __wake_up_bit+0x2e/0x40
 [<c0177719>] __do_fault+0x239/0x450
 [<c0165500>] filemap_fault+0x0/0x400
 [<c017962a>] handle_mm_fault+0x16a/0x900
 [<c01051ee>] __raw_callee_save_xen_restore_fl+0x6/0x8
 [<c018872c>] kfree+0x6c/0x80
 [<c0118e24>] do_page_fault+0x114/0x240
 [<c0118d10>] do_page_fault+0x0/0x240
 [<c05af91a>] error_code+0x72/0x78
Code: a1 04 61 79 c0 39 c2 74 0e 0f ae f0 89 f6 8b 46 04 f6 40 0c 04 74
09 5b 5e c3 8d b6 00 00 00 00 89 d0 ff 15 50 df 6d c0 5b 5e c3 <0f> 0b
eb fe 89 f6 8d bc 27 00 00 00 00 53 89 c3 8b 0c 85 80 f4
EIP: [<c011f1d3>] resched_task+0x63/0x70 SS:ESP 0069:d4de9e20

-------

I have saved everything from every bisect run, and uploaded it here:

   http://lateralus.jedsmith.org/

Let me know if I can help further.


Yours,

Jed Smith
Systems Developer
Linode, LLC
+1 (609) 593-7103 x1209
jsmith@linode.com
PGP: 0xA6611ED6




* Re: Re: [pv_ops domU] 2.6.30.5 - unable to handle kernel paging request / refresh_cpu_vm_stats / vmstat_update
  2009-08-20 19:11       ` Jed Smith
@ 2009-08-20 23:31         ` Jeremy Fitzhardinge
  2009-08-21  0:13           ` Jed Smith
  2009-08-21 16:16         ` Jeremy Fitzhardinge
  1 sibling, 1 reply; 16+ messages in thread
From: Jeremy Fitzhardinge @ 2009-08-20 23:31 UTC (permalink / raw)
  To: Jed Smith; +Cc: xen devel, Keir Fraser

On 08/20/09 12:11, Jed Smith wrote:
> Jeremy Fitzhardinge wrote:
>   
>>>> Is it new with 2.6.30.5?
>>>>         
> Perhaps earlier, and we're just now running into it.  I am able to
> reproduce on the v2.6.30 release.  My initial bisect leads me here (from
> bad=v2.6.30 and good=v2.6.29 in linux-2.6.git):
>
> commit 9049a11de73d3ecc623f1903100d099f82ede56c
> Merge: c47c1b1 e4d0407
> Author: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
> Date:   Wed Feb 11 11:52:22 2009 -0800
>
>     Merge commit 'remotes/tip/x86/paravirt' into x86/untangle2
>
> I note astutely, however, that's a pretty large merge commit.
>   

Yeah.  If it shows up in that merge, then obviously it must be one of
the constituent changes which is provoking the problem, unless there's
something about the merge itself.

Just to double check, can you test the two parents,
c47c1b1f3a9d6973108020df1dcab7604f7774dd and
e4d0407185cdbdcfd99fc23bde2e5454bbc46329, to see if one or both exhibits
the problem?

I see your bisect tested c47c1b1f3a9 as OK, so try e4d0407185.

    J


* Re: Re: [pv_ops domU] 2.6.30.5 - unable to handle kernel paging request / refresh_cpu_vm_stats / vmstat_update
  2009-08-20 23:31         ` Jeremy Fitzhardinge
@ 2009-08-21  0:13           ` Jed Smith
  2009-08-21  0:38             ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 16+ messages in thread
From: Jed Smith @ 2009-08-21  0:13 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: xen devel, Keir Fraser

Jeremy Fitzhardinge wrote:
> I see your bisect tested c47c1b1f3a9 as OK, so try e4d0407185.

jsmith@lindev7:~/linux-2.6$ git reset --hard e4d0407185
HEAD is now at e4d0407 xen: use direct ops on 64-bit

http://lateralus.jedsmith.org/Jeremy1/

Fails as before there.  I bisected using that commit as the right side
before, but most of the kernels I built failed to boot before the first
printk(); I wasn't sure if there was something environmental going on.

Jed


* Re: Re: [pv_ops domU] 2.6.30.5 - unable to handle kernel paging request / refresh_cpu_vm_stats / vmstat_update
  2009-08-21  0:13           ` Jed Smith
@ 2009-08-21  0:38             ` Jeremy Fitzhardinge
  2009-08-21 14:57               ` Jed Smith
  0 siblings, 1 reply; 16+ messages in thread
From: Jeremy Fitzhardinge @ 2009-08-21  0:38 UTC (permalink / raw)
  To: Jed Smith; +Cc: xen devel, Keir Fraser

On 08/20/09 17:13, Jed Smith wrote:
> Jeremy Fitzhardinge wrote:
>   
>> I see your bisect tested c47c1b1f3a9 as OK, so try e4d0407185.
>>     
> jsmith@lindev7:~/linux-2.6$ git reset --hard e4d0407185
> HEAD is now at e4d0407 xen: use direct ops on 64-bit
>
> http://lateralus.jedsmith.org/Jeremy1/
>
> Fails as before there.  I bisected using that commit as the right side
> before, but most of the kernels I built failed to boot before the first
> printk(); I wasn't sure if there was something environmental going on.
>   

Yes, there's a lot of stuff on that line which will not bisect well;
there were substantial changes to percpu data handling, and a change to
some of the core pvops calling conventions.  It will take care to choose
good bisection points to narrow things down.

But it also sounds like it could be on the Xen side.  Would it be
possible to try and narrow down where Xen stops breaking?

    J
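
The point about commits that "will not bisect well" can be illustrated with a bisection that tolerates untestable steps, in the spirit of marking a commit as skipped when it fails to boot. This is a simplified sketch under stated assumptions: history is treated as a linear list (real git history is a graph, and the merge discussed here makes that matter), the first entry is known good and the last known bad, and `bisect_with_skips`, the commit labels, and the `test` callback are all hypothetical names. The same search would apply to an ordered list of Xen releases when looking for the first one that stops breaking.

```python
def bisect_with_skips(commits, test):
    """Narrow a good..bad range, stepping around untestable commits.

    commits: ordered list, commits[0] known good, commits[-1] known bad.
    test:    callable returning "good", "bad" or "skip" for one commit.
    Returns (last_known_good, first_known_bad). The culprit lies after
    last_known_good, up to and including first_known_bad.
    """
    verdicts = {}  # cache so no commit is built and booted twice

    def check(i):
        if i not in verdicts:
            verdicts[i] = test(commits[i])
        return verdicts[i]

    lo, hi = 0, len(commits) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        # Probe the midpoint first, then its nearest neighbours, until
        # some commit strictly inside the range gives a real verdict.
        for i in sorted(range(lo + 1, hi), key=lambda j: abs(j - mid)):
            verdict = check(i)
            if verdict != "skip":
                break
        else:
            break  # everything left in the range is untestable
        if verdict == "good":
            lo = i
        else:
            hi = i
    return commits[lo], commits[hi]


if __name__ == "__main__":
    # Hypothetical history: c6 introduces the bug, c4 and c5 do not boot.
    history = [f"c{n}" for n in range(10)]

    def fake_test(commit):
        n = int(commit[1:])
        if n in (4, 5):
            return "skip"
        return "bad" if n >= 6 else "good"

    print(bisect_with_skips(history, fake_test))
```

When skipped commits sit next to the culprit, the result is a range rather than a single commit, which is why well-chosen bisection points matter on a line of history like this one.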


* Re: Re: [pv_ops domU] 2.6.30.5 - unable to handle kernel paging request / refresh_cpu_vm_stats / vmstat_update
  2009-08-21  0:38             ` Jeremy Fitzhardinge
@ 2009-08-21 14:57               ` Jed Smith
  2009-08-21 16:20                 ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 16+ messages in thread
From: Jed Smith @ 2009-08-21 14:57 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: xen devel, Keir Fraser

Jeremy Fitzhardinge wrote:
> But it also sounds like it could be on the Xen side.  Would it be
> possible to try and narrow down where Xen stops breaking?

We can narrow it down to between Xen 3.2.1-rc5 and 3.3.0.  Are you able
to reproduce on Xen 3.2.1-rc5 or earlier?

I have a Linode under 3.2.1-rc5, and you're more than welcome to use it
(I've been reproducing with pv-grub under that hypervisor).  Do you want
credentials to give it a whirl?

-Jed


* Re: [pv_ops domU] 2.6.30.5 - unable to handle kernel paging request / refresh_cpu_vm_stats / vmstat_update
  2009-08-18 14:16   ` Christopher S. Aker
  2009-08-18 20:15     ` Jeremy Fitzhardinge
@ 2009-08-21 16:16     ` Jeremy Fitzhardinge
  1 sibling, 0 replies; 16+ messages in thread
From: Jeremy Fitzhardinge @ 2009-08-21 16:16 UTC (permalink / raw)
  To: Christopher S. Aker; +Cc: xen devel

On 08/18/09 07:16, Christopher S. Aker wrote:
>> What's the kernel config?
>
> http://p.linode.com/2869

That's the 2.6.18 dom0 config.  What's the crashing domU config?

    J


* Re: Re: [pv_ops domU] 2.6.30.5 - unable to handle kernel paging request / refresh_cpu_vm_stats / vmstat_update
  2009-08-20 19:11       ` Jed Smith
  2009-08-20 23:31         ` Jeremy Fitzhardinge
@ 2009-08-21 16:16         ` Jeremy Fitzhardinge
  2009-08-21 17:02           ` Jed Smith
  1 sibling, 1 reply; 16+ messages in thread
From: Jeremy Fitzhardinge @ 2009-08-21 16:16 UTC (permalink / raw)
  To: Jed Smith; +Cc: xen devel, Keir Fraser

On 08/20/09 12:11, Jed Smith wrote:
> Jeremy Fitzhardinge wrote:
>   
>>>> Is it new with 2.6.30.5?
>>>>         
> Perhaps earlier, and we're just now running into it.  I am able to
> reproduce on the v2.6.30 release.  My initial bisect leads me here (from
> bad=v2.6.30 and good=v2.6.29 in linux-2.6.git):
>
> commit 9049a11de73d3ecc623f1903100d099f82ede56c
> Merge: c47c1b1 e4d0407
> Author: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
> Date:   Wed Feb 11 11:52:22 2009 -0800
>
>     Merge commit 'remotes/tip/x86/paravirt' into x86/untangle2
>
> I note astutely, however, that's a pretty large merge commit.
>
>   
>> Have you tried any other distros?  I'll try to repro with a current Xen
>> and my Fedora system.
>>     
> I used an Arch domU to test, as this happens a few steps into init's run
> there.  The process that bugs varies widely, but it's always a few
> scripts in.  We can reproduce this on two versions of our software
> stack, which both run Xen 3.2.1-rc5 (xm info from one):
>
> release                : 2.6.18.8-524-1
> version                : #1 SMP Tue Apr 22 16:31:28 EDT 2008
> machine                : i686
>
> xen_major              : 3
> xen_minor              : 2
> xen_extra              : .1-rc5
> xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
> xen_scheduler          : credit
> xen_pagesize           : 4096
>
> cc_compiler            : gcc version 4.0.3 (Ubuntu 4.0.3-1ubuntu5)
> cc_compile_date        : Fri Apr 11 11:24:13 EDT 2008
>
> Newer hypervisors starting with v3.3.0 do not exhibit this behavior.
>
> Now then, the bisection --
>
> I ended up at 9049a11 in linux-2.6.git as noted above, and I tried to
> identify those patches in xen.git.  I'm not entirely sure my bisection
> from that point was accurate (I could not reproduce a stack trace), and
> I'll let you bisect it given your familiarity with xen.git.
>
> I have a feeling the hypervisor version is important here as, again,
> v3.3.0 and up do not BUG.
>
> What's interesting is that they all stack trace, but the location
> changes.  Here is an example from my bisection at f402a65:
>   

Do you have CONFIG_PARAVIRT_SPINLOCKS enabled?  That uses some
mechanisms that were not well exercised or tested on older versions of
Xen, and some fixes went in for them.

    J


* Re: Re: [pv_ops domU] 2.6.30.5 - unable to handle kernel paging request / refresh_cpu_vm_stats / vmstat_update
  2009-08-21 14:57               ` Jed Smith
@ 2009-08-21 16:20                 ` Jeremy Fitzhardinge
  2009-08-21 16:30                   ` Jed Smith
  0 siblings, 1 reply; 16+ messages in thread
From: Jeremy Fitzhardinge @ 2009-08-21 16:20 UTC (permalink / raw)
  To: Jed Smith; +Cc: xen devel, Keir Fraser

On 08/21/09 07:57, Jed Smith wrote:
> I have a Linode under 3.2.1-rc5, and you're more than welcome to use it
> (I've been reproducing with pv-grub under that hypervisor).  Do you want
> credentials to give it a whirl?

Well, to be honest, I can't say I can get very excited about tracking
down where a bug *disappeared*.  Do you have a strong need to support
current kernels on old versions of Xen, or would it be simpler to update
Xen?

(Though I suspect that PARAVIRT_SPINLOCKS may be the explanation in this
case, assuming you have them enabled.)

    J


* Re: Re: [pv_ops domU] 2.6.30.5 - unable to handle kernel paging request / refresh_cpu_vm_stats / vmstat_update
  2009-08-21 16:20                 ` Jeremy Fitzhardinge
@ 2009-08-21 16:30                   ` Jed Smith
  0 siblings, 0 replies; 16+ messages in thread
From: Jed Smith @ 2009-08-21 16:30 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: xen devel, Keir Fraser

Jeremy Fitzhardinge wrote:
> Well, to be honest, I can't say I can get very excited about tracking
> down where a bug *disappeared*.  Do you have a strong need to support
> current kernels on old versions of Xen, or would it be simpler to update
> Xen?

We understand!  You have a lot on your plate.  We had reached that
conclusion already, and we just wondered if there was an easy fix.
Soaking dev time into a bug that has disappeared is never appealing.
We'll queue upgrades for our affected machines.

> (Though I suspect that PARAVIRT_SPINLOCKS may be the explanation in this
> case, assuming you have them enabled.)

Don't have *PARAVIRT_SPINLOCK* in .config for my kernels...hm.  Just
stuff for debugging spinlocks and the usual PARAVIRT options.

Jed



* Re: Re: [pv_ops domU] 2.6.30.5 - unable to handle kernel paging request / refresh_cpu_vm_stats / vmstat_update
  2009-08-21 16:16         ` Jeremy Fitzhardinge
@ 2009-08-21 17:02           ` Jed Smith
  2009-08-21 17:44             ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 16+ messages in thread
From: Jed Smith @ 2009-08-21 17:02 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: xen devel, Keir Fraser

Jeremy Fitzhardinge wrote:
> Do you have CONFIG_PARAVIRT_SPINLOCKS enabled?  That uses some
> mechanisms that were not well exercised or tested on older versions of
> Xen, and some fixes went in for them.

And setting that to "n" fixes the problem.  Thanks!  Duly noted.

-Jed
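
Since the outcome hinged on one option in the domU .config, a small checker for that file may be useful. This is a sketch, not a tool used in the thread: `kconfig_state` and the sample text are hypothetical, and it only assumes the usual .config conventions, where an enabled option appears as `CONFIG_FOO=y` and a disabled one as the comment `# CONFIG_FOO is not set` (the form quoted earlier for CONFIG_HIGHPTE).

```python
def kconfig_state(config_text, option):
    """Return the value of a Kconfig option from .config text.

    Returns the string after '=' (for example "y" or "m"), "n" when the
    file carries the "# CONFIG_X is not set" comment, and None when the
    option does not appear at all.
    """
    name = option if option.startswith("CONFIG_") else "CONFIG_" + option
    for raw in config_text.splitlines():
        line = raw.strip()
        if line == f"# {name} is not set":
            return "n"
        # Match "NAME=" exactly so CONFIG_PARAVIRT does not match
        # CONFIG_PARAVIRT_SPINLOCKS.
        if line.startswith(name + "="):
            return line.split("=", 1)[1]
    return None


if __name__ == "__main__":
    # Hypothetical excerpt, not the config from the thread.
    sample = "\n".join([
        "CONFIG_PARAVIRT=y",
        "CONFIG_PARAVIRT_SPINLOCKS=y",
        "# CONFIG_HIGHPTE is not set",
    ])
    for opt in ("PARAVIRT_SPINLOCKS", "HIGHPTE", "XEN"):
        print(opt, kconfig_state(sample, opt))
```

Distinguishing "not set" from "absent" matters here, because an option missing from the file is not the same as one that has been explicitly disabled.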


* Re: Re: [pv_ops domU] 2.6.30.5 - unable to handle kernel paging request / refresh_cpu_vm_stats / vmstat_update
  2009-08-21 17:02           ` Jed Smith
@ 2009-08-21 17:44             ` Jeremy Fitzhardinge
  0 siblings, 0 replies; 16+ messages in thread
From: Jeremy Fitzhardinge @ 2009-08-21 17:44 UTC (permalink / raw)
  To: Jed Smith; +Cc: xen devel, Keir Fraser

On 08/21/09 10:02, Jed Smith wrote:
> Jeremy Fitzhardinge wrote:
>   
>> Do you have CONFIG_PARAVIRT_SPINLOCKS enabled?  That uses some
>> mechanisms that were not well exercised or tested on older versions of
>> Xen, and some fixes went in for them.
>>     
> And setting that to "n" fixes the problem.  Thanks!  Duly noted.
>   

OK, good to know.  I should probably put a version check in there.

    J
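
The version check mentioned here would live in the kernel, and the thread does not show it. As a rough administrator-side analogue only, the sketch below parses the `xm info` fields quoted earlier and flags hypervisors older than 3.3, the boundary reported in this thread ("v3.3.0 and up do not BUG"). The function names are hypothetical, and the 3.3 threshold is the thread's observation rather than a confirmed fix boundary.

```python
# Fields copied from the xm info output quoted earlier in the thread.
XM_INFO = """\
xen_major              : 3
xen_minor              : 2
xen_extra              : .1-rc5
xen_scheduler          : credit
"""


def parse_xm_info(text):
    """Turn 'key : value' lines into a dict; lines without ':' are ignored."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def xen_version(info):
    """Return (major, minor) as integers."""
    return int(info["xen_major"]), int(info["xen_minor"])


def needs_spinlock_workaround(info, first_good=(3, 3)):
    """True when the hypervisor predates the first version reported good.

    first_good defaults to (3, 3) based on the reports in this thread;
    treat it as an assumption, not a documented boundary.
    """
    return xen_version(info) < first_good


if __name__ == "__main__":
    info = parse_xm_info(XM_INFO)
    print(xen_version(info), info["xen_extra"],
          needs_spinlock_workaround(info))
```

On a host that reports an affected version, the workaround confirmed above is to build the domU kernel with CONFIG_PARAVIRT_SPINLOCKS disabled, or to upgrade Xen.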


