All of lore.kernel.org
* Xen BUG at mm-locks.h:118 in 4.2.1 - mm locking order violation - Dom0 reboot
@ 2013-02-18 10:47 Sylvain Munaut
  2013-02-18 11:05 ` Jan Beulich
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Sylvain Munaut @ 2013-02-18 10:47 UTC (permalink / raw)
  To: xen-devel

Hi,


I've just installed a self-built Xen 4.2.1 package on a Debian wheezy
system, and when trying to run an HVM VM (that I was previously running
with the official Xen 4.0 package on squeeze), it starts fine and I can
even use the VM for a few minutes, but then I suddenly lose all
communication with the VM and the Dom0, and it just reboots ...

I enabled the xen serial console and this is what I got when the crash happens:


(XEN) mm locking order violation: 260 > 222
(XEN) Xen BUG at mm-locks.h:118
(XEN) ----[ Xen-4.2.1  x86_64  debug=n  Not tainted ]----
(XEN) CPU:    0
(XEN) RIP:    e008:[<ffff82c4801e15fd>] p2m_pod_demand_populate+0x87d/0x8a0
(XEN) RFLAGS: 0000000000010296   CONTEXT: hypervisor
(XEN) rax: ffff82c4802e8e20   rbx: ffff8302278ee820   rcx: 0000000000000000
(XEN) rdx: ffff82c48029ff18   rsi: 000000000000000a   rdi: ffff82c480258640
(XEN) rbp: 0000000000000000   rsp: ffff82c48029f978   r8:  0000000000000004
(XEN) r9:  0000000000000003   r10: 0000000000000002   r11: ffff82c4802c8c80
(XEN) r12: 0000000000000000   r13: ffff83022795f000   r14: 000000000005f70a
(XEN) r15: 000000000005fb0a   cr0: 0000000080050033   cr4: 00000000000026f0
(XEN) cr3: 00000002277ac000   cr2: 00000000d8b86058
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
(XEN) Xen stack trace from rsp=ffff82c48029f978:
(XEN)    000000000017e26b 000000000005fb0b ffff8302278eed08 000000010a040000
(XEN)    ffff82c48029ff18 600000017e26b067 6000000203243267 60000002279be467
(XEN)    0000000000000100 0000000000000000 ffff8302278ee820 000000000000a040
(XEN)    ffff82c48029faf4 ffff82c48016a4bd 0000000000000000 ffff82c4801d6666
(XEN)    0000000000000000 ffff82c48029ff18 0000002000000020 ffff82c48029faf4
(XEN)    ffff8302278ee820 ffff82c48029fa70 000000000000a040 000000000005fb0b
(XEN)    ffff82c48029fbec 0000000000000000 ffff8000002fd858 ffff8302278ee820
(XEN)    0000000000000006 ffff82c4801dbec3 ffff83022795fad0 ffff82c400000001
(XEN)    0000000000000001 000000005fb0b000 000000000000a040 806000000010b000
(XEN)    6000000172210267 60000002015bd467 ffff83017e26b000 0000000000000001
(XEN)    ffff8302278ee820 000000000005fb0b ffff82c48029fbec ffff82c48029fbf4
(XEN)    0000000000000000 ffff82c4801d6666 0000000000000000 0000000000000000
(XEN)    0000000000001e00 ffff83022795f000 ffff8300d7d10000 000000000005fb0b
(XEN)    ffff82c48029ff18 0000000080000b0e ffff8300d7d10000 ffff82c4801fa23f
(XEN)    ffff830000000001 ffff83022795f000 0000000000000008 0000000000001e00
(XEN)    00007d0a00000006 00000000b9fb2000 000000000003fae9 ffff83022795fb40
(XEN)    000000000017ecb9 00000000000b9fb2 ffff82c4802e9c60 ffff83022795fad0
(XEN)    ffff8300d7d10920 0000000000000060 ffff82c48029ff18 0000000000000002
(XEN)    0000000000000e78 0000000000000000 00000000d7d10000 0000000000000d90
(XEN)    0000000000000000 ffff82c4801cdda8 00000004fffe0080 0000000700000000
(XEN) Xen call trace:
(XEN)    [<ffff82c4801e15fd>] p2m_pod_demand_populate+0x87d/0x8a0
(XEN)    [<ffff82c48016a4bd>] get_page+0x2d/0x100
(XEN)    [<ffff82c4801d6666>] __get_gfn_type_access+0x86/0x260
(XEN)    [<ffff82c4801dbec3>] p2m_gfn_to_mfn+0x693/0x810
(XEN)    [<ffff82c4801d6666>] __get_gfn_type_access+0x86/0x260
(XEN)    [<ffff82c4801fa23f>] sh_page_fault__guest_3+0x24f/0x1e40
(XEN)    [<ffff82c4801cdda8>] vmx_update_guest_cr+0x78/0x5d0
(XEN)    [<ffff82c4801ae2da>] hvm_set_cr0+0x2ea/0x480
(XEN)    [<ffff82c4801b2bb4>] hvm_mov_to_cr+0xe4/0x1a0
(XEN)    [<ffff82c4801cfa63>] vmx_vmexit_handler+0xd33/0x1790
(XEN)    [<ffff82c4801cafb5>] vmx_do_resume+0xb5/0x170
(XEN)    [<ffff82c48015968c>] context_switch+0x15c/0xdf0
(XEN)    [<ffff82c480125d7b>] add_entry+0x4b/0xb0
(XEN)    [<ffff82c480125d7b>] add_entry+0x4b/0xb0
(XEN)    [<ffff82c4801bf3c7>] pt_update_irq+0x27/0x200
(XEN)    [<ffff82c480119830>] csched_tick+0x0/0x2e0
(XEN)    [<ffff82c4801bd5a1>] vlapic_has_pending_irq+0x21/0x60
(XEN)    [<ffff82c4801b5fca>] hvm_vcpu_has_pending_irq+0x4a/0x90
(XEN)    [<ffff82c4801c85c4>] vmx_intr_assist+0x54/0x290
(XEN)    [<ffff82c4801d2911>] nvmx_switch_guest+0x51/0x6c0
(XEN)    [<ffff82c4801d4256>] vmx_asm_do_vmentry+0x0/0xea
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Xen BUG at mm-locks.h:118
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...


Any suggestions ?

It is very reproducible and it's on a test machine I can reboot any
time, so if you need more debug info, I can collect it.
I don't have any different hw to test on unfortunately.


Cheers,

    Sylvain

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Xen BUG at mm-locks.h:118 in 4.2.1 - mm locking order violation - Dom0 reboot
  2013-02-18 10:47 Xen BUG at mm-locks.h:118 in 4.2.1 - mm locking order violation - Dom0 reboot Sylvain Munaut
@ 2013-02-18 11:05 ` Jan Beulich
  2013-02-18 11:09 ` Andrew Cooper
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 12+ messages in thread
From: Jan Beulich @ 2013-02-18 11:05 UTC (permalink / raw)
  To: Sylvain Munaut; +Cc: xen-devel

>>> On 18.02.13 at 11:47, Sylvain Munaut <s.munaut@whatever-company.com> wrote:
> It is very reproducible and it's on a test machine I can reboot any
> time, so if you need more debug info, I can collect it.
> I don't have any different hw to test on unfortunately.

Minimally you will want to let us know at what changeset you
cloned your tree.

Jan


* Re: Xen BUG at mm-locks.h:118 in 4.2.1 - mm locking order violation - Dom0 reboot
  2013-02-18 10:47 Xen BUG at mm-locks.h:118 in 4.2.1 - mm locking order violation - Dom0 reboot Sylvain Munaut
  2013-02-18 11:05 ` Jan Beulich
@ 2013-02-18 11:09 ` Andrew Cooper
  2013-02-18 11:13 ` Ian Campbell
  2013-02-18 11:35 ` Tim Deegan
  3 siblings, 0 replies; 12+ messages in thread
From: Andrew Cooper @ 2013-02-18 11:09 UTC (permalink / raw)
  To: Sylvain Munaut; +Cc: xen-devel

On 18/02/13 10:47, Sylvain Munaut wrote:
> Hi,
>
>
> I've just installed a self-built Xen 4.2.1 package on a debian wheezy
> and when trying to run a HVM VM (that I was previously running with
> the official xen 4.0 package on squeeze), it starts fine and I can
> even use the VM for a few minutes then suddenly I loose all
> communication with VM and the Dom0 and it just reboots ...
>
> I enabled the xen serial console and this is what I got when the crash happens:
>
>
> (XEN) mm locking order violation: 260 > 222
> (XEN) Xen BUG at mm-locks.h:118
> (XEN) ----[ Xen-4.2.1  x86_64  debug=n  Not tainted ]----
> (XEN) CPU:    0
> (XEN) RIP:    e008:[<ffff82c4801e15fd>] p2m_pod_demand_populate+0x87d/0x8a0
> (XEN) RFLAGS: 0000000000010296   CONTEXT: hypervisor
> (XEN) rax: ffff82c4802e8e20   rbx: ffff8302278ee820   rcx: 0000000000000000
> (XEN) rdx: ffff82c48029ff18   rsi: 000000000000000a   rdi: ffff82c480258640
> (XEN) rbp: 0000000000000000   rsp: ffff82c48029f978   r8:  0000000000000004
> (XEN) r9:  0000000000000003   r10: 0000000000000002   r11: ffff82c4802c8c80
> (XEN) r12: 0000000000000000   r13: ffff83022795f000   r14: 000000000005f70a
> (XEN) r15: 000000000005fb0a   cr0: 0000000080050033   cr4: 00000000000026f0
> (XEN) cr3: 00000002277ac000   cr2: 00000000d8b86058
> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
> (XEN) Xen stack trace from rsp=ffff82c48029f978:
> (XEN)    000000000017e26b 000000000005fb0b ffff8302278eed08 000000010a040000
> (XEN)    ffff82c48029ff18 600000017e26b067 6000000203243267 60000002279be467
> (XEN)    0000000000000100 0000000000000000 ffff8302278ee820 000000000000a040
> (XEN)    ffff82c48029faf4 ffff82c48016a4bd 0000000000000000 ffff82c4801d6666
> (XEN)    0000000000000000 ffff82c48029ff18 0000002000000020 ffff82c48029faf4
> (XEN)    ffff8302278ee820 ffff82c48029fa70 000000000000a040 000000000005fb0b
> (XEN)    ffff82c48029fbec 0000000000000000 ffff8000002fd858 ffff8302278ee820
> (XEN)    0000000000000006 ffff82c4801dbec3 ffff83022795fad0 ffff82c400000001
> (XEN)    0000000000000001 000000005fb0b000 000000000000a040 806000000010b000
> (XEN)    6000000172210267 60000002015bd467 ffff83017e26b000 0000000000000001
> (XEN)    ffff8302278ee820 000000000005fb0b ffff82c48029fbec ffff82c48029fbf4
> (XEN)    0000000000000000 ffff82c4801d6666 0000000000000000 0000000000000000
> (XEN)    0000000000001e00 ffff83022795f000 ffff8300d7d10000 000000000005fb0b
> (XEN)    ffff82c48029ff18 0000000080000b0e ffff8300d7d10000 ffff82c4801fa23f
> (XEN)    ffff830000000001 ffff83022795f000 0000000000000008 0000000000001e00
> (XEN)    00007d0a00000006 00000000b9fb2000 000000000003fae9 ffff83022795fb40
> (XEN)    000000000017ecb9 00000000000b9fb2 ffff82c4802e9c60 ffff83022795fad0
> (XEN)    ffff8300d7d10920 0000000000000060 ffff82c48029ff18 0000000000000002
> (XEN)    0000000000000e78 0000000000000000 00000000d7d10000 0000000000000d90
> (XEN)    0000000000000000 ffff82c4801cdda8 00000004fffe0080 0000000700000000
> (XEN) Xen call trace:
> (XEN)    [<ffff82c4801e15fd>] p2m_pod_demand_populate+0x87d/0x8a0
> (XEN)    [<ffff82c48016a4bd>] get_page+0x2d/0x100
> (XEN)    [<ffff82c4801d6666>] __get_gfn_type_access+0x86/0x260
> (XEN)    [<ffff82c4801dbec3>] p2m_gfn_to_mfn+0x693/0x810
> (XEN)    [<ffff82c4801d6666>] __get_gfn_type_access+0x86/0x260
> (XEN)    [<ffff82c4801fa23f>] sh_page_fault__guest_3+0x24f/0x1e40
> (XEN)    [<ffff82c4801cdda8>] vmx_update_guest_cr+0x78/0x5d0
> (XEN)    [<ffff82c4801ae2da>] hvm_set_cr0+0x2ea/0x480
> (XEN)    [<ffff82c4801b2bb4>] hvm_mov_to_cr+0xe4/0x1a0
> (XEN)    [<ffff82c4801cfa63>] vmx_vmexit_handler+0xd33/0x1790
> (XEN)    [<ffff82c4801cafb5>] vmx_do_resume+0xb5/0x170
> (XEN)    [<ffff82c48015968c>] context_switch+0x15c/0xdf0
> (XEN)    [<ffff82c480125d7b>] add_entry+0x4b/0xb0
> (XEN)    [<ffff82c480125d7b>] add_entry+0x4b/0xb0
> (XEN)    [<ffff82c4801bf3c7>] pt_update_irq+0x27/0x200
> (XEN)    [<ffff82c480119830>] csched_tick+0x0/0x2e0
> (XEN)    [<ffff82c4801bd5a1>] vlapic_has_pending_irq+0x21/0x60
> (XEN)    [<ffff82c4801b5fca>] hvm_vcpu_has_pending_irq+0x4a/0x90
> (XEN)    [<ffff82c4801c85c4>] vmx_intr_assist+0x54/0x290
> (XEN)    [<ffff82c4801d2911>] nvmx_switch_guest+0x51/0x6c0
> (XEN)    [<ffff82c4801d4256>] vmx_asm_do_vmentry+0x0/0xea
> (XEN)
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 0:
> (XEN) Xen BUG at mm-locks.h:118
> (XEN) ****************************************
> (XEN)
> (XEN) Reboot in five seconds...
>
>
> Any suggestions ?
>
> It is very reproducible and it's on a test machine I can reboot any
> time, so if you need more debug info, I can collect it.
> I don't have any different hw to test on unfortunately.
>
>
> Cheers,
>
>     Sylvain

From the stack trace, I assume that the guest is running in shadow
mode? Can you confirm this?

~Andrew

>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel


* Re: Xen BUG at mm-locks.h:118 in 4.2.1 - mm locking order violation - Dom0 reboot
  2013-02-18 10:47 Xen BUG at mm-locks.h:118 in 4.2.1 - mm locking order violation - Dom0 reboot Sylvain Munaut
  2013-02-18 11:05 ` Jan Beulich
  2013-02-18 11:09 ` Andrew Cooper
@ 2013-02-18 11:13 ` Ian Campbell
  2013-02-18 11:35 ` Tim Deegan
  3 siblings, 0 replies; 12+ messages in thread
From: Ian Campbell @ 2013-02-18 11:13 UTC (permalink / raw)
  To: Sylvain Munaut; +Cc: George Dunlap, Tim Deegan, xen-devel

On Mon, 2013-02-18 at 10:47 +0000, Sylvain Munaut wrote:
> Hi,
> 
> 
> I've just installed a self-built Xen 4.2.1 package on a debian wheezy

Is this exactly 4.2.1, some later revision from 4.2-testing, or
otherwise patched? Can you let us know the commit id.

> and when trying to run a HVM VM (that I was previously running with
> the official xen 4.0 package on squeeze), it starts fine and I can
> even use the VM for a few minutes then suddenly I loose all
> communication with VM and the Dom0 and it just reboots ...

Please can you share the domain configuration. Are you running PV
drivers (esp. ballooning) within it?

> I enabled the xen serial console and this is what I got when the crash happens:
> 
> 
> (XEN) mm locking order violation: 260 > 222

260 == pod lock, 222 is the p2m lock. I've CCd George and Tim.

> (XEN) Xen BUG at mm-locks.h:118
> (XEN) ----[ Xen-4.2.1  x86_64  debug=n  Not tainted ]----
> (XEN) CPU:    0
> (XEN) RIP:    e008:[<ffff82c4801e15fd>] p2m_pod_demand_populate+0x87d/0x8a0
> (XEN) RFLAGS: 0000000000010296   CONTEXT: hypervisor
> (XEN) rax: ffff82c4802e8e20   rbx: ffff8302278ee820   rcx: 0000000000000000
> (XEN) rdx: ffff82c48029ff18   rsi: 000000000000000a   rdi: ffff82c480258640
> (XEN) rbp: 0000000000000000   rsp: ffff82c48029f978   r8:  0000000000000004
> (XEN) r9:  0000000000000003   r10: 0000000000000002   r11: ffff82c4802c8c80
> (XEN) r12: 0000000000000000   r13: ffff83022795f000   r14: 000000000005f70a
> (XEN) r15: 000000000005fb0a   cr0: 0000000080050033   cr4: 00000000000026f0
> (XEN) cr3: 00000002277ac000   cr2: 00000000d8b86058
> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
> (XEN) Xen stack trace from rsp=ffff82c48029f978:
> (XEN)    000000000017e26b 000000000005fb0b ffff8302278eed08 000000010a040000
> (XEN)    ffff82c48029ff18 600000017e26b067 6000000203243267 60000002279be467
> (XEN)    0000000000000100 0000000000000000 ffff8302278ee820 000000000000a040
> (XEN)    ffff82c48029faf4 ffff82c48016a4bd 0000000000000000 ffff82c4801d6666
> (XEN)    0000000000000000 ffff82c48029ff18 0000002000000020 ffff82c48029faf4
> (XEN)    ffff8302278ee820 ffff82c48029fa70 000000000000a040 000000000005fb0b
> (XEN)    ffff82c48029fbec 0000000000000000 ffff8000002fd858 ffff8302278ee820
> (XEN)    0000000000000006 ffff82c4801dbec3 ffff83022795fad0 ffff82c400000001
> (XEN)    0000000000000001 000000005fb0b000 000000000000a040 806000000010b000
> (XEN)    6000000172210267 60000002015bd467 ffff83017e26b000 0000000000000001
> (XEN)    ffff8302278ee820 000000000005fb0b ffff82c48029fbec ffff82c48029fbf4
> (XEN)    0000000000000000 ffff82c4801d6666 0000000000000000 0000000000000000
> (XEN)    0000000000001e00 ffff83022795f000 ffff8300d7d10000 000000000005fb0b
> (XEN)    ffff82c48029ff18 0000000080000b0e ffff8300d7d10000 ffff82c4801fa23f
> (XEN)    ffff830000000001 ffff83022795f000 0000000000000008 0000000000001e00
> (XEN)    00007d0a00000006 00000000b9fb2000 000000000003fae9 ffff83022795fb40
> (XEN)    000000000017ecb9 00000000000b9fb2 ffff82c4802e9c60 ffff83022795fad0
> (XEN)    ffff8300d7d10920 0000000000000060 ffff82c48029ff18 0000000000000002
> (XEN)    0000000000000e78 0000000000000000 00000000d7d10000 0000000000000d90
> (XEN)    0000000000000000 ffff82c4801cdda8 00000004fffe0080 0000000700000000
> (XEN) Xen call trace:
> (XEN)    [<ffff82c4801e15fd>] p2m_pod_demand_populate+0x87d/0x8a0
> (XEN)    [<ffff82c48016a4bd>] get_page+0x2d/0x100
> (XEN)    [<ffff82c4801d6666>] __get_gfn_type_access+0x86/0x260
> (XEN)    [<ffff82c4801dbec3>] p2m_gfn_to_mfn+0x693/0x810
> (XEN)    [<ffff82c4801d6666>] __get_gfn_type_access+0x86/0x260
> (XEN)    [<ffff82c4801fa23f>] sh_page_fault__guest_3+0x24f/0x1e40
> (XEN)    [<ffff82c4801cdda8>] vmx_update_guest_cr+0x78/0x5d0
> (XEN)    [<ffff82c4801ae2da>] hvm_set_cr0+0x2ea/0x480
> (XEN)    [<ffff82c4801b2bb4>] hvm_mov_to_cr+0xe4/0x1a0
> (XEN)    [<ffff82c4801cfa63>] vmx_vmexit_handler+0xd33/0x1790
> (XEN)    [<ffff82c4801cafb5>] vmx_do_resume+0xb5/0x170
> (XEN)    [<ffff82c48015968c>] context_switch+0x15c/0xdf0
> (XEN)    [<ffff82c480125d7b>] add_entry+0x4b/0xb0
> (XEN)    [<ffff82c480125d7b>] add_entry+0x4b/0xb0
> (XEN)    [<ffff82c4801bf3c7>] pt_update_irq+0x27/0x200
> (XEN)    [<ffff82c480119830>] csched_tick+0x0/0x2e0
> (XEN)    [<ffff82c4801bd5a1>] vlapic_has_pending_irq+0x21/0x60
> (XEN)    [<ffff82c4801b5fca>] hvm_vcpu_has_pending_irq+0x4a/0x90
> (XEN)    [<ffff82c4801c85c4>] vmx_intr_assist+0x54/0x290
> (XEN)    [<ffff82c4801d2911>] nvmx_switch_guest+0x51/0x6c0
> (XEN)    [<ffff82c4801d4256>] vmx_asm_do_vmentry+0x0/0xea
> (XEN)
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 0:
> (XEN) Xen BUG at mm-locks.h:118
> (XEN) ****************************************
> (XEN)
> (XEN) Reboot in five seconds...
> 
> 
> Any suggestions ?
> 
> It is very reproducible and it's on a test machine I can reboot any
> time, so if you need more debug info, I can collect it.
> I don't have any different hw to test on unfortunately.
> 
> 
> Cheers,
> 
>     Sylvain
> 

* Re: Xen BUG at mm-locks.h:118 in 4.2.1 - mm locking order violation - Dom0 reboot
  2013-02-18 10:47 Xen BUG at mm-locks.h:118 in 4.2.1 - mm locking order violation - Dom0 reboot Sylvain Munaut
                   ` (2 preceding siblings ...)
  2013-02-18 11:13 ` Ian Campbell
@ 2013-02-18 11:35 ` Tim Deegan
  2013-02-18 13:17   ` Sylvain Munaut
  2013-02-18 14:47   ` Sylvain Munaut
  3 siblings, 2 replies; 12+ messages in thread
From: Tim Deegan @ 2013-02-18 11:35 UTC (permalink / raw)
  To: Sylvain Munaut; +Cc: xen-devel

[-- Attachment #1: Type: text/plain, Size: 2288 bytes --]

Hi, 

Thanks for the report. 

At 11:47 +0100 on 18 Feb (1361188042), Sylvain Munaut wrote:
> I've just installed a self-built Xen 4.2.1 package on a debian wheezy
> and when trying to run a HVM VM (that I was previously running with
> the official xen 4.0 package on squeeze), it starts fine and I can
> even use the VM for a few minutes then suddenly I loose all
> communication with VM and the Dom0 and it just reboots ...

Did you make any changes to Xen before you built it, or were you just
building your own to get 4.2?

> (XEN) mm locking order violation: 260 > 222
> (XEN) Xen BUG at mm-locks.h:118

Hmm, taking the p2m lock with the pod lock held. :( My guess would be
the p2m_lock() in p2m_pod_emergency_sweep().

Do you by any chance have the xen-syms file from when you built Xen?
That would let us see exactly what's happened.

In the meantime, perhaps you could try the attached (untested) patch.
If my guess is right, it ought to stop the crashes but you might find
the VM's performance suffers.

Cheers,

Tim.

> (XEN)    [<ffff82c4801e15fd>] p2m_pod_demand_populate+0x87d/0x8a0
> (XEN)    [<ffff82c48016a4bd>] get_page+0x2d/0x100
> (XEN)    [<ffff82c4801d6666>] __get_gfn_type_access+0x86/0x260
> (XEN)    [<ffff82c4801dbec3>] p2m_gfn_to_mfn+0x693/0x810
> (XEN)    [<ffff82c4801d6666>] __get_gfn_type_access+0x86/0x260
> (XEN)    [<ffff82c4801fa23f>] sh_page_fault__guest_3+0x24f/0x1e40
> (XEN)    [<ffff82c4801cdda8>] vmx_update_guest_cr+0x78/0x5d0
> (XEN)    [<ffff82c4801ae2da>] hvm_set_cr0+0x2ea/0x480
> (XEN)    [<ffff82c4801b2bb4>] hvm_mov_to_cr+0xe4/0x1a0
> (XEN)    [<ffff82c4801cfa63>] vmx_vmexit_handler+0xd33/0x1790
> (XEN)    [<ffff82c4801cafb5>] vmx_do_resume+0xb5/0x170
> (XEN)    [<ffff82c48015968c>] context_switch+0x15c/0xdf0
> (XEN)    [<ffff82c480125d7b>] add_entry+0x4b/0xb0
> (XEN)    [<ffff82c480125d7b>] add_entry+0x4b/0xb0
> (XEN)    [<ffff82c4801bf3c7>] pt_update_irq+0x27/0x200
> (XEN)    [<ffff82c480119830>] csched_tick+0x0/0x2e0
> (XEN)    [<ffff82c4801bd5a1>] vlapic_has_pending_irq+0x21/0x60
> (XEN)    [<ffff82c4801b5fca>] hvm_vcpu_has_pending_irq+0x4a/0x90
> (XEN)    [<ffff82c4801c85c4>] vmx_intr_assist+0x54/0x290
> (XEN)    [<ffff82c4801d2911>] nvmx_switch_guest+0x51/0x6c0
> (XEN)    [<ffff82c4801d4256>] vmx_asm_do_vmentry+0x0/0xea

[-- Attachment #2: x --]
[-- Type: text/plain, Size: 794 bytes --]

diff -r 4efc7f87d749 xen/arch/x86/mm/p2m.c
--- a/xen/arch/x86/mm/p2m.c	Thu Feb 14 17:07:41 2013 +0000
+++ b/xen/arch/x86/mm/p2m.c	Mon Feb 18 11:32:44 2013 +0000
@@ -219,7 +219,7 @@ mfn_t __get_gfn_type_access(struct p2m_d
     }
 
     /* For now only perform locking on hap domains */
-    if ( locked && (hap_enabled(p2m->domain)) )
+    if ( locked )
         /* Grab the lock here, don't release until put_gfn */
         gfn_lock(p2m, gfn, 0);
 
@@ -248,8 +248,7 @@ mfn_t __get_gfn_type_access(struct p2m_d
 
 void __put_gfn(struct p2m_domain *p2m, unsigned long gfn)
 {
-    if ( !p2m || !paging_mode_translate(p2m->domain) 
-              || !hap_enabled(p2m->domain) )
+    if ( !p2m || !paging_mode_translate(p2m->domain) )
         /* Nothing to do in this case */
         return;
 



* Re: Xen BUG at mm-locks.h:118 in 4.2.1 - mm locking order violation - Dom0 reboot
  2013-02-18 11:35 ` Tim Deegan
@ 2013-02-18 13:17   ` Sylvain Munaut
  2013-02-18 15:02     ` Ian Campbell
  2013-02-18 14:47   ` Sylvain Munaut
  1 sibling, 1 reply; 12+ messages in thread
From: Sylvain Munaut @ 2013-02-18 13:17 UTC (permalink / raw)
  To: Tim Deegan, Jan Beulich, Ian Campbell, andrew.cooper3; +Cc: xen-devel

Hi all,

Thanks for the feedback, let me try to answer the various questions.


> Did you make any changes to Xen before you built it, or were you just
> building your own to get 4.2?

It's based on the official .tar.bz2 from the website, and built using
the debian/ directory from the 4.2.0 Debian package. There are some
patches applied in the Debian build, but I don't see any that patch the
actual code, just small adaptations to the build and install system to
follow the Debian conventions.

So it should be functionally equivalent to an official 4.2.1. I can
put the compiled binary online if needed.


> Please can you share the domain configuration. Are you running PV
> drivers (esp. ballooning) within it?

There is no xen driver running in there.

Here's the config which is based on the example hvm config:

---------
builder = "hvm"
name   = "wxp-00"
vcpus = 2
memory = 1536
maxmem = 2048
viridian = 1
vif = [ 'type=ioemu,bridge=br0,mac=00:16:3e:35:ad:12' ]
disk = [ '/dev/xen-disks/wxp-00-test,raw,xvda,w', ]
on_poweroff = 'destroy'
on_reboot   = 'restart'
on_crash    = 'restart'
vnc=1
vncunused=0
vnclisten = '0.0.0.0'
vncdisplay=0
vncconsole=1
vncpasswd='xxx'
--------


> Do you by any chance have the xen-syms file from when you built Xen?
> That would let us see exactly what's happened.

You can get it at http://ge.tt/7DBEjmY/v/0


> In the meantime, perhaps you could try the attached (untested) patch.
> If my guess is right, it ought to stop the crashes but you might find
> the VM's performance suffers.

I'll try it and report here.


> From the stack trace, I assume that the guest is running in shadow mode
> ?  Can you confirm this?

Sorry, no idea what this means. How can I check / test? I didn't
configure anything related to "shadow mode", at least.


Cheers,

    Sylvain


* Re: Xen BUG at mm-locks.h:118 in 4.2.1 - mm locking order violation - Dom0 reboot
  2013-02-18 11:35 ` Tim Deegan
  2013-02-18 13:17   ` Sylvain Munaut
@ 2013-02-18 14:47   ` Sylvain Munaut
  2013-02-21 15:25     ` Tim Deegan
  1 sibling, 1 reply; 12+ messages in thread
From: Sylvain Munaut @ 2013-02-18 14:47 UTC (permalink / raw)
  To: Tim Deegan; +Cc: xen-devel

Hi,


> In the meantime, perhaps you could try the attached (untested) patch.
> If my guess is right, it ought to stop the crashes but you might find
> the VM's performance suffers.

The patch seems to have fixed this issue.

I did, however, encounter a:

(XEN) p2m_pod_demand_populate: Dom1 out of PoD memory! (tot=392185
ents=131072 dom0)
(XEN) domain_crash called from p2m-pod.c:1077
(XEN) Domain 1 reported crashed by domain 0 on cpu#1:

The domU then rebooted and it doesn't seem to happen again, but since
it's related to PoD, it might be related to that same issue ...


Cheers,

    Sylvain


* Re: Xen BUG at mm-locks.h:118 in 4.2.1 - mm locking order violation - Dom0 reboot
  2013-02-18 13:17   ` Sylvain Munaut
@ 2013-02-18 15:02     ` Ian Campbell
  2013-02-18 16:31       ` Sylvain Munaut
  0 siblings, 1 reply; 12+ messages in thread
From: Ian Campbell @ 2013-02-18 15:02 UTC (permalink / raw)
  To: Sylvain Munaut; +Cc: Andrew Cooper, Tim (Xen.org), Jan Beulich, xen-devel

On Mon, 2013-02-18 at 13:17 +0000, Sylvain Munaut wrote:

> > Please can you share the domain configuration. Are you running PV
> > drivers (esp. ballooning) within it?
> 
> There is no xen driver running in there.
> 
> Here's the config which is based on the example hvm config:
> 
> ---------
> builder = "hvm"
> name   = "wxp-00"
> vcpus = 2
> memory = 1536
> maxmem = 2048


This is the cause of your second "Dom1 out of PoD memory" bug.

If you aren't running at least a balloon driver inside the guest then
this isn't valid, since you have requested a different initial memory
allocation to what you are actually giving the guest and something needs
to bridge that gap. Initially this is PoD but eventually a balloon
driver must come along, PoD is not intended for use other than during
boot until a balloon driver can be started.

Ian.
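[Editor's note: concretely, that leaves two valid configurations. The
fragment below is a hypothetical sketch in the same config format
quoted earlier in the thread, not text from the original mail.]

```
# Option 1: no PV/balloon driver in the guest.
# memory == maxmem means PoD is never activated.
memory = 2048
maxmem = 2048

# Option 2: keep memory < maxmem, valid only if the guest runs a
# balloon driver that can hand the 512 MiB gap back after boot.
memory = 1536
maxmem = 2048
```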


* Re: Xen BUG at mm-locks.h:118 in 4.2.1 - mm locking order violation - Dom0 reboot
  2013-02-18 15:02     ` Ian Campbell
@ 2013-02-18 16:31       ` Sylvain Munaut
  2013-02-18 16:42         ` Ian Campbell
  0 siblings, 1 reply; 12+ messages in thread
From: Sylvain Munaut @ 2013-02-18 16:31 UTC (permalink / raw)
  To: Ian Campbell; +Cc: Andrew Cooper, Tim (Xen.org), Jan Beulich, xen-devel

Hi Ian,

>> memory = 1536
>> maxmem = 2048
>
>
> This is the cause of your second "Dom1 out of PoD memory" bug.
>
> If you aren't running at least a balloon driver inside the guest then
> this isn't valid, since you have requested a different initial memory
> allocation to what you are actually giving the guest and something needs
> to bridge that gap. Initially this is PoD but eventually a balloon
> driver must come along, PoD is not intended for use other than during
> boot until a balloon driver can be started.

Indeed this fixed it.

Interestingly, it seems to avoid the first issue as well ... I guess
this is why nobody hit that before. Although hard rebooting the Dom0
might be a severe punishment for a config mistake :p

Cheers & thanks to all.

    Sylvain


* Re: Xen BUG at mm-locks.h:118 in 4.2.1 - mm locking order violation - Dom0 reboot
  2013-02-18 16:31       ` Sylvain Munaut
@ 2013-02-18 16:42         ` Ian Campbell
  0 siblings, 0 replies; 12+ messages in thread
From: Ian Campbell @ 2013-02-18 16:42 UTC (permalink / raw)
  To: Sylvain Munaut; +Cc: Andrew Cooper, Tim (Xen.org), Jan Beulich, xen-devel

On Mon, 2013-02-18 at 16:31 +0000, Sylvain Munaut wrote:
> Hi Ian,
> 
> >> memory = 1536
> >> maxmem = 2048
> >
> >
> > This is the cause of your second "Dom1 out of PoD memory" bug.
> >
> > If you aren't running at least a balloon driver inside the guest then
> > this isn't valid, since you have requested a different initial memory
> > allocation to what you are actually giving the guest and something needs
> > to bridge that gap. Initially this is PoD but eventually a balloon
> > driver must come along, PoD is not intended for use other than during
> > boot until a balloon driver can be started.
> 
> Indeed this fixed it.

Good.

> Interestingly, it seems to avoid the first issue as well ...

If memory == maxmem then PoD (the crashing subsystem) is never activated
so that is to be expected.

Ian.


* Re: Xen BUG at mm-locks.h:118 in 4.2.1 - mm locking order violation - Dom0 reboot
  2013-02-18 14:47   ` Sylvain Munaut
@ 2013-02-21 15:25     ` Tim Deegan
  0 siblings, 0 replies; 12+ messages in thread
From: Tim Deegan @ 2013-02-21 15:25 UTC (permalink / raw)
  To: Sylvain Munaut; +Cc: xen-devel

Hi, 

At 15:47 +0100 on 18 Feb (1361202472), Sylvain Munaut wrote:
> > In the meantime, perhaps you could try the attached (untested) patch.
> > If my guess is right, it ought to stop the crashes but you might find
> > the VM's performance suffers.
> 
> The patch seems to have fixed this issue.

Excellent, thanks.  I've just applied it to xen-unstable.

Tim.


* Re: Xen BUG at mm-locks.h:118 in 4.2.1 - mm locking order violation - Dom0 reboot
       [not found] <mailman.24059.1361185990.1399.xen-devel@lists.xen.org>
@ 2013-02-18 14:27 ` Andres Lagar-Cavilla
  0 siblings, 0 replies; 12+ messages in thread
From: Andres Lagar-Cavilla @ 2013-02-18 14:27 UTC (permalink / raw)
  To: xen-devel
  Cc: Ian Campbell, George Dunlap, Andrew Cooper, Sylvain Munaut,
	Tim (Xen.org),
	Jan Beulich

>> Hi,
>> 
>> 
>> I've just installed a self-built Xen 4.2.1 package on a debian wheezy
> 
> Is this exactly 4.2.1, some later revision from 4.2-testing or otherwise
> patches? Can you let us know the comit id.
> 
>> and when trying to run a HVM VM (that I was previously running with
>> the official xen 4.0 package on squeeze), it starts fine and I can
>> even use the VM for a few minutes then suddenly I loose all
>> communication with VM and the Dom0 and it just reboots ...
> 
> Please can you share the domain configuration. Are you running PV
> drivers (esp. ballooning) within it?
> 
>> I enabled the xen serial console and this is what I got when the crash happens:
>> 
>> 
>> (XEN) mm locking order violation: 260 > 222
> 
> 260 == pod lock, 222 is the p2m lock. I've CCd George and Tim.

It's a bad locking interaction between shadow and PoD, introduced in 4.2. The one-line fix is to turn on locking p2m for shadow, as well. But we need to make sure that doing that doesn't introduce other regressions in shadow.

Andres
> 
>> (XEN) Xen BUG at mm-locks.h:118
>> (XEN) ----[ Xen-4.2.1  x86_64  debug=n  Not tainted ]----
>> (XEN) CPU:    0
>> (XEN) RIP:    e008:[<ffff82c4801e15fd>] p2m_pod_demand_populate+0x87d/0x8a0
>> (XEN) RFLAGS: 0000000000010296   CONTEXT: hypervisor
>> (XEN) rax: ffff82c4802e8e20   rbx: ffff8302278ee820   rcx: 0000000000000000
>> (XEN) rdx: ffff82c48029ff18   rsi: 000000000000000a   rdi: ffff82c480258640
>> (XEN) rbp: 0000000000000000   rsp: ffff82c48029f978   r8:  0000000000000004
>> (XEN) r9:  0000000000000003   r10: 0000000000000002   r11: ffff82c4802c8c80
>> (XEN) r12: 0000000000000000   r13: ffff83022795f000   r14: 000000000005f70a
>> (XEN) r15: 000000000005fb0a   cr0: 0000000080050033   cr4: 00000000000026f0
>> (XEN) cr3: 00000002277ac000   cr2: 00000000d8b86058
>> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
>> (XEN) Xen stack trace from rsp=ffff82c48029f978:
>> (XEN)    000000000017e26b 000000000005fb0b ffff8302278eed08 000000010a040000
>> (XEN)    ffff82c48029ff18 600000017e26b067 6000000203243267 60000002279be467
>> (XEN)    0000000000000100 0000000000000000 ffff8302278ee820 000000000000a040
>> (XEN)    ffff82c48029faf4 ffff82c48016a4bd 0000000000000000 ffff82c4801d6666
>> (XEN)    0000000000000000 ffff82c48029ff18 0000002000000020 ffff82c48029faf4
>> (XEN)    ffff8302278ee820 ffff82c48029fa70 000000000000a040 000000000005fb0b
>> (XEN)    ffff82c48029fbec 0000000000000000 ffff8000002fd858 ffff8302278ee820
>> (XEN)    0000000000000006 ffff82c4801dbec3 ffff83022795fad0 ffff82c400000001
>> (XEN)    0000000000000001 000000005fb0b000 000000000000a040 806000000010b000
>> (XEN)    6000000172210267 60000002015bd467 ffff83017e26b000 0000000000000001
>> (XEN)    ffff8302278ee820 000000000005fb0b ffff82c48029fbec ffff82c48029fbf4
>> (XEN)    0000000000000000 ffff82c4801d6666 0000000000000000 0000000000000000
>> (XEN)    0000000000001e00 ffff83022795f000 ffff8300d7d10000 000000000005fb0b
>> (XEN)    ffff82c48029ff18 0000000080000b0e ffff8300d7d10000 ffff82c4801fa23f
>> (XEN)    ffff830000000001 ffff83022795f000 0000000000000008 0000000000001e00
>> (XEN)    00007d0a00000006 00000000b9fb2000 000000000003fae9 ffff83022795fb40
>> (XEN)    000000000017ecb9 00000000000b9fb2 ffff82c4802e9c60 ffff83022795fad0
>> (XEN)    ffff8300d7d10920 0000000000000060 ffff82c48029ff18 0000000000000002
>> (XEN)    0000000000000e78 0000000000000000 00000000d7d10000 0000000000000d90
>> (XEN)    0000000000000000 ffff82c4801cdda8 00000004fffe0080 0000000700000000
>> (XEN) Xen call trace:
>> (XEN)    [<ffff82c4801e15fd>] p2m_pod_demand_populate+0x87d/0x8a0
>> (XEN)    [<ffff82c48016a4bd>] get_page+0x2d/0x100
>> (XEN)    [<ffff82c4801d6666>] __get_gfn_type_access+0x86/0x260
>> (XEN)    [<ffff82c4801dbec3>] p2m_gfn_to_mfn+0x693/0x810
>> (XEN)    [<ffff82c4801d6666>] __get_gfn_type_access+0x86/0x260
>> (XEN)    [<ffff82c4801fa23f>] sh_page_fault__guest_3+0x24f/0x1e40
>> (XEN)    [<ffff82c4801cdda8>] vmx_update_guest_cr+0x78/0x5d0
>> (XEN)    [<ffff82c4801ae2da>] hvm_set_cr0+0x2ea/0x480
>> (XEN)    [<ffff82c4801b2bb4>] hvm_mov_to_cr+0xe4/0x1a0
>> (XEN)    [<ffff82c4801cfa63>] vmx_vmexit_handler+0xd33/0x1790
>> (XEN)    [<ffff82c4801cafb5>] vmx_do_resume+0xb5/0x170
>> (XEN)    [<ffff82c48015968c>] context_switch+0x15c/0xdf0
>> (XEN)    [<ffff82c480125d7b>] add_entry+0x4b/0xb0
>> (XEN)    [<ffff82c480125d7b>] add_entry+0x4b/0xb0
>> (XEN)    [<ffff82c4801bf3c7>] pt_update_irq+0x27/0x200
>> (XEN)    [<ffff82c480119830>] csched_tick+0x0/0x2e0
>> (XEN)    [<ffff82c4801bd5a1>] vlapic_has_pending_irq+0x21/0x60
>> (XEN)    [<ffff82c4801b5fca>] hvm_vcpu_has_pending_irq+0x4a/0x90
>> (XEN)    [<ffff82c4801c85c4>] vmx_intr_assist+0x54/0x290
>> (XEN)    [<ffff82c4801d2911>] nvmx_switch_guest+0x51/0x6c0
>> (XEN)    [<ffff82c4801d4256>] vmx_asm_do_vmentry+0x0/0xea
>> (XEN)
>> (XEN)
>> (XEN) ****************************************
>> (XEN) Panic on CPU 0:
>> (XEN) Xen BUG at mm-locks.h:118
>> (XEN) ****************************************
>> (XEN)
>> (XEN) Reboot in five seconds...
>> 
>> 
>> Any suggestions ?
>> 
>> It is very reproducible and it's on a test machine I can reboot any
>> time, so if you need more debug info, I can collect it.
>> I don't have any different hw to test on unfortunately.
>> 
>> 
>> Cheers,
>> 
>>    Sylvain
>> 


end of thread, other threads:[~2013-02-21 15:25 UTC | newest]

Thread overview: 12+ messages
2013-02-18 10:47 Xen BUG at mm-locks.h:118 in 4.2.1 - mm locking order violation - Dom0 reboot Sylvain Munaut
2013-02-18 11:05 ` Jan Beulich
2013-02-18 11:09 ` Andrew Cooper
2013-02-18 11:13 ` Ian Campbell
2013-02-18 11:35 ` Tim Deegan
2013-02-18 13:17   ` Sylvain Munaut
2013-02-18 15:02     ` Ian Campbell
2013-02-18 16:31       ` Sylvain Munaut
2013-02-18 16:42         ` Ian Campbell
2013-02-18 14:47   ` Sylvain Munaut
2013-02-21 15:25     ` Tim Deegan
     [not found] <mailman.24059.1361185990.1399.xen-devel@lists.xen.org>
2013-02-18 14:27 ` Andres Lagar-Cavilla
