* 2.6.29-rc7 pv_ops dom0 BUG / unable to handle kernel paging request
@ 2009-03-07 17:58 Pasi Kärkkäinen
  2009-03-08  6:09 ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 18+ messages in thread
From: Pasi Kärkkäinen @ 2009-03-07 17:58 UTC (permalink / raw)
  To: xen-devel; +Cc: Jeremy Fitzhardinge

Hello!

Latest git tree (updated some hours ago) boots up fine for me, xend can be started etc, 
but some time after starting kernel compilation I get the following BUG:

http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-19-xen331-linux-2.6.29-rc7-bug.txt

(XEN) mm.c:2006:d0 Bad type (saw 28000001 != exp e0000000) for mfn 436a3 (pfn 3d0a3)
(XEN) mm.c:707:d0 Error getting mfn 436a3 (pfn 3d0a3) from L1 entry 00000000436a3063 for dom0
(XEN) mm.c:3640:d0 ptwr_emulate: could not get_page_from_l1e()
BUG: unable to handle kernel paging request at c01cbd58
IP: [<c0405d2f>] xen_set_pte+0x8c/0x96
*pdpt = 000000003d1f0001 
Oops: 0003 [#1] SMP 
last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:06:01.0/class
Modules linked in: ipt_MASQUERADE iptable_nat nf_nat bridge stp bnep sco l2cap bluetooth sunrpc ipv6 dm_multipath uinput ppdev floppy i2c_i801 pcspkr iTCO_wdt i2c_core iTCO_vendor_support pata_pdc2027x sata_promise parport_pc parport tg3 libphy ata_generic pata_acpi [last unloaded: microcode]

Pid: 325, comm: kswapd0 Not tainted (2.6.29-rc7-tip #10) P8SC8
EIP: 0061:[<c0405d2f>] EFLAGS: 00010206 CPU: 0
EIP is at xen_set_pte+0x8c/0x96
EAX: c01cbd58 EBX: 000e133a ECX: 00000000 EDX: 00000000
ESI: 436a3063 EDI: 0003d0a3 EBP: e2152d8c ESP: e2152d78
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
Process kswapd0 (pid: 325, ti=e2152000 task=e219a610 task.ti=e2152000)
Stack:
 00000000 c01cbd58 3d0a3063 c01cbd58 0003d0a3 e2152db4 c0427ff5 00000000
 80000000 00000163 f57ff000 0000000f c19d3460 0000000f c1232000 e2152dcc
 c0404d91 00000163 80000000 436a3067 0a078000 e2152df4 c0489c17 00000000
Call Trace:
 [<c0427ff5>] ? kmap_atomic_prot+0x1cd/0x1ef
 [<c0404d91>] ? xen_kmap_atomic_pte+0x2f/0x36
 [<c0489c17>] ? page_check_address+0x7f/0x131
 [<c0489d71>] ? page_referenced_one+0x4b/0xe9
 [<c048aa4b>] ? page_referenced+0x7d/0xee
 [<c047b3f0>] ? shrink_active_list+0x10b/0x29c
 [<c0422bfe>] ? pvclock_clocksource_read+0x48/0xa3
 [<c0407410>] ? __xen_spin_lock+0xc4/0xd8
 [<c047c27b>] ? shrink_zone+0x285/0x29a
 [<c047c77a>] ? kswapd+0x3b6/0x53d
 [<c047aa07>] ? isolate_pages_global+0x0/0x19e
 [<c04480cf>] ? autoremove_wake_function+0x0/0x33
 [<c047c3c4>] ? kswapd+0x0/0x53d
 [<c0447e03>] ? kthread+0x3b/0x61
 [<c0447dc8>] ? kthread+0x0/0x61
 [<c04097c7>] ? kernel_thread_helper+0x7/0x10
Code: f3 ab c6 05 9c 33 8d c0 00 8b 1d fc 32 8d c0 e8 2d cb 01 00 8b 55 ec 48 0f 94 c0 0f b6 c0 01 d8 a3 fc 32 8d c0 8b 45 f0 89 50 04 <89> 30 8d 65 f4 5b 5e 5f 5d c3 55 89 e5 57 56 89 c6 53 89 d3 83
EIP: [<c0405d2f>] xen_set_pte+0x8c/0x96 SS:ESP 0069:e2152d78
CR2: 00000000c01cbd58
---[ end trace b993ac68a500f37e ]---
..
.. 

.. and a lot of other stuff after this.

-- Pasi
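
For readers following the log, the L1 entry that Xen rejected can be decoded by hand. A minimal Python sketch, assuming the standard x86 PAE page-table entry layout (frame number in bits 12 to 51, flag bits in the low bits); the name decode_pte and the flag table are illustrative and are not taken from the kernel or Xen sources.

```python
# Low flag bits of an x86 page-table entry (architectural layout).
PTE_FLAGS = {
    0: "present",
    1: "writable",
    2: "user",
    3: "write-through",
    4: "cache-disable",
    5: "accessed",
    6: "dirty",
}

PAGE_SHIFT = 12              # 4 KiB pages
FRAME_MASK = (1 << 40) - 1   # frame number occupies bits 12..51 of the entry


def decode_pte(pte):
    """Split a PTE value into (frame number, list of set flag names)."""
    frame = (pte >> PAGE_SHIFT) & FRAME_MASK
    flags = [name for bit, name in sorted(PTE_FLAGS.items()) if pte & (1 << bit)]
    return frame, flags


# The entry from the "(XEN) mm.c:707" line above.
frame, flags = decode_pte(0x436A3063)
print(hex(frame), flags)
```

The frame number comes out as 0x436a3, matching the "mfn 436a3" in the Xen messages, and the flags show that the entry asks for a present, writable mapping of that frame.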

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: 2.6.29-rc7 pv_ops dom0 BUG / unable to handle kernel paging request
  2009-03-07 17:58 2.6.29-rc7 pv_ops dom0 BUG / unable to handle kernel paging request Pasi Kärkkäinen
@ 2009-03-08  6:09 ` Jeremy Fitzhardinge
  2009-03-08 11:54   ` Pasi Kärkkäinen
  0 siblings, 1 reply; 18+ messages in thread
From: Jeremy Fitzhardinge @ 2009-03-08  6:09 UTC (permalink / raw)
  To: Pasi Kärkkäinen; +Cc: xen-devel

Pasi Kärkkäinen wrote:
> Hello!
>
> Latest git tree (updated some hours ago) boots up fine for me, xend can be started etc, 
> but some time after starting kernel compilation I get the following BUG:
>
> http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-19-xen331-linux-2.6.29-rc7-bug.txt
>
> (XEN) mm.c:2006:d0 Bad type (saw 28000001 != exp e0000000) for mfn 436a3 (pfn 3d0a3)
> (XEN) mm.c:707:d0 Error getting mfn 436a3 (pfn 3d0a3) from L1 entry 00000000436a3063 for dom0
> (XEN) mm.c:3640:d0 ptwr_emulate: could not get_page_from_l1e()
> BUG: unable to handle kernel paging request at c01cbd58
> IP: [<c0405d2f>] xen_set_pte+0x8c/0x96
>   

Well, that's bad news.  It was trying to map a highmem pte page which 
didn't have the Pinned bit set on its page, but Xen thought it was a 
pinned pte page.  Not sure how it could get into that state, but it's 
indicative of general memory corruption.

What was going on at the time?  Was dom0 busy?  Were you running some domUs?

    J
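
The "saw 28000001 != exp e0000000" values are Xen page type words. A rough Python sketch of how they might be split up, assuming a bit layout along the lines of the Xen 3.x x86 headers (type in bits 29 to 31, pinned in bit 28, validated in bit 27, use count in the low 16 bits). That layout is an assumption here and should be checked against the mm.h of the Xen version in use; the names below are illustrative.

```python
# ASSUMED layout of a Xen 3.x x86 page type word; verify against the
# hypervisor's own headers before relying on any of these positions.
TYPE_SHIFT = 29
TYPE_NAMES = {
    0: "none",
    1: "l1_page_table",
    2: "l2_page_table",
    3: "l3_page_table",
    4: "l4_page_table",
    7: "writable_page",
}
PINNED_BIT = 28
VALIDATED_BIT = 27
COUNT_MASK = 0xFFFF


def decode_type_info(word):
    """Break a page type word into its assumed fields."""
    type_id = (word >> TYPE_SHIFT) & 0x7
    return {
        "type": TYPE_NAMES.get(type_id, "other(%d)" % type_id),
        "pinned": bool(word & (1 << PINNED_BIT)),
        "validated": bool(word & (1 << VALIDATED_BIT)),
        "count": word & COUNT_MASK,
    }


for label, word in (("saw", 0x28000001), ("exp", 0xE0000000)):
    print(label, decode_type_info(word))
```

Under that assumed layout, "saw" is a frame Xen currently tracks as an L1 page table, while "exp" is the writable-page type that a writable mapping would need, which is why the update is refused.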


* Re: 2.6.29-rc7 pv_ops dom0 BUG / unable to handle kernel paging request
  2009-03-08  6:09 ` Jeremy Fitzhardinge
@ 2009-03-08 11:54   ` Pasi Kärkkäinen
  2009-03-11 20:52     ` Pasi Kärkkäinen
  0 siblings, 1 reply; 18+ messages in thread
From: Pasi Kärkkäinen @ 2009-03-08 11:54 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: xen-devel

On Sat, Mar 07, 2009 at 10:09:16PM -0800, Jeremy Fitzhardinge wrote:
> Pasi Kärkkäinen wrote:
> >Hello!
> >
> >Latest git tree (updated some hours ago) boots up fine for me, xend can be 
> >started etc, but some time after starting kernel compilation I get the 
> >following BUG:
> >
> >http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-19-xen331-linux-2.6.29-rc7-bug.txt
> >
> >(XEN) mm.c:2006:d0 Bad type (saw 28000001 != exp e0000000) for mfn 436a3 
> >(pfn 3d0a3)
> >(XEN) mm.c:707:d0 Error getting mfn 436a3 (pfn 3d0a3) from L1 entry 
> >00000000436a3063 for dom0
> >(XEN) mm.c:3640:d0 ptwr_emulate: could not get_page_from_l1e()
> >BUG: unable to handle kernel paging request at c01cbd58
> >IP: [<c0405d2f>] xen_set_pte+0x8c/0x96
> >  
> 
> Well, that's bad news.  It was trying to map a highmem pte page which 
> didn't have the Pinned bit set on its page, but Xen thought it was a 
> pinned pte page.  Not sure how it could get into that state, but it's 
> indicative of general memory corruption.
> 
> What was going on at the time?  Was dom0 busy?  Were you running some domUs?
> 

At that time I was compiling a kernel on dom0.. so dom0 was busy.

No other domains running. 

-- Pasi


* Re: 2.6.29-rc7 pv_ops dom0 BUG / unable to handle kernel paging request
  2009-03-08 11:54   ` Pasi Kärkkäinen
@ 2009-03-11 20:52     ` Pasi Kärkkäinen
  2009-03-11 21:26       ` Pasi Kärkkäinen
  0 siblings, 1 reply; 18+ messages in thread
From: Pasi Kärkkäinen @ 2009-03-11 20:52 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: xen-devel

On Sun, Mar 08, 2009 at 01:54:01PM +0200, Pasi Kärkkäinen wrote:
> On Sat, Mar 07, 2009 at 10:09:16PM -0800, Jeremy Fitzhardinge wrote:
> > Pasi Kärkkäinen wrote:
> > >Hello!
> > >
> > >Latest git tree (updated some hours ago) boots up fine for me, xend can be 
> > >started etc, but some time after starting kernel compilation I get the 
> > >following BUG:
> > >
> > >http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-19-xen331-linux-2.6.29-rc7-bug.txt
> > >
> > >(XEN) mm.c:2006:d0 Bad type (saw 28000001 != exp e0000000) for mfn 436a3 
> > >(pfn 3d0a3)
> > >(XEN) mm.c:707:d0 Error getting mfn 436a3 (pfn 3d0a3) from L1 entry 
> > >00000000436a3063 for dom0
> > >(XEN) mm.c:3640:d0 ptwr_emulate: could not get_page_from_l1e()
> > >BUG: unable to handle kernel paging request at c01cbd58
> > >IP: [<c0405d2f>] xen_set_pte+0x8c/0x96
> > >  
> > 
> > Well, that's bad news.  It was trying to map a highmem pte page which 
> > didn't have the Pinned bit set on its page, but Xen thought it was a 
> > pinned pte page.  Not sure how it could get into that state, but it's 
> > indicative of general memory corruption.
> > 
> > What was going on at the time?  Was dom0 busy?  Were you running some domUs?
> > 
> 
> At that time I was compiling a kernel on dom0.. so dom0 was busy.
> 
> No other domains running. 
> 

Same thing happened again.. with the latest tree (as of today).

Again I was compiling a kernel in dom0, so dom0 was busy. No other domains
running. 

Complete bootlog including the BUG/oops:
http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-20-xen331-linux-2.6.29-rc7-bug.txt

(XEN) mm.c:2006:d0 Bad type (saw 28000001 != exp e0000000) for mfn 4c0a0 (pfn 346a0)
(XEN) mm.c:707:d0 Error getting mfn 4c0a0 (pfn 346a0) from L1 entry 000000004c0a0063 for dom0
(XEN) mm.c:3640:d0 ptwr_emulate: could not get_page_from_l1e()
BUG: unable to handle kernel paging request at c01cbd58
IP: [<c0405d35>] xen_set_pte+0x8c/0x96
*pdpt = 000000003c981001 
Oops: 0003 [#1] SMP 

-- Pasi


* Re: 2.6.29-rc7 pv_ops dom0 BUG / unable to handle kernel paging request
  2009-03-11 20:52     ` Pasi Kärkkäinen
@ 2009-03-11 21:26       ` Pasi Kärkkäinen
  2009-03-11 23:40         ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 18+ messages in thread
From: Pasi Kärkkäinen @ 2009-03-11 21:26 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: xen-devel

On Wed, Mar 11, 2009 at 10:52:50PM +0200, Pasi Kärkkäinen wrote:
> On Sun, Mar 08, 2009 at 01:54:01PM +0200, Pasi Kärkkäinen wrote:
> > On Sat, Mar 07, 2009 at 10:09:16PM -0800, Jeremy Fitzhardinge wrote:
> > > Pasi Kärkkäinen wrote:
> > > >Hello!
> > > >
> > > >Latest git tree (updated some hours ago) boots up fine for me, xend can be 
> > > >started etc, but some time after starting kernel compilation I get the 
> > > >following BUG:
> > > >
> > > >http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-19-xen331-linux-2.6.29-rc7-bug.txt
> > > >
> > > >(XEN) mm.c:2006:d0 Bad type (saw 28000001 != exp e0000000) for mfn 436a3 
> > > >(pfn 3d0a3)
> > > >(XEN) mm.c:707:d0 Error getting mfn 436a3 (pfn 3d0a3) from L1 entry 
> > > >00000000436a3063 for dom0
> > > >(XEN) mm.c:3640:d0 ptwr_emulate: could not get_page_from_l1e()
> > > >BUG: unable to handle kernel paging request at c01cbd58
> > > >IP: [<c0405d2f>] xen_set_pte+0x8c/0x96
> > > >  
> > > 
> > > Well, that's bad news.  It was trying to map a highmem pte page which 
> > > didn't have the Pinned bit set on its page, but Xen thought it was a 
> > > pinned pte page.  Not sure how it could get into that state, but it's 
> > > indicative of general memory corruption.
> > > 
> > > What was going on at the time?  Was dom0 busy?  Were you running some domUs?
> > > 
> > 
> > At that time I was compiling a kernel on dom0.. so dom0 was busy.
> > 
> > No other domains running. 
> > 
> 
> Same thing happened again.. with the latest tree (as of today).
> 
> Again I was compiling a kernel in dom0, so dom0 was busy. No other domains
> running. 
> 

The exact same kernel booted on baremetal without Xen works OK, and doesn't
BUG during kernel compilation. 

-- Pasi


* Re: 2.6.29-rc7 pv_ops dom0 BUG / unable to handle kernel paging request
  2009-03-11 21:26       ` Pasi Kärkkäinen
@ 2009-03-11 23:40         ` Jeremy Fitzhardinge
  2009-03-12  8:32           ` Pasi Kärkkäinen
  0 siblings, 1 reply; 18+ messages in thread
From: Jeremy Fitzhardinge @ 2009-03-11 23:40 UTC (permalink / raw)
  To: Pasi Kärkkäinen; +Cc: xen-devel

Pasi Kärkkäinen wrote:
> On Wed, Mar 11, 2009 at 10:52:50PM +0200, Pasi Kärkkäinen wrote:
>   
>> On Sun, Mar 08, 2009 at 01:54:01PM +0200, Pasi Kärkkäinen wrote:
>>     
>>> On Sat, Mar 07, 2009 at 10:09:16PM -0800, Jeremy Fitzhardinge wrote:
>>>       
>>>> Pasi Kärkkäinen wrote:
>>>>         
>>>>> Hello!
>>>>>
>>>>> Latest git tree (updated some hours ago) boots up fine for me, xend can be 
>>>>> started etc, but some time after starting kernel compilation I get the 
>>>>> following BUG:
>>>>>
>>>>> http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-19-xen331-linux-2.6.29-rc7-bug.txt
>>>>>
>>>>> (XEN) mm.c:2006:d0 Bad type (saw 28000001 != exp e0000000) for mfn 436a3 
>>>>> (pfn 3d0a3)
>>>>> (XEN) mm.c:707:d0 Error getting mfn 436a3 (pfn 3d0a3) from L1 entry 
>>>>> 00000000436a3063 for dom0
>>>>> (XEN) mm.c:3640:d0 ptwr_emulate: could not get_page_from_l1e()
>>>>> BUG: unable to handle kernel paging request at c01cbd58
>>>>> IP: [<c0405d2f>] xen_set_pte+0x8c/0x96
>>>>>  
>>>>>           
>>>> Well, that's bad news.  It was trying to map a highmem pte page which 
>>>> didn't have the Pinned bit set on its page, but Xen thought it was a 
>>>> pinned pte page.  Not sure how it could get into that state, but it's 
>>>> indicative of general memory corruption.
>>>>
>>>> What was going on at the time?  Was dom0 busy?  Were you running some domUs?
>>>>
>>>>         
>>> At that time I was compiling a kernel on dom0.. so dom0 was busy.
>>>
>>> No other domains running. 
>>>
>>>       
>> Same thing happened again.. with the latest tree (as of today).
>>
>> Again I was compiling a kernel in dom0, so dom0 was busy. No other domains
>> running. 
>>
>>     
>
> The exact same kernel booted on baremetal without Xen works OK, and doesn't
> BUG during kernel compilation. 
>
>   

That's good to know as a baseline, though the bug is Xen refusing to do 
a pte update, so I'd be surprised if it happened without Xen in the picture ;)

What happens if you disable CONFIG_HIGHPTE?

    J
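
Whether a given build has the option enabled can be checked from its .config. A minimal Python sketch, assuming the usual .config conventions ("CONFIG_FOO=y" when set, "# CONFIG_FOO is not set" when disabled); kconfig_value is a hypothetical helper name and the sample text is made up for illustration.

```python
import re


def kconfig_value(config_text, option):
    """Return the value of `option` in kernel .config text, or None if unset."""
    pattern = re.compile(r"^%s=(.*)$" % re.escape(option), re.MULTILINE)
    match = pattern.search(config_text)
    return match.group(1) if match else None


# Made-up sample; a real check would read the .config of the kernel build tree.
sample = """\
CONFIG_HIGHMEM=y
# CONFIG_HIGHPTE is not set
"""

print("CONFIG_HIGHPTE:", kconfig_value(sample, "CONFIG_HIGHPTE"))
print("CONFIG_HIGHMEM:", kconfig_value(sample, "CONFIG_HIGHMEM"))
```

Matching only lines that start with the option name keeps the commented-out "is not set" form from being mistaken for an enabled option.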


* Re: 2.6.29-rc7 pv_ops dom0 BUG / unable to handle kernel paging request
  2009-03-11 23:40         ` Jeremy Fitzhardinge
@ 2009-03-12  8:32           ` Pasi Kärkkäinen
  2009-03-20 18:10             ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 18+ messages in thread
From: Pasi Kärkkäinen @ 2009-03-12  8:32 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: xen-devel

On Wed, Mar 11, 2009 at 04:40:24PM -0700, Jeremy Fitzhardinge wrote:
> Pasi Kärkkäinen wrote:
> >On Wed, Mar 11, 2009 at 10:52:50PM +0200, Pasi Kärkkäinen wrote:
> >  
> >>On Sun, Mar 08, 2009 at 01:54:01PM +0200, Pasi Kärkkäinen wrote:
> >>    
> >>>On Sat, Mar 07, 2009 at 10:09:16PM -0800, Jeremy Fitzhardinge wrote:
> >>>      
> >>>>Pasi Kärkkäinen wrote:
> >>>>        
> >>>>>Hello!
> >>>>>
> >>>>>Latest git tree (updated some hours ago) boots up fine for me, xend 
> >>>>>can be started etc, but some time after starting kernel compilation I 
> >>>>>get the following BUG:
> >>>>>
> >>>>>http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-19-xen331-linux-2.6.29-rc7-bug.txt
> >>>>>
> >>>>>(XEN) mm.c:2006:d0 Bad type (saw 28000001 != exp e0000000) for mfn 
> >>>>>436a3 (pfn 3d0a3)
> >>>>>(XEN) mm.c:707:d0 Error getting mfn 436a3 (pfn 3d0a3) from L1 entry 
> >>>>>00000000436a3063 for dom0
> >>>>>(XEN) mm.c:3640:d0 ptwr_emulate: could not get_page_from_l1e()
> >>>>>BUG: unable to handle kernel paging request at c01cbd58
> >>>>>IP: [<c0405d2f>] xen_set_pte+0x8c/0x96
> >>>>> 
> >>>>>          
> >>>>Well, that's bad news.  It was trying to map a highmem pte page which 
> >>>>didn't have the Pinned bit set on its page, but Xen thought it was a 
> >>>>pinned pte page.  Not sure how it could get into that state, but it's 
> >>>>indicative of general memory corruption.
> >>>>
> >>>>What was going on at the time?  Was dom0 busy?  Were you running some 
> >>>>domUs?
> >>>>
> >>>>        
> >>>At that time I was compiling a kernel on dom0.. so dom0 was busy.
> >>>
> >>>No other domains running. 
> >>>
> >>>      
> >>Same thing happened again.. with the latest tree (as of today).
> >>
> >>Again I was compiling a kernel in dom0, so dom0 was busy. No other domains
> >>running. 
> >>
> >>    
> >
> >The exact same kernel booted on baremetal without Xen works OK, and doesn't
> >BUG during kernel compilation. 
> >
> >  
> 
> That's good to know as a baseline, though the bug is Xen refusing to do 
> a pte update, so I'd be surprised if it happened without Xen in the picture ;)
> 
> What happens if you disable CONFIG_HIGHPTE?
> 

Ok. I'll try that later.. I'll be away for 1.5 weeks (going on vacation), 
so I can't try right now. 

One difference just came to my mind.. when booted on baremetal, whole 2G of
ram is available.. when booted as dom0 I've limited dom0 memory to 1G. 

I guess I should try with the same amount of memory available for both
dom0 and baremetal.

-- Pasi


* Re: 2.6.29-rc7 pv_ops dom0 BUG / unable to handle kernel paging request
  2009-03-12  8:32           ` Pasi Kärkkäinen
@ 2009-03-20 18:10             ` Jeremy Fitzhardinge
  2009-03-20 19:04               ` dom0_mem with 2.6.29-rc8 Boris Derzhavets
  2009-03-21 20:16               ` 2.6.29-rc7 pv_ops dom0 BUG / unable to handle kernel paging request Pasi Kärkkäinen
  0 siblings, 2 replies; 18+ messages in thread
From: Jeremy Fitzhardinge @ 2009-03-20 18:10 UTC (permalink / raw)
  To: Pasi Kärkkäinen; +Cc: xen-devel

Pasi Kärkkäinen wrote:
> Ok. I'll try that later.. I'll be away for 1.5 weeks (going on vacation), 
> so I can't try right now. 
>
> One difference just came to my mind.. when booted on baremetal, whole 2G of
> ram is available.. when booted as dom0 I've limited dom0 memory to 1G. 
>   

That shouldn't be necessary any more.

> I guess I should try with the same amount of memory available for both
> dom0 and baremetal.
>   

I would expect the results to be the same, though the xen case might 
fail more readily.

Also, do you see this problem before you've started any other domains?  
Or does it only happen once you've run a domU (or only while a domU is 
running)?

Thanks,
    J


* dom0_mem  with 2.6.29-rc8
  2009-03-20 18:10             ` Jeremy Fitzhardinge
@ 2009-03-20 19:04               ` Boris Derzhavets
  2009-03-20 19:35                 ` Jeremy Fitzhardinge
  2009-03-21 20:16               ` 2.6.29-rc7 pv_ops dom0 BUG / unable to handle kernel paging request Pasi Kärkkäinen
  1 sibling, 1 reply; 18+ messages in thread
From: Boris Derzhavets @ 2009-03-20 19:04 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: xen-devel


System loaded with grub entry:-

title Xen 3.4 / Ubuntu 8.10, kernel 2.6.29-tip
uuid    9efba9a5-9f2b-4bf6-b8b5-7d6d53eb02d9
kernel  /boot/xen-3.4.gz
module  /boot/vmlinuz-2.6.29-rc8-tip root=/dev/sdb14 ro console=tty0
module  /boot/initrd-2.6.29-rc8-tip.img

seems to allocate required amount of memory for DomUs with no problems.

Boris.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel


* Re: dom0_mem  with 2.6.29-rc8
  2009-03-20 19:04               ` dom0_mem with 2.6.29-rc8 Boris Derzhavets
@ 2009-03-20 19:35                 ` Jeremy Fitzhardinge
  0 siblings, 0 replies; 18+ messages in thread
From: Jeremy Fitzhardinge @ 2009-03-20 19:35 UTC (permalink / raw)
  To: bderzhavets; +Cc: xen-devel

Boris Derzhavets wrote:
> System loaded with grub entry:-
>
> title Xen 3.4 / Ubuntu 8.10, kernel 2.6.29-tip
> uuid    9efba9a5-9f2b-4bf6-b8b5-7d6d53eb02d9
> kernel  /boot/xen-3.4.gz
> module  /boot/vmlinuz-2.6.29-rc8-tip root=/dev/sdb14 ro console=tty0
> module  /boot/initrd-2.6.29-rc8-tip.img
>
> seems to allocate required amount of memory for DomUs with no problems.
>

Yes.  Originally dom0_mem was needed to make sure that the dom0 memory 
didn't overlap the pci window in the physical address space, but that's 
no longer needed.  Ballooning takes care of shrinking dom0 to make space 
for domUs (though in a dedicated Xen system, it still makes sense to start 
dom0 with a limited amount of memory to leave a well-defined amount of 
space for domUs).

    J


* Re: 2.6.29-rc7 pv_ops dom0 BUG / unable to handle kernel paging request
  2009-03-20 18:10             ` Jeremy Fitzhardinge
  2009-03-20 19:04               ` dom0_mem with 2.6.29-rc8 Boris Derzhavets
@ 2009-03-21 20:16               ` Pasi Kärkkäinen
  2009-03-21 22:50                 ` Pasi Kärkkäinen
  1 sibling, 1 reply; 18+ messages in thread
From: Pasi Kärkkäinen @ 2009-03-21 20:16 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: xen-devel

On Fri, Mar 20, 2009 at 11:10:10AM -0700, Jeremy Fitzhardinge wrote:
> Pasi Kärkkäinen wrote:
> >Ok. I'll try that later.. I'll be away for 1.5 weeks (going on 
> >vacation), so I can't try right now. 
> >
> >One difference just came to my mind.. when booted on baremetal, whole 2G of
> >ram is available.. when booted as dom0 I've limited dom0 memory to 1G. 
> >  
> 
> That shouldn't be necessary any more.
> 

Ok. I'm just used to always limiting dom0 memory to avoid ballooning :)

> >I guess I should try with the same amount of memory available for both
> >dom0 and baremetal.
> >  
> 
> I would expect the results to be the same, though the xen case might 
> fail more readily.
> 

Baremetal is not failing.. ie. I don't see this BUG when I run the same
kernel on baremetal.  

> Also, do you see this problem before you've started any other domains?  
> Or does it only happen once you've run a domU (or only while a domU is 
> running)?
> 

I'm not running any other domains.. Only dom0 is running.

Steps to reproduce this BUG on my pv_ops dom0 testbox:

1) Reboot the box to pv_ops dom0 kernel
2) Login to dom0 via ssh
3) Start kernel compilation on dom0 (make bzImage && make modules)
4) Wait some minutes and pv_ops dom0 kernel BUGs

So no other domain has been or is running when this happens..

I'll try disabling CONFIG_HIGHPTE now, and see if that makes any difference.

-- Pasi


* Re: 2.6.29-rc7 pv_ops dom0 BUG / unable to handle kernel paging request
  2009-03-21 20:16               ` 2.6.29-rc7 pv_ops dom0 BUG / unable to handle kernel paging request Pasi Kärkkäinen
@ 2009-03-21 22:50                 ` Pasi Kärkkäinen
  2009-03-21 23:13                   ` 2.6.29-rc8 " Pasi Kärkkäinen
  0 siblings, 1 reply; 18+ messages in thread
From: Pasi Kärkkäinen @ 2009-03-21 22:50 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: xen-devel

On Sat, Mar 21, 2009 at 10:16:52PM +0200, Pasi Kärkkäinen wrote:
> 
> > Also, do you see this problem before you've started any other domains?  
> > Or does it only happen once you've run a domU (or only while a domU is 
> > running)?
> > 
> 
> I'm not running any other domains.. Only dom0 is running.
> 
> Steps to reproduce this BUG on my pv_ops dom0 testbox:
> 
> 1) Reboot the box to pv_ops dom0 kernel
> 2) Login to dom0 via ssh
> 3) Start kernel compilation on dom0 (make bzImage && make modules)
> 4) Wait some minutes and pv_ops dom0 kernel BUGs
> 
> So no other domain has been or is running when this happens..
> 
> I'll try disabling CONFIG_HIGHPTE now, and see if that makes any difference.
> 

CONFIG_HIGHPTE=y and pv_ops dom0 stays up for maybe 30 mins, and then
BUGs (during kernel compilation):
http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-22-xen331-linux-2.6.29-rc8-bug-with-highpte.txt


CONFIG_HIGHPTE=n and I get BUG during system startup when udev is started:
http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-23-xen331-linux-2.6.29-rc8-bug-no-highpte.txt

Starting udev: BUG: unable to handle kernel paging request at 70007823
IP: [<e30ce245>] pdc_common_ops+0x171/0xfffffcfe [sata_promise]
*pdpt = 000000005f781001 
Oops: 0002 [#1] SMP 

So yeah..  with CONFIG_HIGHPTE=n it seems to happen when sata_promise is loaded.. What should I try next? 

-- Pasi
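
The two failure modes carry different Oops codes: 0003 in the CONFIG_HIGHPTE=y case and 0002 here. A small Python sketch of what those codes mean, assuming the standard x86 page-fault error code bits (bit 0: protection violation vs. not-present page, bit 1: write vs. read, bit 2: user vs. kernel mode); decode_oops_code is an illustrative name.

```python
def decode_oops_code(code):
    """Describe the low three bits of an x86 page-fault error code."""
    return {
        "cause": "protection violation" if code & 0x1 else "page not present",
        "access": "write" if code & 0x2 else "read",
        "mode": "user" if code & 0x4 else "kernel",
    }


for code in (0x0003, 0x0002):
    print("%04x" % code, decode_oops_code(code))
```

Read this way, 0003 is a kernel-mode write to a page that is mapped but not writable, which fits a rejected pte update, while 0002 is a kernel-mode write to an address with no mapping at all.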


* Re: 2.6.29-rc8 pv_ops dom0 BUG / unable to handle kernel paging request
  2009-03-21 22:50                 ` Pasi Kärkkäinen
@ 2009-03-21 23:13                   ` Pasi Kärkkäinen
  2009-03-22  4:28                     ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 18+ messages in thread
From: Pasi Kärkkäinen @ 2009-03-21 23:13 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: xen-devel

On Sun, Mar 22, 2009 at 12:50:31AM +0200, Pasi Kärkkäinen wrote:
> On Sat, Mar 21, 2009 at 10:16:52PM +0200, Pasi Kärkkäinen wrote:
> > 
> > > Also, do you see this problem before you've started any other domains?  
> > > Or does it only happen once you've run a domU (or only while a domU is 
> > > running)?
> > > 
> > 
> > I'm not running any other domains.. Only dom0 is running.
> > 
> > Steps to reproduce this BUG on my pv_ops dom0 testbox:
> > 
> > 1) Reboot the box to pv_ops dom0 kernel
> > 2) Login to dom0 via ssh
> > 3) Start kernel compilation on dom0 (make bzImage && make modules)
> > 4) Wait some minutes and pv_ops dom0 kernel BUGs
> > 
> > So no other domain has been or is running when this happens..
> > 
> > I'll try disabling CONFIG_HIGHPTE now, and see if that makes any difference.
> > 
> 
> CONFIG_HIGHPTE=y and pv_ops dom0 stays up for maybe 30 mins, and then
> BUGs (during kernel compilation):
> http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-22-xen331-linux-2.6.29-rc8-bug-with-highpte.txt
> 
> 
> CONFIG_HIGHPTE=n and I get BUG during system startup when udev is started:
> http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-23-xen331-linux-2.6.29-rc8-bug-no-highpte.txt
> 
> Starting udev: BUG: unable to handle kernel paging request at 70007823
> IP: [<e30ce245>] pdc_common_ops+0x171/0xfffffcfe [sata_promise]
> *pdpt = 000000005f781001 
> Oops: 0002 [#1] SMP 
> 
> So yeah..  with CONFIG_HIGHPTE=n it seems to happen when sata_promise is loaded.. What should I try next? 
> 

Actually it's not only sata_promise. I tried 2 more times with the
CONFIG_HIGHPTE=n pv_ops dom0 kernel:

http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-23-xen331-linux-2.6.29-rc8-bug-no-highpte.txt
http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-23-xen331-linux-2.6.29-rc8-bug-no-highpte-2.txt
http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-23-xen331-linux-2.6.29-rc8-bug-no-highpte-3.txt

BUG: unable to handle kernel paging request at a536462c
IP: [<e30f4278>] classes+0x688/0xfffffa30 [parport]
*pdpt = 000000005f759001 
Oops: 0002 [#1] SMP 

.. and then the next time (-3.txt) it was sata_promise again triggering that BUG.. 

-- Pasi


* Re: 2.6.29-rc8 pv_ops dom0 BUG / unable to handle kernel paging request
  2009-03-21 23:13                   ` 2.6.29-rc8 " Pasi Kärkkäinen
@ 2009-03-22  4:28                     ` Jeremy Fitzhardinge
  2009-03-22 11:51                       ` Pasi Kärkkäinen
  0 siblings, 1 reply; 18+ messages in thread
From: Jeremy Fitzhardinge @ 2009-03-22  4:28 UTC (permalink / raw)
  To: Pasi Kärkkäinen; +Cc: xen-devel

Pasi Kärkkäinen wrote:
> On Sun, Mar 22, 2009 at 12:50:31AM +0200, Pasi Kärkkäinen wrote:
>   
>> On Sat, Mar 21, 2009 at 10:16:52PM +0200, Pasi Kärkkäinen wrote:
>>     
>>>> Also, do you see this problem before you've started any other domains?  
>>>> Or does it only happen once you've run a domU (or only while a domU is 
>>>> running)?
>>>>
>>>>         
>>> I'm not running any other domains.. Only dom0 is running.
>>>
>>> Steps to reproduce this BUG on my pv_ops dom0 testbox:
>>>
>>> 1) Reboot the box to pv_ops dom0 kernel
>>> 2) Login to dom0 via ssh
>>> 3) Start kernel compilation on dom0 (make bzImage && make modules)
>>> 4) Wait some minutes and pv_ops dom0 kernel BUGs
>>>
>>> So no other domains has been or is running when this happens..
>>>
>>> I'll try disabling CONFIG_HIGHPTE now, and see if that makes any difference.
>>>
>>>       
>> CONFIG_HIGHPTE=y and pv_ops dom0 survives up for maybe 30 mins, and then
>> BUGs (during kernel compilation):
>> http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-22-xen331-linux-2.6.29-rc8-bug-with-highpte.txt
>>
>>
>> CONFIG_HIGHPTE=n and I get BUG during system startup when udev is started:
>> http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-23-xen331-linux-2.6.29-rc8-bug-no-highpte.txt
>>
>> Starting udev: BUG: unable to handle kernel paging request at 70007823
>> IP: [<e30ce245>] pdc_common_ops+0x171/0xfffffcfe [sata_promise]
>> *pdpt = 000000005f781001 
>> Oops: 0002 [#1] SMP 
>>
>> So yeah..  with CONFIG_HIGHPTE=n it seems to happen when sata_promise is loaded.. What should I try next? 
>>
>>     
>
> Actually it's not only sata_promise. I tried 2 more times with the
> CONFIG_HIGHPTE=n pv_ops dom0 kernel:
>
> http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-23-xen331-linux-2.6.29-rc8-bug-no-highpte.txt
> http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-23-xen331-linux-2.6.29-rc8-bug-no-highpte-2.txt
> http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-23-xen331-linux-2.6.29-rc8-bug-no-highpte-3.txt
>
> BUG: unable to handle kernel paging request at a536462c
> IP: [<e30f4278>] classes+0x688/0xfffffa30 [parport]
> *pdpt = 000000005f759001 
> Oops: 0002 [#1] SMP 
>   

Hm, OK.  Something is clearly drastically amiss.  I'll try to repro.

    J

* Re: 2.6.29-rc8 pv_ops dom0 BUG / unable to handle kernel paging request
  2009-03-22  4:28                     ` Jeremy Fitzhardinge
@ 2009-03-22 11:51                       ` Pasi Kärkkäinen
  2009-03-22 17:04                         ` Pasi Kärkkäinen
  0 siblings, 1 reply; 18+ messages in thread
From: Pasi Kärkkäinen @ 2009-03-22 11:51 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: xen-devel

On Sat, Mar 21, 2009 at 09:28:55PM -0700, Jeremy Fitzhardinge wrote:
> Pasi Kärkkäinen wrote:
> >On Sun, Mar 22, 2009 at 12:50:31AM +0200, Pasi Kärkkäinen wrote:
> >  
> >>On Sat, Mar 21, 2009 at 10:16:52PM +0200, Pasi Kärkkäinen wrote:
> >>    
> >>>>Also, do you see this problem before you've started any other domains?  
> >>>>Or does it only happen once you've run a domU (or only while a domU is 
> >>>>running)?
> >>>>
> >>>>        
> >>>I'm not running any other domains.. Only dom0 is running.
> >>>
> >>>Steps to reproduce this BUG on my pv_ops dom0 testbox:
> >>>
> >>>1) Reboot the box to pv_ops dom0 kernel
> >>>2) Login to dom0 via ssh
> >>>3) Start kernel compilation on dom0 (make bzImage && make modules)
> >>>4) Wait some minutes and pv_ops dom0 kernel BUGs
> >>>
> >>>So no other domains has been or is running when this happens..
> >>>
> >>>I'll try disabling CONFIG_HIGHPTE now, and see if that makes any 
> >>>difference.
> >>>
> >>>      
> >>CONFIG_HIGHPTE=y and pv_ops dom0 survives up for maybe 30 mins, and then
> >>BUGs (during kernel compilation):
> >>http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-22-xen331-linux-2.6.29-rc8-bug-with-highpte.txt
> >>
> >>
> >>CONFIG_HIGHPTE=n and I get BUG during system startup when udev is started:
> >>http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-23-xen331-linux-2.6.29-rc8-bug-no-highpte.txt
> >>
> >>Starting udev: BUG: unable to handle kernel paging request at 70007823
> >>IP: [<e30ce245>] pdc_common_ops+0x171/0xfffffcfe [sata_promise]
> >>*pdpt = 000000005f781001 
> >>Oops: 0002 [#1] SMP 
> >>
> >>So yeah..  with CONFIG_HIGHPTE=n it seems to happen when sata_promise is 
> >>loaded.. What should I try next? 
> >>    
> >
> >Actually it's not only sata_promise. I tried 2 more times with the
> >CONFIG_HIGHPTE=n pv_ops dom0 kernel:
> >
> >http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-23-xen331-linux-2.6.29-rc8-bug-no-highpte.txt
> >http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-23-xen331-linux-2.6.29-rc8-bug-no-highpte-2.txt
> >http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-23-xen331-linux-2.6.29-rc8-bug-no-highpte-3.txt
> >
> >BUG: unable to handle kernel paging request at a536462c
> >IP: [<e30f4278>] classes+0x688/0xfffffa30 [parport]
> >*pdpt = 000000005f759001 
> >Oops: 0002 [#1] SMP 
> >  
> 
> Hm, OK.  Something is clearly drastically amiss.  I'll try to repro.
> 

Actually, it seems the CONFIG_HIGHPTE=n kernel also fails on bare metal:
http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-24-baremetal-2.6.29-rc8-bug-no-highpte.txt

Starting udev: invalid opcode: 0000 [#1] SMP 

Summary:
CONFIG_HIGHPTE=n: both dom0 and baremetal fail during system startup when udev is started
CONFIG_HIGHPTE=y: baremetal works OK, dom0 fails with BUG after around 30 mins of kernel compilation

-- Pasi

* Re: 2.6.29-rc8 pv_ops dom0 BUG / unable to handle kernel paging request
  2009-03-22 11:51                       ` Pasi Kärkkäinen
@ 2009-03-22 17:04                         ` Pasi Kärkkäinen
  2009-03-22 20:40                           ` Pasi Kärkkäinen
  0 siblings, 1 reply; 18+ messages in thread
From: Pasi Kärkkäinen @ 2009-03-22 17:04 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: xen-devel

On Sun, Mar 22, 2009 at 01:51:51PM +0200, Pasi Kärkkäinen wrote:
> On Sat, Mar 21, 2009 at 09:28:55PM -0700, Jeremy Fitzhardinge wrote:
> > Pasi Kärkkäinen wrote:
> > >On Sun, Mar 22, 2009 at 12:50:31AM +0200, Pasi Kärkkäinen wrote:
> > >  
> > >>On Sat, Mar 21, 2009 at 10:16:52PM +0200, Pasi Kärkkäinen wrote:
> > >>    
> > >>>>Also, do you see this problem before you've started any other domains?  
> > >>>>Or does it only happen once you've run a domU (or only while a domU is 
> > >>>>running)?
> > >>>>
> > >>>>        
> > >>>I'm not running any other domains.. Only dom0 is running.
> > >>>
> > >>>Steps to reproduce this BUG on my pv_ops dom0 testbox:
> > >>>
> > >>>1) Reboot the box to pv_ops dom0 kernel
> > >>>2) Login to dom0 via ssh
> > >>>3) Start kernel compilation on dom0 (make bzImage && make modules)
> > >>>4) Wait some minutes and pv_ops dom0 kernel BUGs
> > >>>
> > >>>So no other domains has been or is running when this happens..
> > >>>
> > >>>I'll try disabling CONFIG_HIGHPTE now, and see if that makes any 
> > >>>difference.
> > >>>
> > >>>      
> > >>CONFIG_HIGHPTE=y and pv_ops dom0 survives up for maybe 30 mins, and then
> > >>BUGs (during kernel compilation):
> > >>http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-22-xen331-linux-2.6.29-rc8-bug-with-highpte.txt
> > >>
> > >>
> > >>CONFIG_HIGHPTE=n and I get BUG during system startup when udev is started:
> > >>http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-23-xen331-linux-2.6.29-rc8-bug-no-highpte.txt
> > >>
> > >>Starting udev: BUG: unable to handle kernel paging request at 70007823
> > >>IP: [<e30ce245>] pdc_common_ops+0x171/0xfffffcfe [sata_promise]
> > >>*pdpt = 000000005f781001 
> > >>Oops: 0002 [#1] SMP 
> > >>
> > >>So yeah..  with CONFIG_HIGHPTE=n it seems to happen when sata_promise is 
> > >>loaded.. What should I try next? 
> > >>    
> > >
> > >Actually it's not only sata_promise. I tried 2 more times with the
> > >CONFIG_HIGHPTE=n pv_ops dom0 kernel:
> > >
> > >http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-23-xen331-linux-2.6.29-rc8-bug-no-highpte.txt
> > >http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-23-xen331-linux-2.6.29-rc8-bug-no-highpte-2.txt
> > >http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-23-xen331-linux-2.6.29-rc8-bug-no-highpte-3.txt
> > >
> > >BUG: unable to handle kernel paging request at a536462c
> > >IP: [<e30f4278>] classes+0x688/0xfffffa30 [parport]
> > >*pdpt = 000000005f759001 
> > >Oops: 0002 [#1] SMP 
> > >  
> > 
> > Hm, OK.  Something is clearly drastically amiss.  I'll try to repro.
> > 
> 
> Actually it seems CONFIG_HIGHPTE=n kernel fails also on baremetal:
> http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-24-baremetal-2.6.29-rc8-bug-no-highpte.txt
> 
> Starting udev: invalid opcode: 0000 [#1] SMP 
> 
> Summary:
> CONFIG_HIGHPTE=n: both dom0 and baremetal fail during system startup when udev is started
> CONFIG_HIGHPTE=y: baremetal works OK, dom0 fails with BUG after around 30 mins of kernel compilation
> 

Please ignore this summary; something appears to have been wrong with my
kernel builds.

I'll post a new summary once I've finished testing.

-- Pasi

* Re: 2.6.29-rc8 pv_ops dom0 BUG / unable to handle kernel paging request
  2009-03-22 17:04                         ` Pasi Kärkkäinen
@ 2009-03-22 20:40                           ` Pasi Kärkkäinen
  2009-03-22 21:21                             ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 18+ messages in thread
From: Pasi Kärkkäinen @ 2009-03-22 20:40 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: xen-devel

On Sun, Mar 22, 2009 at 07:04:23PM +0200, Pasi Kärkkäinen wrote:
> On Sun, Mar 22, 2009 at 01:51:51PM +0200, Pasi Kärkkäinen wrote:
> > On Sat, Mar 21, 2009 at 09:28:55PM -0700, Jeremy Fitzhardinge wrote:
> > > Pasi Kärkkäinen wrote:
> > > >On Sun, Mar 22, 2009 at 12:50:31AM +0200, Pasi Kärkkäinen wrote:
> > > >  
> > > >>On Sat, Mar 21, 2009 at 10:16:52PM +0200, Pasi Kärkkäinen wrote:
> > > >>    
> > > >>>>Also, do you see this problem before you've started any other domains?  
> > > >>>>Or does it only happen once you've run a domU (or only while a domU is 
> > > >>>>running)?
> > > >>>>
> > > >>>>        
> > > >>>I'm not running any other domains.. Only dom0 is running.
> > > >>>
> > > >>>Steps to reproduce this BUG on my pv_ops dom0 testbox:
> > > >>>
> > > >>>1) Reboot the box to pv_ops dom0 kernel
> > > >>>2) Login to dom0 via ssh
> > > >>>3) Start kernel compilation on dom0 (make bzImage && make modules)
> > > >>>4) Wait some minutes and pv_ops dom0 kernel BUGs
> > > >>>
> > > >>>So no other domains has been or is running when this happens..
> > > >>>
> > > >>>I'll try disabling CONFIG_HIGHPTE now, and see if that makes any 
> > > >>>difference.
> > > >>>
> > > >>>      
> > > >>CONFIG_HIGHPTE=y and pv_ops dom0 survives up for maybe 30 mins, and then
> > > >>BUGs (during kernel compilation):
> > > >>http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-22-xen331-linux-2.6.29-rc8-bug-with-highpte.txt
> > > >>
> > > >>
> > > >>CONFIG_HIGHPTE=n and I get BUG during system startup when udev is started:
> > > >>http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-23-xen331-linux-2.6.29-rc8-bug-no-highpte.txt
> > > >>
> > > >>Starting udev: BUG: unable to handle kernel paging request at 70007823
> > > >>IP: [<e30ce245>] pdc_common_ops+0x171/0xfffffcfe [sata_promise]
> > > >>*pdpt = 000000005f781001 
> > > >>Oops: 0002 [#1] SMP 
> > > >>
> > > >>So yeah..  with CONFIG_HIGHPTE=n it seems to happen when sata_promise is 
> > > >>loaded.. What should I try next? 
> > > >>    
> > > >
> > > >Actually it's not only sata_promise. I tried 2 more times with the
> > > >CONFIG_HIGHPTE=n pv_ops dom0 kernel:
> > > >
> > > >http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-23-xen331-linux-2.6.29-rc8-bug-no-highpte.txt
> > > >http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-23-xen331-linux-2.6.29-rc8-bug-no-highpte-2.txt
> > > >http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-23-xen331-linux-2.6.29-rc8-bug-no-highpte-3.txt
> > > >
> > > >BUG: unable to handle kernel paging request at a536462c
> > > >IP: [<e30f4278>] classes+0x688/0xfffffa30 [parport]
> > > >*pdpt = 000000005f759001 
> > > >Oops: 0002 [#1] SMP 
> > > >  
> > > 
> > > Hm, OK.  Something is clearly drastically amiss.  I'll try to repro.
> > > 
> > 
> > Actually it seems CONFIG_HIGHPTE=n kernel fails also on baremetal:
> > http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-24-baremetal-2.6.29-rc8-bug-no-highpte.txt
> > 
> > Starting udev: invalid opcode: 0000 [#1] SMP 
> > 
> > Summary:
> > CONFIG_HIGHPTE=n: both dom0 and baremetal fail during system startup when udev is started
> > CONFIG_HIGHPTE=y: baremetal works OK, dom0 fails with BUG after around 30 mins of kernel compilation
> > 
> 
> Please ignore this summary, there was something wrong with my kernel builds or
> something. 
> 
> I'll post new summary soon when I'm finished with testing. 
> 

OK, I did fresh kernel+modules builds and re-tested everything.

New summary:
CONFIG_HIGHPTE=n: both dom0 and baremetal work OK, both survive kernel compilation.
CONFIG_HIGHPTE=y: baremetal works OK and survives kernel compilation, but dom0 fails with BUG after around 20-30 mins of kernel compilation

Latest BUG with the CONFIG_HIGHPTE=y dom0 kernel:
http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-25-xen331-linux-2.6.29-rc8-bug-with-highpte.txt

(XEN) mm.c:2006:d0 Bad type (saw 28000001 != exp e0000000) for mfn 57f2f (pfn 2892f)
(XEN) mm.c:707:d0 Error getting mfn 57f2f (pfn 2892f) from L1 entry 0000000057f2f063 for dom0
(XEN) mm.c:3640:d0 ptwr_emulate: could not get_page_from_l1e()
BUG: unable to handle kernel paging request at c020bc80
IP: [<c0405d23>] xen_set_pte+0x8c/0x96
*pdpt = 000000003c984001 
Oops: 0003 [#1] SMP 
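
The "L1 entry" value in the Xen message can be split into a frame number and flag bits, which is a quick way to check that the mfn Xen reports matches the entry being written. A rough sketch, assuming the standard x86 PAE 4K page-table entry layout (frame in bits 12-51, flags in the low bits); the helper name is made up for illustration:

```python
# Low flag bits of an x86 4K page-table entry (architectural layout).
PTE_FLAGS = {
    0x001: "PRESENT",
    0x002: "RW",
    0x004: "USER",
    0x008: "PWT",
    0x010: "PCD",
    0x020: "ACCESSED",
    0x040: "DIRTY",
    0x080: "PAT",
    0x100: "GLOBAL",
}

# Bits 12..51 hold the frame number in a PAE entry.
PTE_FRAME_MASK = 0x000FFFFFFFFFF000


def decode_l1_entry(entry: int):
    """Return (frame_number, [flag names]) for a PAE L1 entry."""
    frame = (entry & PTE_FRAME_MASK) >> 12
    flags = [name for bit, name in PTE_FLAGS.items() if entry & bit]
    return frame, flags


if __name__ == "__main__":
    frame, flags = decode_l1_entry(0x0000000057F2F063)
    print(f"frame {frame:x}, flags {'|'.join(flags)}")
```

For the entry above this gives frame 57f2f, the same mfn Xen complains about, with the RW bit set, so the failing update is an attempt to install a writable mapping of that frame.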

I tested all combinations multiple times now, and the results were consistent. 

Another BUG with dom0 with CONFIG_HIGHPTE=y:
http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-25-xen331-linux-2.6.29-rc8-bug-with-highpte-2.txt

(XEN) mm.c:2006:d0 Bad type (saw 28000001 != exp e0000000) for mfn 4c1e1 (pfn 347e1)
(XEN) mm.c:707:d0 Error getting mfn 4c1e1 (pfn 347e1) from L1 entry 000000004c1e1063 for dom0
(XEN) mm.c:3640:d0 ptwr_emulate: could not get_page_from_l1e()
BUG: unable to handle kernel paging request at c020bc80
IP: [<c0405d23>] xen_set_pte+0x8c/0x96
*pdpt = 000000003c984001 
Oops: 0003 [#1] SMP 
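
The "Bad type (saw 28000001 != exp e0000000)" line can be picked apart in the same way. The sketch below is only that: the bit positions are an assumption based on my recollection of the PGT_* definitions in Xen's asm-x86/mm.h for 32-bit builds (type in the top three bits, then a pinned bit and a validated bit), and should be checked against the Xen 3.3.1 source before relying on it. The low bits are a use count whose exact width I have left out.

```python
# ASSUMED layout of Xen's page type_info word on 32-bit builds; verify
# against xen/include/asm-x86/mm.h in the tree you are actually running.
PGT_TYPE_SHIFT = 29
PGT_TYPE_MASK = 0x7 << PGT_TYPE_SHIFT   # 0xe0000000
PGT_PINNED = 1 << 28
PGT_VALIDATED = 1 << 27

# Only the type values I am reasonably sure of are named here.
PGT_TYPE_NAMES = {
    0: "none",
    1: "l1_page_table",
    2: "l2_page_table",
    3: "l3_page_table",
    4: "l4_page_table",
    7: "writable_page",
}


def decode_type_info(value: int) -> dict:
    """Split a type_info word into type, pinned and validated fields."""
    type_id = (value & PGT_TYPE_MASK) >> PGT_TYPE_SHIFT
    return {
        "type": PGT_TYPE_NAMES.get(type_id, f"other({type_id})"),
        "pinned": bool(value & PGT_PINNED),
        "validated": bool(value & PGT_VALIDATED),
    }


if __name__ == "__main__":
    print("saw:", decode_type_info(0x28000001))
    print("exp:", decode_type_info(0xE0000000))
```

If those assumptions hold, "saw" is a validated, unpinned L1 page table and "exp" is a writable page, i.e. Xen is refusing a writable mapping of a frame it still considers to be in use as a page table.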

-- Pasi

* Re: 2.6.29-rc8 pv_ops dom0 BUG / unable to handle kernel paging request
  2009-03-22 20:40                           ` Pasi Kärkkäinen
@ 2009-03-22 21:21                             ` Jeremy Fitzhardinge
  0 siblings, 0 replies; 18+ messages in thread
From: Jeremy Fitzhardinge @ 2009-03-22 21:21 UTC (permalink / raw)
  To: Pasi Kärkkäinen; +Cc: xen-devel

Pasi Kärkkäinen wrote:
> On Sun, Mar 22, 2009 at 07:04:23PM +0200, Pasi Kärkkäinen wrote:
>   
>> On Sun, Mar 22, 2009 at 01:51:51PM +0200, Pasi Kärkkäinen wrote:
>>     
>>> On Sat, Mar 21, 2009 at 09:28:55PM -0700, Jeremy Fitzhardinge wrote:
>>>       
>>>> Pasi Kärkkäinen wrote:
>>>>         
>>>>> On Sun, Mar 22, 2009 at 12:50:31AM +0200, Pasi Kärkkäinen wrote:
>>>>>  
>>>>>           
>>>>>> On Sat, Mar 21, 2009 at 10:16:52PM +0200, Pasi Kärkkäinen wrote:
>>>>>>    
>>>>>>             
>>>>>>>> Also, do you see this problem before you've started any other domains?  
>>>>>>>> Or does it only happen once you've run a domU (or only while a domU is 
>>>>>>>> running)?
>>>>>>>>
>>>>>>>>        
>>>>>>>>                 
>>>>>>> I'm not running any other domains.. Only dom0 is running.
>>>>>>>
>>>>>>> Steps to reproduce this BUG on my pv_ops dom0 testbox:
>>>>>>>
>>>>>>> 1) Reboot the box to pv_ops dom0 kernel
>>>>>>> 2) Login to dom0 via ssh
>>>>>>> 3) Start kernel compilation on dom0 (make bzImage && make modules)
>>>>>>> 4) Wait some minutes and pv_ops dom0 kernel BUGs
>>>>>>>
>>>>>>> So no other domains has been or is running when this happens..
>>>>>>>
>>>>>>> I'll try disabling CONFIG_HIGHPTE now, and see if that makes any 
>>>>>>> difference.
>>>>>>>
>>>>>>>      
>>>>>>>               
>>>>>> CONFIG_HIGHPTE=y and pv_ops dom0 survives up for maybe 30 mins, and then
>>>>>> BUGs (during kernel compilation):
>>>>>> http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-22-xen331-linux-2.6.29-rc8-bug-with-highpte.txt
>>>>>>
>>>>>>
>>>>>> CONFIG_HIGHPTE=n and I get BUG during system startup when udev is started:
>>>>>> http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-23-xen331-linux-2.6.29-rc8-bug-no-highpte.txt
>>>>>>
>>>>>> Starting udev: BUG: unable to handle kernel paging request at 70007823
>>>>>> IP: [<e30ce245>] pdc_common_ops+0x171/0xfffffcfe [sata_promise]
>>>>>> *pdpt = 000000005f781001 
>>>>>> Oops: 0002 [#1] SMP 
>>>>>>
>>>>>> So yeah..  with CONFIG_HIGHPTE=n it seems to happen when sata_promise is 
>>>>>> loaded.. What should I try next? 
>>>>>>    
>>>>>>             
>>>>> Actually it's not only sata_promise. I tried 2 more times with the
>>>>> CONFIG_HIGHPTE=n pv_ops dom0 kernel:
>>>>>
>>>>> http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-23-xen331-linux-2.6.29-rc8-bug-no-highpte.txt
>>>>> http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-23-xen331-linux-2.6.29-rc8-bug-no-highpte-2.txt
>>>>> http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-23-xen331-linux-2.6.29-rc8-bug-no-highpte-3.txt
>>>>>
>>>>> BUG: unable to handle kernel paging request at a536462c
>>>>> IP: [<e30f4278>] classes+0x688/0xfffffa30 [parport]
>>>>> *pdpt = 000000005f759001 
>>>>> Oops: 0002 [#1] SMP 
>>>>>  
>>>>>           
>>>> Hm, OK.  Something is clearly drastically amiss.  I'll try to repro.
>>>>
>>>>         
>>> Actually it seems CONFIG_HIGHPTE=n kernel fails also on baremetal:
>>> http://pasik.reaktio.net/xen/pv_ops-dom0-debug/pv_ops-dom0-bootlog-24-baremetal-2.6.29-rc8-bug-no-highpte.txt
>>>
>>> Starting udev: invalid opcode: 0000 [#1] SMP 
>>>
>>> Summary:
>>> CONFIG_HIGHPTE=n: both dom0 and baremetal fail during system startup when udev is started
>>> CONFIG_HIGHPTE=y: baremetal works OK, dom0 fails with BUG after around 30 mins of kernel compilation
>>>
>>>       
>> Please ignore this summary, there was something wrong with my kernel builds or
>> something. 
>>
>> I'll post new summary soon when I'm finished with testing. 
>>
>>     
>
> Ok, I did new fresh kernel+modules builds and re-tested everything.
>
> New summary:
> CONFIG_HIGHPTE=n: both dom0 and baremetal work OK, both survive kernel compilation.
> CONFIG_HIGHPTE=y: baremetal works OK and survives kernel compilation, but dom0 fails with BUG after around 20-30 mins of kernel compilation
>   

Thanks for getting a consistent test result; the other reports looked, 
frankly, scary and I wouldn't want to be on that wild goose chase.

These ones look much more tractable, though I don't really have a theory 
for them.  I'll have a look next week sometime.

    J

Thread overview: 18+ messages
2009-03-07 17:58 2.6.29-rc7 pv_ops dom0 BUG / unable to handle kernel paging request Pasi Kärkkäinen
2009-03-08  6:09 ` Jeremy Fitzhardinge
2009-03-08 11:54   ` Pasi Kärkkäinen
2009-03-11 20:52     ` Pasi Kärkkäinen
2009-03-11 21:26       ` Pasi Kärkkäinen
2009-03-11 23:40         ` Jeremy Fitzhardinge
2009-03-12  8:32           ` Pasi Kärkkäinen
2009-03-20 18:10             ` Jeremy Fitzhardinge
2009-03-20 19:04               ` dom0_mem with 2.6.29-rc8 Boris Derzhavets
2009-03-20 19:35                 ` Jeremy Fitzhardinge
2009-03-21 20:16               ` 2.6.29-rc7 pv_ops dom0 BUG / unable to handle kernel paging request Pasi Kärkkäinen
2009-03-21 22:50                 ` Pasi Kärkkäinen
2009-03-21 23:13                   ` 2.6.29-rc8 " Pasi Kärkkäinen
2009-03-22  4:28                     ` Jeremy Fitzhardinge
2009-03-22 11:51                       ` Pasi Kärkkäinen
2009-03-22 17:04                         ` Pasi Kärkkäinen
2009-03-22 20:40                           ` Pasi Kärkkäinen
2009-03-22 21:21                             ` Jeremy Fitzhardinge
