* Re: Crash on boot with 2.6.37-rc8-git3
@ 2011-01-07 20:34 M A Young
  2011-01-07 21:23 ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 31+ messages in thread
From: M A Young @ 2011-01-07 20:34 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1923 bytes --]

On Fri, 7 Jan 2011, Konrad Rzeszutek Wilk wrote:
>> BUG unable to handle kernel NULL pointer dereference at
>> IP: [<ffffffff81b69b92>] setup_node_bootmem+0x16b/0x199

> Hmmm, I did see something similar to this in 2.6.37-rc1, but we fixed
> that quickly. It was triggered by having 4GB of memory or so and
> the work-around was to use dom0_mem=max:2GB.
> 
> Can you send the photo? Maybe the caller stack will shed some light.

Here are two photos of the output at different times. The context is

    0xffffffff81b69b6d <setup_node_bootmem+326>:	callq  0xffffffff81475ec9 <printk>
    0xffffffff81b69b72 <setup_node_bootmem+331>:	movslq %ebx,%rdx
    0xffffffff81b69b75 <setup_node_bootmem+334>:	xor    %eax,%eax
    0xffffffff81b69b77 <setup_node_bootmem+336>:	mov    $0x4fc0,%ecx
    0xffffffff81b69b7c <setup_node_bootmem+341>:	mov    -0x7e4cb750(,%rdx,8),%rsi
    0xffffffff81b69b84 <setup_node_bootmem+349>:	shr    $0xc,%r13
    0xffffffff81b69b88 <setup_node_bootmem+353>:	shr    $0xc,%r12
    0xffffffff81b69b8c <setup_node_bootmem+357>:	sub    %r13,%r12
    0xffffffff81b69b8f <setup_node_bootmem+360>:	mov    %rsi,%rdi
    0xffffffff81b69b92 <setup_node_bootmem+363>:	rep stos %eax,%es:(%rdi)
    0xffffffff81b69b94 <setup_node_bootmem+365>:	mov    %ebx,%edi
    0xffffffff81b69b96 <setup_node_bootmem+367>:	mov    -0x7e4cb750(,%rdx,8),%rax

which is somewhere around line 224 in arch/x86/mm/numa_64.c

         if (nid != nodeid)
                 printk(KERN_INFO "    NODE_DATA(%d) on node %d\n", nodeid, nid);

         memset(NODE_DATA(nodeid), 0, sizeof(pg_data_t));
         NODE_DATA(nodeid)->node_id = nodeid;
         NODE_DATA(nodeid)->node_start_pfn = start_pfn;
         NODE_DATA(nodeid)->node_spanned_pages = last_pfn - start_pfn;

         node_set_online(nodeid);

I do have 4GB of memory and the system does boot if I add dom0_mem=max:2GB to 
the xen boot line.

 	Michael Young

[-- Attachment #2: Type: IMAGE/JPEG, Size: 454438 bytes --]

[-- Attachment #3: Type: IMAGE/JPEG, Size: 452331 bytes --]

[-- Attachment #4: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel


* Re: Crash on boot with 2.6.37-rc8-git3
  2011-01-07 20:34 Crash on boot with 2.6.37-rc8-git3 M A Young
@ 2011-01-07 21:23 ` Konrad Rzeszutek Wilk
  2011-01-08  0:10   ` M A Young
  0 siblings, 1 reply; 31+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-01-07 21:23 UTC (permalink / raw)
  To: M A Young; +Cc: xen-devel

On Fri, Jan 07, 2011 at 08:34:43PM +0000, M A Young wrote:
> On Fri, 7 Jan 2011, Konrad Rzeszutek Wilk wrote:
> >>BUG unable to handle kernel NULL pointer dereference at
> >>IP: [<ffffffff81b69b92>] setup_node_bootmem+0x16b/0x199
> 
> >Hmmm, I did see something similar to this in 2.6.37-rc1, but we fixed
> >that quickly. It was triggered by having 4GB of memory or so and
> >the work-around was to use dom0_mem=max:2GB.
> >
> >Can you send the photo? Maybe the caller stack will shed some light.
> 
> Here are two photos of the output at different times. The context is
> 
>    0xffffffff81b69b6d <setup_node_bootmem+326>:
>     callq  0xffffffff81475ec9 <printk>
>    0xffffffff81b69b72 <setup_node_bootmem+331>:	movslq %ebx,%rdx
>    0xffffffff81b69b75 <setup_node_bootmem+334>:	xor    %eax,%eax
>    0xffffffff81b69b77 <setup_node_bootmem+336>:	mov    $0x4fc0,%ecx
>    0xffffffff81b69b7c <setup_node_bootmem+341>:
>     mov    -0x7e4cb750(,%rdx,8),%rsi
>    0xffffffff81b69b84 <setup_node_bootmem+349>:	shr    $0xc,%r13
>    0xffffffff81b69b88 <setup_node_bootmem+353>:	shr    $0xc,%r12
>    0xffffffff81b69b8c <setup_node_bootmem+357>:	sub    %r13,%r12
>    0xffffffff81b69b8f <setup_node_bootmem+360>:	mov    %rsi,%rdi
>    0xffffffff81b69b92 <setup_node_bootmem+363>:	rep stos %eax,%es:(%rdi)

That looks like:

	memset(NODE_DATA(nodeid), 0, sizeof(pg_data_t));

From the photo, %eax is zero, so this is exactly the code you would expect
for filling the structure with zeros.


>    0xffffffff81b69b94 <setup_node_bootmem+365>:	mov    %ebx,%edi
>    0xffffffff81b69b96 <setup_node_bootmem+367>:
>     mov    -0x7e4cb750(,%rdx,8),%rax
> 
> which is somewhere around line 224 in arch/x86/mm/numa_64.c
> 
>         if (nid != nodeid)
>                 printk(KERN_INFO "    NODE_DATA(%d) on node %d\n",
> nodeid, nid);

Can you make sure that 419db274bed4269f475a8e78cbe9c917192cfe8b is in? That
is the patch that fixed this issue last time.

However, the more I look at the code, the less it seems to be that issue,
and that commit is the last fix in that file.

Do you see any messages about 'Cannot find 20 bytes in node X' (where X
I think is 0)?


* Re: Crash on boot with 2.6.37-rc8-git3
  2011-01-07 21:23 ` Konrad Rzeszutek Wilk
@ 2011-01-08  0:10   ` M A Young
  2011-01-10 18:42     ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 31+ messages in thread
From: M A Young @ 2011-01-08  0:10 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel

On Fri, 7 Jan 2011, Konrad Rzeszutek Wilk wrote:

> Can you make sure that 419db274bed4269f475a8e78cbe9c917192cfe8b is in? That
> is the patch that fixed this issue last time.

Yes it is.

> Do you see any messages about 'Cannot find 20 bytes in node X' (where X
> I think is 0)?

I haven't spotted any such message.

 	Michael Young


* Re: Crash on boot with 2.6.37-rc8-git3
  2011-01-08  0:10   ` M A Young
@ 2011-01-10 18:42     ` Konrad Rzeszutek Wilk
  2011-01-10 21:43       ` M A Young
                         ` (2 more replies)
  0 siblings, 3 replies; 31+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-01-10 18:42 UTC (permalink / raw)
  To: M A Young; +Cc: xen-devel

> >Do you see any messages about 'Cannot find 20 bytes in node X' (where X
> >I think is 0)?
> 
> I haven't spotted any such message.

Try fiddling with dom0_mem to see at what point it starts failing. Is
this happening only on this machine, or do you see it on other boxes too?

Your E820 looks like this:
 BIOS-e820: 0000000000000000 - 000000000009f000 (usable)
 BIOS-e820: 000000000009f000 - 00000000000a0000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000df66d800 (usable)
 BIOS-e820: 00000000df66d800 - 00000000e0000000 (reserved)
 BIOS-e820: 00000000f8000000 - 00000000fc000000 (reserved)
 BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
 BIOS-e820: 00000000fed18000 - 00000000fed1c000 (reserved)
 BIOS-e820: 00000000fed20000 - 00000000fed90000 (reserved)
 BIOS-e820: 00000000feda0000 - 00000000feda6000 (reserved)
 BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
 BIOS-e820: 00000000ffe00000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 0000000120000000 (usable)

Which looks completely normal. I am really at a loss here. You could
also sprinkle printk's around that code (or use xen_raw_printk and inhibit
the Linux kernel console output; that way you would only see the Xen
output and the output from xen_raw_printk).

Let me boot up 2.6.37 on a 4GB machine to see whether I hit this too.


* Re: Crash on boot with 2.6.37-rc8-git3
  2011-01-10 18:42     ` Konrad Rzeszutek Wilk
@ 2011-01-10 21:43       ` M A Young
  2011-01-16 20:48       ` M A Young
  2011-01-18  0:52       ` M A Young
  2 siblings, 0 replies; 31+ messages in thread
From: M A Young @ 2011-01-10 21:43 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel

On Mon, 10 Jan 2011, Konrad Rzeszutek Wilk wrote:

> Try fiddling with the dom0_mem.. to see at what point it starts failing. Is
> this happening only on this machine or do you see it on other boxes too?

dom0_mem=max:3574MB boots, dom0_mem=max:3575MB doesn't. I haven't tried it 
on other boxes yet.

> Which looks completly normal.. I am really at loss here. You could
> also sprinkle printk's around that code (or xen_raw_printk and inhibit
> the Linux kernel console output - that way you would only see the Xen
> and output from xen_raw_printk).

I will think about where the printk's should go, but probably not tonight.

 	Michael Young


* Re: Crash on boot with 2.6.37-rc8-git3
  2011-01-10 18:42     ` Konrad Rzeszutek Wilk
  2011-01-10 21:43       ` M A Young
@ 2011-01-16 20:48       ` M A Young
  2011-01-16 20:56         ` Keir Fraser
  2011-01-18  0:52       ` M A Young
  2 siblings, 1 reply; 31+ messages in thread
From: M A Young @ 2011-01-16 20:48 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel

On Mon, 10 Jan 2011, Konrad Rzeszutek Wilk wrote:

> Your E820 looks as so:
> BIOS-e820: 0000000000000000 - 000000000009f000 (usable)
> BIOS-e820: 000000000009f000 - 00000000000a0000 (reserved)
> BIOS-e820: 0000000000100000 - 00000000df66d800 (usable)
> BIOS-e820: 00000000df66d800 - 00000000e0000000 (reserved)
> BIOS-e820: 00000000f8000000 - 00000000fc000000 (reserved)
> BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
> BIOS-e820: 00000000fed18000 - 00000000fed1c000 (reserved)
> BIOS-e820: 00000000fed20000 - 00000000fed90000 (reserved)
> BIOS-e820: 00000000feda0000 - 00000000feda6000 (reserved)
> BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
> BIOS-e820: 00000000ffe00000 - 0000000100000000 (reserved)
> BIOS-e820: 0000000100000000 - 0000000120000000 (usable)
>
> Which looks completly normal.. I am really at loss here.

I have looked at this again and I am worried by the last section, which is 
a chunk from 4GB to 4.5GB. The problem is that I only have 4GB. My tests 
show that dom0_mem=max:3574MB boots, dom0_mem=max:3575MB doesn't. The 
first two "usable" chunks add up to a few KB over 3574MB so the problems 
come when it tries to use the final "usable" chunk which I interpret as 
being beyond the memory I have.

3574MB is a bit less than 3.5GB so I would guess that the final chunk is 
trying to make up the memory to 4GB. There are also gaps in these memory 
pieces which add up to about 445MB. Hence I think there are some issues 
with the memory allocation mechanism.

 	Michael Young


* Re: Crash on boot with 2.6.37-rc8-git3
  2011-01-16 20:48       ` M A Young
@ 2011-01-16 20:56         ` Keir Fraser
  0 siblings, 0 replies; 31+ messages in thread
From: Keir Fraser @ 2011-01-16 20:56 UTC (permalink / raw)
  To: M A Young, Konrad Rzeszutek Wilk; +Cc: xen-devel

On 16/01/2011 20:48, "M A Young" <m.a.young@durham.ac.uk> wrote:

>> BIOS-e820: 0000000000000000 - 000000000009f000 (usable)
>> BIOS-e820: 000000000009f000 - 00000000000a0000 (reserved)
>> BIOS-e820: 0000000000100000 - 00000000df66d800 (usable)
>> BIOS-e820: 00000000df66d800 - 00000000e0000000 (reserved)
>> BIOS-e820: 00000000f8000000 - 00000000fc000000 (reserved)
>> BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
>> BIOS-e820: 00000000fed18000 - 00000000fed1c000 (reserved)
>> BIOS-e820: 00000000fed20000 - 00000000fed90000 (reserved)
>> BIOS-e820: 00000000feda0000 - 00000000feda6000 (reserved)
>> BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
>> BIOS-e820: 00000000ffe00000 - 0000000100000000 (reserved)
>> BIOS-e820: 0000000100000000 - 0000000120000000 (usable)
>> 
>> Which looks completly normal.. I am really at loss here.
> 
> I have looked at this again and I am worried by the last section, which is
> a chunk from 4GB to 4.5GB. The problem is that I only have 4GB. My tests
> show that dom0_mem=max:3574MB boots, dom0_mem=max:3575MB doesn't. The
> first two "usable" chunks add up to a few KB over 3574MB so the problems
> come when it tries to use the final "usable" chunk which I interpret as
> being beyond the memory I have.
> 
> 3574MB is a bit less than 3.5GB so I would guess that the final chunk is
> trying to make up the memory to 4GB. There are also gaps in these memory
> pieces which add up to about 445MB. Hence I think there are some issues
> with the memory allocation mechanism.

Device memory gets mapped just below 4GB, so the last piece of your RAM gets
re-mapped above 4GB by your BIOS, so that it can still be accessed. If you
add up the size of all the usable regions in the list above, it will sum to
a bit less than 4GB.

The bug will be something in the kernel code that can't handle physical
addresses wider than 32 bits (i.e., physical addresses 4GB and above).

 -- Keir
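
The arithmetic behind both observations above can be checked with a short
userspace sketch. It is not kernel code: the struct, the array and the
function name are made up for illustration, and only the addresses are
taken from the BIOS-e820 lines quoted in this thread. It sums the usable
bytes, optionally counting only the part of each region below a limit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an E820 entry: [start, end) plus a usable flag. */
struct e820_range {
	uint64_t start;
	uint64_t end;
	int usable;
};

/* The BIOS-e820 map quoted in this thread. */
const struct e820_range thread_map[] = {
	{ 0x0000000000000000ULL, 0x000000000009f000ULL, 1 },
	{ 0x000000000009f000ULL, 0x00000000000a0000ULL, 0 },
	{ 0x0000000000100000ULL, 0x00000000df66d800ULL, 1 },
	{ 0x00000000df66d800ULL, 0x00000000e0000000ULL, 0 },
	{ 0x00000000f8000000ULL, 0x00000000fc000000ULL, 0 },
	{ 0x00000000fec00000ULL, 0x00000000fec10000ULL, 0 },
	{ 0x00000000fed18000ULL, 0x00000000fed1c000ULL, 0 },
	{ 0x00000000fed20000ULL, 0x00000000fed90000ULL, 0 },
	{ 0x00000000feda0000ULL, 0x00000000feda6000ULL, 0 },
	{ 0x00000000fee00000ULL, 0x00000000fee10000ULL, 0 },
	{ 0x00000000ffe00000ULL, 0x0000000100000000ULL, 0 },
	{ 0x0000000100000000ULL, 0x0000000120000000ULL, 1 },
};
const size_t thread_map_len = sizeof(thread_map) / sizeof(thread_map[0]);

/*
 * Sum the usable bytes in the map, counting only the part of each
 * region that lies below 'limit'. Pass UINT64_MAX to count everything.
 */
uint64_t usable_bytes(const struct e820_range *map, size_t n, uint64_t limit)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t end = map[i].end < limit ? map[i].end : limit;

		assert(map[i].start <= map[i].end);
		if (map[i].usable && end > map[i].start)
			total += end - map[i].start;
	}
	return total;
}
```

With a limit of 4GB (1ULL << 32) the result is 3747661824 bytes, which is
3574MB plus 50KB, matching the dom0_mem threshold found above. With no
limit it is 4284532736 bytes, roughly 10MB short of 4GB.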


* Re: Crash on boot with 2.6.37-rc8-git3
  2011-01-10 18:42     ` Konrad Rzeszutek Wilk
  2011-01-10 21:43       ` M A Young
  2011-01-16 20:48       ` M A Young
@ 2011-01-18  0:52       ` M A Young
  2011-01-19 22:54         ` M A Young
  2 siblings, 1 reply; 31+ messages in thread
From: M A Young @ 2011-01-18  0:52 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel



On Mon, 10 Jan 2011, Konrad Rzeszutek Wilk wrote:

>>> Do you see any messages about 'Cannot find 20 bytes in node X' (where X
>>> I think is 0)?
>>
>> I haven't spotted any such message.
>
> Try fiddling with the dom0_mem.. to see at what point it starts failing. Is
> this happening only on this machine or do you see it on other boxes too?
>
> Your E820 looks as so:
> BIOS-e820: 0000000000000000 - 000000000009f000 (usable)
> BIOS-e820: 000000000009f000 - 00000000000a0000 (reserved)
> BIOS-e820: 0000000000100000 - 00000000df66d800 (usable)
> BIOS-e820: 00000000df66d800 - 00000000e0000000 (reserved)
> BIOS-e820: 00000000f8000000 - 00000000fc000000 (reserved)
> BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
> BIOS-e820: 00000000fed18000 - 00000000fed1c000 (reserved)
> BIOS-e820: 00000000fed20000 - 00000000fed90000 (reserved)
> BIOS-e820: 00000000feda0000 - 00000000feda6000 (reserved)
> BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
> BIOS-e820: 00000000ffe00000 - 0000000100000000 (reserved)
> BIOS-e820: 0000000100000000 - 0000000120000000 (usable)
>
> Which looks completly normal.. I am really at loss here. You could
> also sprinkle printk's around that code (or xen_raw_printk and inhibit
> the Linux kernel console output - that way you would only see the Xen
> and output from xen_raw_printk).
>
> Let me bootup 2.6.37 on a 4GB machine just to see if I am seeing this.

My next theory is that this is an alignment issue. 
The NODE DATA is put in the range 00000000df659800 to 00000000df66d7ff 
(the top end of the second "usable" chunk) and the problem comes when it 
tries to write to the final 2K piece (00000000df66d000 to 
00000000df66d800; 00000000df66d000 occurs on the stack), which hasn't been 
initialized properly because it isn't a full 4K page.
Does this sound plausible?

 	Michael Young


* Re: Crash on boot with 2.6.37-rc8-git3
  2011-01-18  0:52       ` M A Young
@ 2011-01-19 22:54         ` M A Young
  2011-01-20 19:24           ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 31+ messages in thread
From: M A Young @ 2011-01-19 22:54 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel

On Tue, 18 Jan 2011, M A Young wrote:

> My next theory is that the issue is that the system is an alignment issue. 
> The NODE DATA is put in the range 00000000df659800 to 00000000df66d7ff (the 
> top end of the second "usable" chunk) and the problem come when it tries to 
> write to the final 2K piece (00000000df66d000 to 00000000df66d800 - 
> 00000000df66d000 occurs on the stack) which hasn't been initialized properly 
> because it isn't a 4K piece.
> Does this sound plausible?

Further experiments confirm that it is this 2K piece causing the problem: 
if I reserve the 2K chunk in the same way that NODE DATA is reserved 
(though without zeroing it) the system boots; if I reduce this to 
reserving only 1K then it doesn't.

 	Michael Young


* Re: Crash on boot with 2.6.37-rc8-git3
  2011-01-19 22:54         ` M A Young
@ 2011-01-20 19:24           ` Konrad Rzeszutek Wilk
  2011-01-20 22:39             ` M A Young
  0 siblings, 1 reply; 31+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-01-20 19:24 UTC (permalink / raw)
  To: M A Young; +Cc: xen-devel

On Wed, Jan 19, 2011 at 10:54:00PM +0000, M A Young wrote:
> On Tue, 18 Jan 2011, M A Young wrote:
> 
> >My next theory is that the issue is that the system is an
> >alignment issue. The NODE DATA is put in the range
> >00000000df659800 to 00000000df66d7ff (the top end of the second
> >"usable" chunk) and the problem come when it tries to write to the
> >final 2K piece (00000000df66d000 to 00000000df66d800 -
> >00000000df66d000 occurs on the stack) which hasn't been
> >initialized properly because it isn't a 4K piece.
> >Does this sound plausible?
> 
> Further experiments confirm that it is this 2K piece causing the
> problem - if I reserve the 2K chunk in the same was that NODE DATA
> is reserved (though without zeroing it) the system boots, if I
> reduce this to reserving only 1K then it doesn't.

I think my math is off here. The reserve call is made on the
df659800 -> df66d7ff, that would be 20 pages of data. The last
PFN df66d is where it dies b/c there is no PTE entry set for it?

What happens if you fudge the code so it allocates those pages
page aligned, i.e. df65a000->df66e000? That way we skip the region
df659800->df659fff and start on a new PFN (and PTE).
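
To make the page arithmetic concrete, here is a small userspace sketch
(not kernel code; the macro and function names are invented for this
example, and 4K pages are assumed). It shows that the unaligned range
holds 20 pages' worth of bytes but touches 21 page frames, the last of
which is PFN df66d.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 4K pages; names chosen to avoid clashing with system headers. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1ULL << SKETCH_PAGE_SHIFT)

/* Page frame number containing the physical address 'addr'. */
uint64_t addr_to_pfn(uint64_t addr)
{
	return addr >> SKETCH_PAGE_SHIFT;
}

/* Number of distinct page frames touched by the inclusive range [first, last]. */
uint64_t pfns_touched(uint64_t first, uint64_t last)
{
	assert(first <= last);
	return addr_to_pfn(last) - addr_to_pfn(first) + 1;
}

/* Round 'addr' up to the next page boundary (unchanged if already aligned). */
uint64_t page_align_up(uint64_t addr)
{
	return (addr + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}
```

For the range in question, pfns_touched(0xdf659800, 0xdf66d7ff) is 21
even though the range is 0x14000 bytes (20 pages) long, and
page_align_up(0xdf659800) gives df65a000, the aligned start suggested
above.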


* Re: Crash on boot with 2.6.37-rc8-git3
  2011-01-20 19:24           ` Konrad Rzeszutek Wilk
@ 2011-01-20 22:39             ` M A Young
  2011-01-21 15:27               ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 31+ messages in thread
From: M A Young @ 2011-01-20 22:39 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel

On Thu, 20 Jan 2011, Konrad Rzeszutek Wilk wrote:

> I think my math is off here. The reserve call is made on the
> df659800 -> df66d7ff, that would be 20 pages of data. The last
> PFN df66d is where it dies b/c there is no PTE entry set for it?
>
> What happens if you fudge the code so it allocates those pages to be
> page aligned. So df65a000->df66e000 ? We skip this way the region
> df659800->df659fff and start on a new PFN (and pte).

I get (though the photo isn't clear in places) df659000->df66cfff and it 
crashes at find_range_array+0x4d/0x56 which traces back to the 
call of memblock_find_dma_reserve from setup_arch in 
arch/x86/kernel/setup.c . So it still crashes, but at a slightly later 
stage.

 	Michael Young


* Re: Crash on boot with 2.6.37-rc8-git3
  2011-01-20 22:39             ` M A Young
@ 2011-01-21 15:27               ` Konrad Rzeszutek Wilk
  2011-01-21 21:43                 ` M A Young
  0 siblings, 1 reply; 31+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-01-21 15:27 UTC (permalink / raw)
  To: M A Young; +Cc: xen-devel

On Thu, Jan 20, 2011 at 10:39:17PM +0000, M A Young wrote:
> On Thu, 20 Jan 2011, Konrad Rzeszutek Wilk wrote:
> 
> >I think my math is off here. The reserve call is made on the
> >df659800 -> df66d7ff, that would be 20 pages of data. The last
> >PFN df66d is where it dies b/c there is no PTE entry set for it?
> >
> >What happens if you fudge the code so it allocates those pages to be
> >page aligned. So df65a000->df66e000 ? We skip this way the region
> >df659800->df659fff and start on a new PFN (and pte).
> 
> I get (though the photo isn't clear in places) df659000->df66cfff
> and it crashes at find_range_array+0x4d/0x56 which traces back to
> the call of memblock_find_dma_reserve from setup_arch in
> arch/x86/kernel/setup.c . So it still crashes, but at a slightly
> later stage.

OK, so we just pass the buck, so to speak, to the next user of that PFN.

We should find out why that PTE is not being set up. I think
this might be a missing entry in the MFN (thanks to Stefan Bader
for finding a bug there).  Looking at your E820:

[    0.000000]  Xen: 0000000000100000 - 000000003b0e2000 (usable)

Your memory ends at 3b0e, which is not on a nice page boundary.
Can you try this patch (you will need to re-jigger it, as in 2.6.38-rc1
the p2m code moved out of xen/mmu.c to xen/p2m.c):

https://patchwork.kernel.org/patch/492011/

BTW, you are doing great detective work here. Thanks for
being willing to dig into this.


* Re: Crash on boot with 2.6.37-rc8-git3
  2011-01-21 15:27               ` Konrad Rzeszutek Wilk
@ 2011-01-21 21:43                 ` M A Young
  2011-01-24 14:14                   ` Konrad Rzeszutek Wilk
  2011-01-24 19:04                   ` Stefano Stabellini
  0 siblings, 2 replies; 31+ messages in thread
From: M A Young @ 2011-01-21 21:43 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel

On Fri, 21 Jan 2011, Konrad Rzeszutek Wilk wrote:

> We should find out why that PTE is not being setup.... And I think
> this might be a missing entry in the MFN (thanks to Stefan Bader
> finding a bug there).  Looking at your E820:
>
> [    0.000000]  Xen: 0000000000100000 - 000000003b0e2000 (usable)

Mine is
[    0.000000]  Xen: 0000000000100000 - 00000000df66d800 (usable)

> Your memory ends a 3b0e, which is not on a nice page boundary.

Mine isn't on a page boundary at all!

> Can you try this patch (you will need to re-gigger as in 2.6.38-rc1
> the p2m code moved out of xen/mmu.c to xen/p2m.c):

It doesn't help, and crashes at the same place as the unaltered kernel. My 
problem may not be happening in the xen code at all. From the boot logs of 
one of my hack attempts that actually booted I have

[    0.000000] BIOS-provided physical RAM map:
[    0.000000]  Xen: 0000000000000000 - 000000000009f000 (usable)
[    0.000000]  Xen: 000000000009f000 - 0000000000100000 (reserved)
[    0.000000]  Xen: 0000000000100000 - 00000000df66d800 (usable)
[    0.000000]  Xen: 00000000df66d800 - 00000000e0000000 (reserved)
[    0.000000]  Xen: 00000000f8000000 - 00000000fc000000 (reserved)
[    0.000000]  Xen: 00000000fec00000 - 00000000fec10000 (reserved)
[    0.000000]  Xen: 00000000fed18000 - 00000000fed1c000 (reserved)
[    0.000000]  Xen: 00000000fed20000 - 00000000fed90000 (reserved)
[    0.000000]  Xen: 00000000feda0000 - 00000000feda6000 (reserved)
[    0.000000]  Xen: 00000000fee00000 - 00000000fee10000 (reserved)
[    0.000000]  Xen: 00000000ffe00000 - 0000000100000000 (reserved)
[    0.000000]  Xen: 0000000100000000 - 00000001342cb000 (usable)
[    0.000000] NX (Execute Disable) protection: active
[    0.000000] DMI 2.4 present.
[    0.000000] No AGP bridge found
[    0.000000] last_pfn = 0x1342cb max_arch_pfn = 0x400000000
[    0.000000] last_pfn = 0xdf66d max_arch_pfn = 0x400000000
[    0.000000] init_memory_mapping: 0000000000000000-00000000df66d000
[    0.000000] init_memory_mapping: 0000000100000000-00000001342cb000

The last_pfn figure above is actually one more than the last pfn that is 
initialized and is obtained by right-shifting the start memory address 
plus the length of the memory piece. That is fine if the memory ends on a 
page boundary, but not if it doesn't because the partial page doesn't get 
a pfn. Thus it is available for early allocations such as the NODE DATA 
chunk. Xen goes for the memory chunk just below the 4GB mark and hits this 
region, bare metal (2.6.35) starts the NODE DATA at the 4GB mark and 
doesn't.

I am not sure if bare metal is clever enough not to try to use this 
partial page, or whether it could but misses it because of how it places 
the NODE_DATA (at the bottom end of a memory region rather than the top 
end).

 	Michael Young
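
The last_pfn calculation described above can be illustrated with a short
userspace sketch (not the kernel's own code; the names are invented for
this example and 4K pages are assumed). Shifting the end address right
by the page shift rounds down, so a trailing partial page never gets a
PFN.

```c
#include <stdint.h>

/* Assumed 4K pages; names chosen to avoid clashing with system headers. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1ULL << SKETCH_PAGE_SHIFT)

/*
 * "One past the last PFN" for a region, computed the way the message
 * above describes: right-shift (start + size). A partial page at the
 * end is silently dropped by the shift.
 */
uint64_t end_pfn_round_down(uint64_t start, uint64_t size)
{
	return (start + size) >> SKETCH_PAGE_SHIFT;
}

/* Bytes at the end of the region that fall in a partial page and get no PFN. */
uint64_t trailing_partial_bytes(uint64_t start, uint64_t size)
{
	return (start + size) & (SKETCH_PAGE_SIZE - 1);
}
```

For the region 0000000000100000 - 00000000df66d800 this gives an end
PFN of 0xdf66d, matching the last_pfn line in the log, and 2048 trailing
bytes that no PFN covers. For 0000000100000000 - 00000001342cb000 it
gives 0x1342cb with nothing left over.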


* Re: Crash on boot with 2.6.37-rc8-git3
  2011-01-21 21:43                 ` M A Young
@ 2011-01-24 14:14                   ` Konrad Rzeszutek Wilk
  2011-01-24 23:12                     ` M A Young
  2011-01-24 19:04                   ` Stefano Stabellini
  1 sibling, 1 reply; 31+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-01-24 14:14 UTC (permalink / raw)
  To: M A Young; +Cc: xen-devel

On Fri, Jan 21, 2011 at 09:43:34PM +0000, M A Young wrote:
> On Fri, 21 Jan 2011, Konrad Rzeszutek Wilk wrote:
> 
> >We should find out why that PTE is not being setup.... And I think
> >this might be a missing entry in the MFN (thanks to Stefan Bader
> >finding a bug there).  Looking at your E820:
> >
> >[    0.000000]  Xen: 0000000000100000 - 000000003b0e2000 (usable)
> 
> Mine is
> [    0.000000]  Xen: 0000000000100000 - 00000000df66d800 (usable)
> 
> >Your memory ends a 3b0e, which is not on a nice page boundary.
> 
> Mine isn't on a page boundary at all!

Whoa.
> 
> >Can you try this patch (you will need to re-gigger as in 2.6.38-rc1
> >the p2m code moved out of xen/mmu.c to xen/p2m.c):
> 
> It doesn't help, and crashes at the same place as the unaltered
> kernel. My problem may not be happening in the xen code at all. From
> the boot logs of one of my hack attempts that actually booted I have
> 
> [    0.000000] BIOS-provided physical RAM map:
> [    0.000000]  Xen: 0000000000000000 - 000000000009f000 (usable)
> [    0.000000]  Xen: 000000000009f000 - 0000000000100000 (reserved)
> [    0.000000]  Xen: 0000000000100000 - 00000000df66d800 (usable)
> [    0.000000]  Xen: 00000000df66d800 - 00000000e0000000 (reserved)
> [    0.000000]  Xen: 00000000f8000000 - 00000000fc000000 (reserved)
> [    0.000000]  Xen: 00000000fec00000 - 00000000fec10000 (reserved)
> [    0.000000]  Xen: 00000000fed18000 - 00000000fed1c000 (reserved)
> [    0.000000]  Xen: 00000000fed20000 - 00000000fed90000 (reserved)
> [    0.000000]  Xen: 00000000feda0000 - 00000000feda6000 (reserved)
> [    0.000000]  Xen: 00000000fee00000 - 00000000fee10000 (reserved)
> [    0.000000]  Xen: 00000000ffe00000 - 0000000100000000 (reserved)
> [    0.000000]  Xen: 0000000100000000 - 00000001342cb000 (usable)
> [    0.000000] NX (Execute Disable) protection: active
> [    0.000000] DMI 2.4 present.
> [    0.000000] No AGP bridge found
> [    0.000000] last_pfn = 0x1342cb max_arch_pfn = 0x400000000
> [    0.000000] last_pfn = 0xdf66d max_arch_pfn = 0x400000000
> [    0.000000] init_memory_mapping: 0000000000000000-00000000df66d000
> [    0.000000] init_memory_mapping: 0000000100000000-00000001342cb000
> 
> The last_pfn figure above is actually one more than the last pfn
> that is initialized and is obtained by right-shifting the start
> memory address plus the length of the memory piece. That is fine if
> the memory ends on a page boundary, but not if it doesn't because
> the partial page doesn't get a pfn. Thus it is available for early

We can fix how the E820 is done.
Look in arch/x86/xen/setup.c for the 'xen_memory_setup' function.
Try setting map[i].size = map[i].size & ~(PAGE_SIZE-1);
that should trim off the last 2048 bytes.

> allocations such as the NODE DATA chunk. Xen goes for the memory
> chunk just below the 4GB mark and hits this region, bare metal
> (2.6.35) starts the NODE DATA at the 4GB mark and doesn't.

That should be generic and hit both cases, but I think this got
fixed in 2.6.36-ish, where going for the region right underneath
4GB is no longer done (I don't remember the details, sadly).

> 
> I am not sure if bare metal is clever enough not to try to use this
> partial page, or whether it could but misses it because of how it
> places the NODE_DATA (at the bottom end of a memory region rather
> than the top end).

If you leave the instrumentation you placed in and add 'memblock=debug'
that should give you a good idea of how it does it?
> 
> 	Michael Young
> 


* Re: Crash on boot with 2.6.37-rc8-git3
  2011-01-21 21:43                 ` M A Young
  2011-01-24 14:14                   ` Konrad Rzeszutek Wilk
@ 2011-01-24 19:04                   ` Stefano Stabellini
  2011-01-25  0:22                     ` M A Young
  1 sibling, 1 reply; 31+ messages in thread
From: Stefano Stabellini @ 2011-01-24 19:04 UTC (permalink / raw)
  To: M A Young; +Cc: xen-devel, Konrad Rzeszutek Wilk

I have a work-in-progress patch that fixes a booting issue on one of my
testboxes. Could you please give it a try, passing dom0_mem=700M to the
Xen command line?



diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 947f42a..ebc0221 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -291,10 +291,23 @@ unsigned long __init_refok init_memory_mapping(unsigned long start,
 		 * located on different 2M pages. cleanup_highmap(), however,
 		 * can only consider _end when it runs, so destroy any
 		 * mappings beyond _brk_end here.
+		 * Be careful not to go over _end.
 		 */
 		pud = pud_offset(pgd_offset_k(_brk_end), _brk_end);
 		pmd = pmd_offset(pud, _brk_end - 1);
-		while (++pmd <= pmd_offset(pud, (unsigned long)_end - 1))
+		while (++pmd < pmd_offset(pud, (unsigned long)_end - 1))
+			pmd_clear(pmd);
+		if (((unsigned long)_end) & ~PMD_MASK) {
+			pte_t *pte;
+			unsigned long addr;
+			for (addr = ((unsigned long)_end) & PMD_MASK;
+					addr < ((unsigned long)_end);
+					addr += PAGE_SIZE) {
+				pte = pte_offset_map(pmd, addr);
+				pte_clear(&init_mm, addr, pte);
+				pte_unmap(pte);
+			}
+		} else
 			pmd_clear(pmd);
 	}
 #endif


* Re: Crash on boot with 2.6.37-rc8-git3
  2011-01-24 14:14                   ` Konrad Rzeszutek Wilk
@ 2011-01-24 23:12                     ` M A Young
  2011-01-25 12:03                       ` Stefano Stabellini
  0 siblings, 1 reply; 31+ messages in thread
From: M A Young @ 2011-01-24 23:12 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 377 bytes --]

On Mon, 24 Jan 2011, Konrad Rzeszutek Wilk wrote:

> We can fix how the E820 is done.
> Look in arch/x86/xen/setup.c for 'xen_memory_setup' function.
> Try to wrap make map[i].size be = map[i].szie & ~(PAGE_SIZE-1)
> that should trim off the last 2048 bytes.

The attached patch works for me, though it does assume the memory region 
starts on a page boundary.

 	Michael Young

[-- Attachment #2: Type: TEXT/PLAIN, Size: 549 bytes --]

--- a/arch/x86/xen/setup.c	2011-01-05 00:50:19.000000000 +0000
+++ b/arch/x86/xen/setup.c	2011-01-24 20:29:23.000000000 +0000
@@ -179,7 +179,10 @@
 	e820.nr_map = 0;
 	xen_extra_mem_start = mem_end;
 	for (i = 0; i < memmap.nr_entries; i++) {
-		unsigned long long end = map[i].addr + map[i].size;
+		unsigned long long end;
+		if (map[i].type == E820_RAM)
+			map[i].size &= ~(PAGE_SIZE-1);
+		end = map[i].addr + map[i].size;
 
 		if (map[i].type == E820_RAM && end > mem_end) {
 			/* RAM off the end - may be partially included */



* Re: Crash on boot with 2.6.37-rc8-git3
  2011-01-24 19:04                   ` Stefano Stabellini
@ 2011-01-25  0:22                     ` M A Young
  0 siblings, 0 replies; 31+ messages in thread
From: M A Young @ 2011-01-25  0:22 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Konrad Rzeszutek Wilk

On Mon, 24 Jan 2011, Stefano Stabellini wrote:

> I have a work-in-progress patch that fixes a booting issue on one of my
> testboxes. Could you please give it a try, passing dom0_mem=700M to the
> Xen command line?

It wouldn't prove anything in my case as booting with dom0_mem=700M works.

 	Michael Young

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Crash on boot with 2.6.37-rc8-git3
  2011-01-24 23:12                     ` M A Young
@ 2011-01-25 12:03                       ` Stefano Stabellini
  2011-01-25 13:24                         ` Ian Campbell
  0 siblings, 1 reply; 31+ messages in thread
From: Stefano Stabellini @ 2011-01-25 12:03 UTC (permalink / raw)
  To: M A Young; +Cc: xen-devel, Konrad Rzeszutek Wilk

On Mon, 24 Jan 2011, M A Young wrote:
> On Mon, 24 Jan 2011, Konrad Rzeszutek Wilk wrote:
> 
> > We can fix how the E820 is done.
> > Look in arch/x86/xen/setup.c for 'xen_memory_setup' function.
> > Try to make map[i].size be map[i].size & ~(PAGE_SIZE-1);
> > that should trim off the last 2048 bytes.
> 
> The attached patch works for me, though it does assume the memory region 
> starts on a page boundary.

It turns out that it is me having the same issue you have and not the
other way around :)

Your patch (in addition to my previous patch) makes my testbox boot, no
matter what dom0_mem parameter I choose.

Appended is a version of the patch that doesn't assume that the memory
region starts on a page boundary.

---

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index b5a7f92..a3d28a1 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -179,7 +179,10 @@ char * __init xen_memory_setup(void)
 	e820.nr_map = 0;
 	xen_extra_mem_start = mem_end;
 	for (i = 0; i < memmap.nr_entries; i++) {
-		unsigned long long end = map[i].addr + map[i].size;
+		unsigned long long end;
+		if (map[i].type == E820_RAM)
+			map[i].size -= (map[i].size + map[i].addr) % PAGE_SIZE;
+		end = map[i].addr + map[i].size;
 
 		if (map[i].type == E820_RAM && end > mem_end) {
 			/* RAM off the end - may be partially included */

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: Crash on boot with 2.6.37-rc8-git3
  2011-01-25 12:03                       ` Stefano Stabellini
@ 2011-01-25 13:24                         ` Ian Campbell
  2011-01-25 13:31                           ` Stefano Stabellini
  0 siblings, 1 reply; 31+ messages in thread
From: Ian Campbell @ 2011-01-25 13:24 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Konrad Rzeszutek Wilk, M A Young

On Tue, 2011-01-25 at 12:03 +0000, Stefano Stabellini wrote:
> On Mon, 24 Jan 2011, M A Young wrote:
> > On Mon, 24 Jan 2011, Konrad Rzeszutek Wilk wrote:
> > 
> > > We can fix how the E820 is done.
> > > Look in arch/x86/xen/setup.c for 'xen_memory_setup' function.
> > > Try to make map[i].size be map[i].size & ~(PAGE_SIZE-1);
> > > that should trim off the last 2048 bytes.
> > 
> > The attached patch works for me, though it does assume the memory region 
> > starts on a page boundary.
> 
> It turns out that it is me having the same issue you have and not the
> other way around :)
> 
> Your patch (in addition to my previous patch) makes my testbox boot, no
> matter what dom0_mem parameter I choose.
> 
> Appended is a version of the patch that doesn't assume that the memory
> region starts on a page boundary.
> 
> ---
> 
> diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
> index b5a7f92..a3d28a1 100644
> --- a/arch/x86/xen/setup.c
> +++ b/arch/x86/xen/setup.c
> @@ -179,7 +179,10 @@ char * __init xen_memory_setup(void)
>  	e820.nr_map = 0;
>  	xen_extra_mem_start = mem_end;
>  	for (i = 0; i < memmap.nr_entries; i++) {
> -		unsigned long long end = map[i].addr + map[i].size;
> +		unsigned long long end;
> +		if (map[i].type == E820_RAM)
> +			map[i].size -= (map[i].size + map[i].addr) % PAGE_SIZE;

The more normal idiom to round down to a page boundary in the kernel is:
	map[i].size &= ~(PAGE_SIZE-1);

Do you also need to page align map[i].addr upwards for maximum safety?

Ian.

> +		end = map[i].addr + map[i].size;

>  
>  		if (map[i].type == E820_RAM && end > mem_end) {
>  			/* RAM off the end - may be partially included */

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Crash on boot with 2.6.37-rc8-git3
  2011-01-25 13:24                         ` Ian Campbell
@ 2011-01-25 13:31                           ` Stefano Stabellini
  2011-01-25 13:45                             ` Ian Campbell
  0 siblings, 1 reply; 31+ messages in thread
From: Stefano Stabellini @ 2011-01-25 13:31 UTC (permalink / raw)
  To: Ian Campbell
  Cc: M A Young, xen-devel, Konrad Rzeszutek Wilk, Stefano Stabellini

On Tue, 25 Jan 2011, Ian Campbell wrote:
> > It turns out that it is me having the same issue you have and not the
> > other way around :)
> > 
> > Your patch (in addition to my previous patch) makes my testbox boot, no
> > matter what dom0_mem parameter I choose.
> > 
> > Appended is a version of the patch that doesn't assume that the memory
> > region starts on a page boundary.
> > 
> > ---
> > 
> > diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
> > index b5a7f92..a3d28a1 100644
> > --- a/arch/x86/xen/setup.c
> > +++ b/arch/x86/xen/setup.c
> > @@ -179,7 +179,10 @@ char * __init xen_memory_setup(void)
> >  	e820.nr_map = 0;
> >  	xen_extra_mem_start = mem_end;
> >  	for (i = 0; i < memmap.nr_entries; i++) {
> > -		unsigned long long end = map[i].addr + map[i].size;
> > +		unsigned long long end;
> > +		if (map[i].type == E820_RAM)
> > +			map[i].size -= (map[i].size + map[i].addr) % PAGE_SIZE;
> 
> The more normal idiom to round down to a page boundary in the kernel is:
> 	map[i].size &= ~(PAGE_SIZE-1);
> 
> Do you also need to page align map[i].addr upwards for maximum safety?
> 

Unless I am very confused,

map[i].size -= (map[i].size + map[i].addr) % PAGE_SIZE

is not the same as

map[i].size &= ~(PAGE_SIZE-1)

because it also takes into account the possibility that map[i].addr is
not page aligned. It doesn't move map[i].addr upward but still makes sure that
the region ends at a page boundary anyway.
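
To make the difference concrete, here is a minimal standalone sketch
(userspace C, not kernel code) of the two idioms. The helper names
trim_size_mask and trim_size_end are made up for illustration, and
PAGE_SIZE is assumed to be 4096. With an aligned addr both give the same
size; with an unaligned addr only the second leaves addr + size on a page
boundary.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096ULL	/* assumption: 4 KiB pages, as on x86 */

/* Mask idiom: rounds the size itself down to a multiple of PAGE_SIZE. */
static uint64_t trim_size_mask(uint64_t addr, uint64_t size)
{
	(void)addr;	/* the start address plays no part in this idiom */
	return size & ~(PAGE_SIZE - 1);
}

/* End-based idiom: shrinks size so that addr + size is page aligned. */
static uint64_t trim_size_end(uint64_t addr, uint64_t size)
{
	uint64_t rem = (addr + size) % PAGE_SIZE;

	/* A region that ends before its first page boundary would underflow. */
	assert(size >= rem);
	return size - rem;
}
```

For example, with addr = 0x800 and size = 0x9f400 the mask idiom gives a
size of 0x9f000 (end at 0x9f800, still unaligned), while the end-based
idiom gives 0x9e800 (end at 0x9f000).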

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Crash on boot with 2.6.37-rc8-git3
  2011-01-25 13:31                           ` Stefano Stabellini
@ 2011-01-25 13:45                             ` Ian Campbell
  2011-01-25 15:19                               ` Stefano Stabellini
  0 siblings, 1 reply; 31+ messages in thread
From: Ian Campbell @ 2011-01-25 13:45 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Konrad Rzeszutek Wilk, M A Young

On Tue, 2011-01-25 at 13:31 +0000, Stefano Stabellini wrote:
> On Tue, 25 Jan 2011, Ian Campbell wrote:
> > > It turns out that it is me having the same issue you have and not the
> > > other way around :)
> > > 
> > > Your patch (in addition to my previous patch) makes my testbox boot, no
> > > matter what dom0_mem parameter I choose.
> > > 
> > > Appended is a version of the patch that doesn't assume that the memory
> > > region starts on a page boundary.
> > > 
> > > ---
> > > 
> > > diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
> > > index b5a7f92..a3d28a1 100644
> > > --- a/arch/x86/xen/setup.c
> > > +++ b/arch/x86/xen/setup.c
> > > @@ -179,7 +179,10 @@ char * __init xen_memory_setup(void)
> > >  	e820.nr_map = 0;
> > >  	xen_extra_mem_start = mem_end;
> > >  	for (i = 0; i < memmap.nr_entries; i++) {
> > > -		unsigned long long end = map[i].addr + map[i].size;
> > > +		unsigned long long end;
> > > +		if (map[i].type == E820_RAM)
> > > +			map[i].size -= (map[i].size + map[i].addr) % PAGE_SIZE;
> > 
> > The more normal idiom to round down to a page boundary in the kernel is:
> > 	map[i].size &= ~(PAGE_SIZE-1);
> > 
> > Do you also need to page align map[i].addr upwards for maximum safety?
> > 
> 
> unless I am very confused
> 
> map[i].size -= (map[i].size + map[i].addr) % PAGE_SIZE
> 
> is not the same as:
> 
> > map[i].size &= ~(PAGE_SIZE-1)
> 
> because it also takes into account the possibility that map[i].addr is
> not page aligned.

Oh yes, I didn't notice that aspect of it.

>  It doesn't move map[i].addr upward but still makes sure that
> the region ends at a page boundary anyway.

Which returns to my second question ;-) Why do we not need to align addr
too?

Ian.
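
For reference, aligning both ends could look roughly like the sketch
below. This is an illustration only, not something proposed in this
thread: struct region and page_align_region are hypothetical names,
PAGE_SIZE is assumed to be 4096, and address overflow is ignored. The
start is rounded up, the end is rounded down, and a region that contains
no whole page collapses to size 0.

```c
#include <stdint.h>

#define PAGE_SIZE 4096ULL	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for one memory map entry. */
struct region {
	uint64_t addr;
	uint64_t size;
};

/* Shrink a region so that both its start and its end are page aligned. */
static struct region page_align_region(struct region r)
{
	uint64_t start = (r.addr + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);	/* up */
	uint64_t end = (r.addr + r.size) & ~(PAGE_SIZE - 1);		/* down */
	struct region out = { start, 0 };

	if (end > start)
		out.size = end - start;
	return out;
}
```

With addr = 0x800 and size = 0x9f400 this yields addr = 0x1000 and
size = 0x9e000, so the partial pages at both ends are dropped.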

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Crash on boot with 2.6.37-rc8-git3
  2011-01-25 13:45                             ` Ian Campbell
@ 2011-01-25 15:19                               ` Stefano Stabellini
  2011-01-25 15:52                                 ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 31+ messages in thread
From: Stefano Stabellini @ 2011-01-25 15:19 UTC (permalink / raw)
  To: Ian Campbell
  Cc: M A Young, xen-devel, Konrad Rzeszutek Wilk, Stefano Stabellini

On Tue, 25 Jan 2011, Ian Campbell wrote:
> > unless I am very confused
> > 
> > map[i].size -= (map[i].size + map[i].addr) % PAGE_SIZE
> > 
> > is not the same as:
> > 
> > > map[i].size &= ~(PAGE_SIZE-1)
> > 
> > because it also takes into account the possibility that map[i].addr is
> > not page aligned.
> 
> Oh yes, I didn't notice that aspect of it.
> 
> >  It doesn't move map[i].addr upward but still makes sure that
> > the region ends at a page boundary anyway.
> 
> Which returns to my second question ;-) Why do we not need to align addr
> too?

My machine can boot fine with a map[i].addr not page aligned.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Crash on boot with 2.6.37-rc8-git3
  2011-01-25 15:19                               ` Stefano Stabellini
@ 2011-01-25 15:52                                 ` Konrad Rzeszutek Wilk
  2011-01-25 15:56                                   ` Stefano Stabellini
  0 siblings, 1 reply; 31+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-01-25 15:52 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: Ian Campbell, xen-devel, M A Young

On Tue, Jan 25, 2011 at 03:19:22PM +0000, Stefano Stabellini wrote:
> On Tue, 25 Jan 2011, Ian Campbell wrote:
> > > unless I am very confused
> > > 
> > > map[i].size -= (map[i].size + map[i].addr) % PAGE_SIZE
> > > 
> > > is not the same as:
> > > 
> > > > map[i].size &= ~(PAGE_SIZE-1)
> > > 
> > > because it also takes into account the possibility that map[i].addr is
> > > not page aligned.
> > 
> > Oh yes, I didn't notice that aspect of it.
> > 
> > >  It doesn't move map[i].addr upward but still makes sure that
> > > the region ends at a page boundary anyway.
> > 
> > Which returns to my second question ;-) Why do we not need to align addr
> > too?
> 
> My machine can boot fine with a map[i].addr not page aligned.

OK, so then the patch that M A Young came up with ought to do it?

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Crash on boot with 2.6.37-rc8-git3
  2011-01-25 15:52                                 ` Konrad Rzeszutek Wilk
@ 2011-01-25 15:56                                   ` Stefano Stabellini
  2011-01-25 16:05                                     ` M A Young
  0 siblings, 1 reply; 31+ messages in thread
From: Stefano Stabellini @ 2011-01-25 15:56 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Ian Campbell, M A Young, xen-devel, Stefano Stabellini

On Tue, 25 Jan 2011, Konrad Rzeszutek Wilk wrote:
> On Tue, Jan 25, 2011 at 03:19:22PM +0000, Stefano Stabellini wrote:
> > On Tue, 25 Jan 2011, Ian Campbell wrote:
> > > > unless I am very confused
> > > > 
> > > > map[i].size -= (map[i].size + map[i].addr) % PAGE_SIZE
> > > > 
> > > > is not the same as:
> > > > 
> > > > > map[i].size &= ~(PAGE_SIZE-1)
> > > > 
> > > > because it also takes into account the possibility that map[i].addr is
> > > > not page aligned.
> > > 
> > > Oh yes, I didn't notice that aspect of it.
> > > 
> > > >  It doesn't move map[i].addr upward but still makes sure that
> > > > the region ends at a page boundary anyway.
> > > 
> > > Which returns to my second question ;-) Why do we not need to align addr
> > > too?
> > 
> > My machine can boot fine with a map[i].addr not page aligned.
> 
> OK, so then the patch that M A Young came up with ought to do it?
> 

I think you need the slightly improved version I posted before, which can
handle map[i].addr not being page aligned (I silently added an s-o-b for
Young; I hope he's OK with this).

---


commit b84683ad1e704c2a296d08ff0cbe29db936f94a7
Author: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Date:   Tue Jan 25 12:03:42 2011 +0000

    xen: make sure the e820 memory regions end at page boundary
    
    Signed-off-by: M A Young <m.a.young@durham.ac.uk>
    Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index b5a7f92..a3d28a1 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -179,7 +179,10 @@ char * __init xen_memory_setup(void)
 	e820.nr_map = 0;
 	xen_extra_mem_start = mem_end;
 	for (i = 0; i < memmap.nr_entries; i++) {
-		unsigned long long end = map[i].addr + map[i].size;
+		unsigned long long end;
+		if (map[i].type == E820_RAM)
+			map[i].size -= (map[i].size + map[i].addr) % PAGE_SIZE;
+		end = map[i].addr + map[i].size;
 
 		if (map[i].type == E820_RAM && end > mem_end) {
 			/* RAM off the end - may be partially included */

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: Crash on boot with 2.6.37-rc8-git3
  2011-01-25 15:56                                   ` Stefano Stabellini
@ 2011-01-25 16:05                                     ` M A Young
  0 siblings, 0 replies; 31+ messages in thread
From: M A Young @ 2011-01-25 16:05 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: Ian Campbell, xen-devel, Konrad Rzeszutek Wilk

On Tue, 25 Jan 2011, Stefano Stabellini wrote:

> I think you need the slightly improved version I posted before, which can
> handle map[i].addr not being page aligned (I silently added an s-o-b for
> Young; I hope he's OK with this).

Yes and yes. My version doesn't work if map[i].addr is not page aligned. 
The aim is to make sure the end address is page aligned, and avoid ending 
with a partial page which won't have a PFN and might also require 
different treatment if there is reserved content in the rest of the page 
(which is true in my case).
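
As a rough illustration of the partial-page problem, the sketch below
counts the whole page frames in a region and the leftover bytes at its
end. The helper names are hypothetical and 4 KiB pages are assumed; the
rounding is the same kind the kernel's PFN_UP()/PFN_DOWN() macros do. The
numbers in the note after the code come from the Xen-e820 map in the kvm
boot log posted at the start of this thread, not from the machine
discussed here.

```c
#include <stdint.h>

#define PAGE_SHIFT 12			/* assumption: 4 KiB pages */
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)

/* Number of whole page frames inside [addr, addr + size). */
static uint64_t whole_pages(uint64_t addr, uint64_t size)
{
	uint64_t first = (addr + PAGE_SIZE - 1) >> PAGE_SHIFT;	/* round up */
	uint64_t last = (addr + size) >> PAGE_SHIFT;		/* round down */

	return last > first ? last - first : 0;
}

/* Bytes past the last page boundary, i.e. the partial page at the end. */
static uint64_t tail_bytes(uint64_t addr, uint64_t size)
{
	return (addr + size) & (PAGE_SIZE - 1);
}
```

For the first usable entry in that log, 0x0 - 0x9f400, this gives 0x9f
whole pages and a 0x400-byte tail; the rest of that page
(0x9f400 - 0xa0000) is marked reserved in the same map.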

 	Michael Young

> commit b84683ad1e704c2a296d08ff0cbe29db936f94a7
> Author: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> Date:   Tue Jan 25 12:03:42 2011 +0000
>
>    xen: make sure the e820 memory regions end at page boundary
>
>    Signed-off-by: M A Young <m.a.young@durham.ac.uk>
>    Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
>
> diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
> index b5a7f92..a3d28a1 100644
> --- a/arch/x86/xen/setup.c
> +++ b/arch/x86/xen/setup.c
> @@ -179,7 +179,10 @@ char * __init xen_memory_setup(void)
> 	e820.nr_map = 0;
> 	xen_extra_mem_start = mem_end;
> 	for (i = 0; i < memmap.nr_entries; i++) {
> -		unsigned long long end = map[i].addr + map[i].size;
> +		unsigned long long end;
> +		if (map[i].type == E820_RAM)
> +			map[i].size -= (map[i].size + map[i].addr) % PAGE_SIZE;
> +		end = map[i].addr + map[i].size;
>
> 		if (map[i].type == E820_RAM && end > mem_end) {
> 			/* RAM off the end - may be partially included */
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Crash on boot with 2.6.37-rc8-git3
  2011-01-07  0:37       ` M A Young
@ 2011-01-07 19:18         ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 31+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-01-07 19:18 UTC (permalink / raw)
  To: M A Young; +Cc: xen-devel

On Fri, Jan 07, 2011 at 12:37:36AM +0000, M A Young wrote:
> On Thu, 6 Jan 2011, Konrad Rzeszutek Wilk wrote:
> 
> >Ok, I think we need a serial output. I don't remember if you said that
> >your docking station has a serial port or not.
> 
> I don't have any good way of getting a serial port on this computer.
> I have however managed to get output on the screen and have a poor
> quality photo. The relevant lines look like
> BUG unable to handle kernel NULL pointer dereference at
> IP: [<ffffffff81b69b92>] setup_node_bootmem+0x16b/0x199

Hmmm, I did see something similar to this in 2.6.37-rc1, but we fixed
that quickly. It was triggered by having 4GB of memory or so and
the work-around was to use dom0_mem=max:2GB.

Can you send the photo? Maybe the caller stack will shed some light.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Crash on boot with 2.6.37-rc8-git3
  2011-01-06 14:56     ` Konrad Rzeszutek Wilk
@ 2011-01-07  0:37       ` M A Young
  2011-01-07 19:18         ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 31+ messages in thread
From: M A Young @ 2011-01-07  0:37 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel

On Thu, 6 Jan 2011, Konrad Rzeszutek Wilk wrote:

> Ok, I think we need a serial output. I don't remember if you said that
> your docking station has a serial port or not.

I don't have any good way of getting a serial port on this computer. I 
have however managed to get output on the screen and have a poor quality 
photo. The relevant lines look like
BUG unable to handle kernel NULL pointer dereference at
IP: [<ffffffff81b69b92>] setup_node_bootmem+0x16b/0x199

 	Michael Young

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Crash on boot with 2.6.37-rc8-git3
  2011-01-05 23:11   ` M A Young
@ 2011-01-06 14:56     ` Konrad Rzeszutek Wilk
  2011-01-07  0:37       ` M A Young
  0 siblings, 1 reply; 31+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-01-06 14:56 UTC (permalink / raw)
  To: M A Young; +Cc: xen-devel

On Wed, Jan 05, 2011 at 11:11:03PM +0000, M A Young wrote:
> On Wed, 5 Jan 2011, Konrad Rzeszutek Wilk wrote:
> 
> >Ahh, I hit this. Can you try 'stable/bug-fixes' branch of mine?
> >It has "xen/irq: Don't fall over when nr_irqs_gsi > nr_irqs." patch
> >which will fix the below problem you are seeing.
> >
> >But I am not sure if it fixes the problem you are having with hardware?
> 
> That fixes the kvm boot, but unfortunately it doesn't fix booting
> directly on the hardware. Incidentally it is definitely turning debug
> options off that triggers the crash, as I realized I was building a
> kernel-debug package as well as a kernel package from the same

Ok, I think we need a serial output. I don't remember if you said that
your docking station has a serial port or not.

If the docking station does not, this card ought to do the trick:

http://www.newegg.com/Product/Product.aspx?Item=N82E16839328018&Tpk=SDEXP15005

You can use it under Xen as a normal PCI-type serial card. For details:

http://wiki.xensource.com/xenwiki/XenSerialConsole

> source RPM, and it boots with the debug kernel but not the ordinary
> kernel.

> 
> 	Michael Young

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Crash on boot with 2.6.37-rc8-git3
  2011-01-05 15:43 ` Konrad Rzeszutek Wilk
@ 2011-01-05 23:11   ` M A Young
  2011-01-06 14:56     ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 31+ messages in thread
From: M A Young @ 2011-01-05 23:11 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel

On Wed, 5 Jan 2011, Konrad Rzeszutek Wilk wrote:

> Ahh, I hit this. Can you try 'stable/bug-fixes' branch of mine?
> It has "xen/irq: Don't fall over when nr_irqs_gsi > nr_irqs." patch
> which will fix the below problem you are seeing.
>
> But I am not sure if it fixes the problem you are having with hardware?

That fixes the kvm boot, but unfortunately it doesn't fix booting directly
on the hardware. Incidentally it is definitely turning debug options off
that triggers the crash, as I realized I was building a kernel-debug
package as well as a kernel package from the same source RPM, and it boots
with the debug kernel but not the ordinary kernel.

 	Michael Young

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: Crash on boot with 2.6.37-rc8-git3
  2011-01-04 22:01 M A Young
@ 2011-01-05 15:43 ` Konrad Rzeszutek Wilk
  2011-01-05 23:11   ` M A Young
  0 siblings, 1 reply; 31+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-01-05 15:43 UTC (permalink / raw)
  To: M A Young; +Cc: xen-devel

On Tue, Jan 04, 2011 at 10:01:56PM +0000, M A Young wrote:
> The latest Fedora based 2.6.37 kernels have stopped booting for me
> under xen. They stopped working around -rc7 but I think the trigger
> is that various debug options were turned off. My hardware won't let
> me get serial output, so I have tried booting it within kvm, and got
> the attached output - the behaviour was similar to bare metal,
> though I don't see enough to know if it is exactly the same crash.
> The kernel used has no additional xen patches, though I am seeing
> similar behaviour for kernels with patches from xen-next-2.6.37. The
> crash looks like it is something to do with irq.

Ahh, I hit this. Can you try 'stable/bug-fixes' branch of mine?
It has "xen/irq: Don't fall over when nr_irqs_gsi > nr_irqs." patch
which will fix the below problem you are seeing.

But I am not sure if it fixes the problem you are having with hardware?

(git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git)

..
> [    0.008220] ------------[ cut here ]------------
> [    0.008999] WARNING: at drivers/xen/events.c:432 find_unbound_irq+0x88/0x9f()
> [    0.008999] Hardware name: Bochs
> [    0.008999] Modules linked in:
> [    0.008999] Pid: 1, comm: swapper Not tainted 2.6.37-0.rc8.git3.1.fc15.x86_64 #1
> [    0.008999] Call Trace:
> [    0.008999]  [<ffffffff810505d7>] warn_slowpath_common+0x85/0x9d
> [    0.008999]  [<ffffffff81050609>] warn_slowpath_null+0x1a/0x1c
> [    0.008999]  [<ffffffff812abfea>] find_unbound_irq+0x88/0x9f
> [    0.008999]  [<ffffffff812ac90e>] bind_ipi_to_irqhandler+0x64/0x153
> [    0.008999]  [<ffffffff81007979>] ? xen_reschedule_interrupt+0x0/0x18
> [    0.008999]  [<ffffffff81234511>] ? kasprintf+0x38/0x3b
> [    0.008999]  [<ffffffff81007b92>] xen_smp_intr_init+0x46/0x1f3
> [    0.008999]  [<ffffffff81b5839a>] xen_smp_prepare_cpus+0x3d/0x107
> [    0.008999]  [<ffffffff81b53cf3>] kernel_init+0x92/0x2b6
> [    0.008999]  [<ffffffff8100bae4>] kernel_thread_helper+0x4/0x10
> [    0.008999]  [<ffffffff8100aee3>] ? int_ret_from_sys_call+0x7/0x1b
> [    0.008999]  [<ffffffff81477edd>] ? retint_restore_args+0x5/0x6
> [    0.008999]  [<ffffffff8100bae0>] ? kernel_thread_helper+0x0/0x10
> [    0.008999] ---[ end trace a7919e7f17c0a725 ]---
> [    0.008999] ------------[ cut here ]------------
> [    0.008999] WARNING: at kernel/irq/manage.c:904 __free_irq+0xa3/0x1ab()
> [    0.008999] Hardware name: Bochs
> [    0.008999] Trying to free already-free IRQ 0
> [    0.008999] Modules linked in:
> [    0.008999] Pid: 1, comm: swapper Tainted: G        W   2.6.37-0.rc8.git3.1.fc15.x86_64 #1
> [    0.008999] Call Trace:
> [    0.008999]  [<ffffffff810505d7>] warn_slowpath_common+0x85/0x9d
> [    0.008999]  [<ffffffff81050692>] warn_slowpath_fmt+0x46/0x48
> [    0.008999]  [<ffffffff8107d246>] ? arch_local_irq_save+0x18/0x1e
> [    0.008999]  [<ffffffff810ac901>] __free_irq+0xa3/0x1ab
> [    0.008999]  [<ffffffff810aca41>] free_irq+0x38/0x50
> [    0.008999]  [<ffffffff812abead>] unbind_from_irqhandler+0x15/0x20
> [    0.008999]  [<ffffffff81007cce>] xen_smp_intr_init+0x182/0x1f3
> [    0.008999]  [<ffffffff81b5839a>] xen_smp_prepare_cpus+0x3d/0x107
> [    0.008999]  [<ffffffff81b53cf3>] kernel_init+0x92/0x2b6
> [    0.008999]  [<ffffffff8100bae4>] kernel_thread_helper+0x4/0x10
> [    0.008999]  [<ffffffff8100aee3>] ? int_ret_from_sys_call+0x7/0x1b
> [    0.008999]  [<ffffffff81477edd>] ? retint_restore_args+0x5/0x6
> [    0.008999]  [<ffffffff8100bae0>] ? kernel_thread_helper+0x0/0x10
> [    0.008999] ---[ end trace a7919e7f17c0a726 ]---
> [    0.008999] ------------[ cut here ]------------
> [    0.008999] WARNING: at kernel/irq/manage.c:904 __free_irq+0xa3/0x1ab()
> [    0.008999] Hardware name: Bochs
> [    0.008999] Trying to free already-free IRQ 0
> [    0.008999] Modules linked in:
> [    0.008999] Pid: 1, comm: swapper Tainted: G        W   2.6.37-0.rc8.git3.1.fc15.x86_64 #1
> [    0.008999] Call Trace:
> [    0.008999]  [<ffffffff810505d7>] warn_slowpath_common+0x85/0x9d
> [    0.008999]  [<ffffffff81050692>] warn_slowpath_fmt+0x46/0x48
> [    0.008999]  [<ffffffff8107d246>] ? arch_local_irq_save+0x18/0x1e
> [    0.008999]  [<ffffffff810ac901>] __free_irq+0xa3/0x1ab
> [    0.008999]  [<ffffffff810aca41>] free_irq+0x38/0x50
> [    0.008999]  [<ffffffff812abead>] unbind_from_irqhandler+0x15/0x20
> [    0.008999]  [<ffffffff81007cf0>] xen_smp_intr_init+0x1a4/0x1f3
> [    0.008999]  [<ffffffff81b5839a>] xen_smp_prepare_cpus+0x3d/0x107
> [    0.008999]  [<ffffffff81b53cf3>] kernel_init+0x92/0x2b6
> [    0.008999]  [<ffffffff8100bae4>] kernel_thread_helper+0x4/0x10
> [    0.008999]  [<ffffffff8100aee3>] ? int_ret_from_sys_call+0x7/0x1b
> [    0.008999]  [<ffffffff81477edd>] ? retint_restore_args+0x5/0x6
> [    0.008999]  [<ffffffff8100bae0>] ? kernel_thread_helper+0x0/0x10
> [    0.008999] ---[ end trace a7919e7f17c0a727 ]---
> [    0.008999] ------------[ cut here ]------------
> [    0.008999] WARNING: at kernel/irq/manage.c:904 __free_irq+0xa3/0x1ab()
> [    0.008999] Hardware name: Bochs
> [    0.008999] Trying to free already-free IRQ 0
> [    0.008999] Modules linked in:
> [    0.008999] Pid: 1, comm: swapper Tainted: G        W   2.6.37-0.rc8.git3.1.fc15.x86_64 #1
> [    0.008999] Call Trace:
> [    0.008999]  [<ffffffff810505d7>] warn_slowpath_common+0x85/0x9d
> [    0.008999]  [<ffffffff81050692>] warn_slowpath_fmt+0x46/0x48
> [    0.008999]  [<ffffffff8107d246>] ? arch_local_irq_save+0x18/0x1e
> [    0.008999]  [<ffffffff810ac901>] __free_irq+0xa3/0x1ab
> [    0.008999]  [<ffffffff810aca41>] free_irq+0x38/0x50
> [    0.008999]  [<ffffffff812abead>] unbind_from_irqhandler+0x15/0x20
> [    0.008999]  [<ffffffff81007d34>] xen_smp_intr_init+0x1e8/0x1f3
> [    0.008999]  [<ffffffff81b5839a>] xen_smp_prepare_cpus+0x3d/0x107
> [    0.008999]  [<ffffffff81b53cf3>] kernel_init+0x92/0x2b6
> [    0.008999]  [<ffffffff8100bae4>] kernel_thread_helper+0x4/0x10
> [    0.008999]  [<ffffffff8100aee3>] ? int_ret_from_sys_call+0x7/0x1b
> [    0.008999]  [<ffffffff81477edd>] ? retint_restore_args+0x5/0x6
> [    0.008999]  [<ffffffff8100bae0>] ? kernel_thread_helper+0x0/0x10
> [    0.008999] ---[ end trace a7919e7f17c0a728 ]---
> [    0.009018] ------------[ cut here ]------------
> [    0.009999] kernel BUG at arch/x86/xen/smp.c:217!
> [    0.009999] invalid opcode: 0000 [#1] SMP 
> [    0.009999] last sysfs file: 
> [    0.009999] CPU 0 
> [    0.009999] Modules linked in:
> [    0.009999] 
> [    0.009999] Pid: 1, comm: swapper Tainted: G        W   2.6.37-0.rc8.git3.1.fc15.x86_64 #1 /Bochs
> [    0.009999] RIP: e030:[<ffffffff81b5839e>]  [<ffffffff81b5839e>] xen_smp_prepare_cpus+0x41/0x107
> [    0.009999] RSP: e02b:ffff880033841eb0  EFLAGS: 00010286
> [    0.009999] RAX: 00000000ffffffff RBX: ffffffff81c1c7b0 RCX: 0000000000000100
> [    0.009999] RDX: ffff88003a410000 RSI: 0000000000000000 RDI: ffffffff81d64d50
> [    0.009999] RBP: ffff880033841ed0 R08: 0000000000000002 R09: 00000000fffffffe
> [    0.009999] R10: ffff880033841e50 R11: 0000000000000000 R12: 0000000000000100
> [    0.009999] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> [    0.009999] FS:  0000000000000000(0000) GS:ffff88003b063000(0000) knlGS:0000000000000000
> [    0.009999] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> [    0.009999] CR2: 0000000000000000 CR3: 0000000001a03000 CR4: 0000000000000660
> [    0.009999] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [    0.009999] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [    0.009999] Process swapper (pid: 1, threadinfo ffff880033840000, task ffff880033838000)
> [    0.009999] Stack:
> [    0.009999]  ffff880033838000 ffffffff81c1c7b0 0000000000000000 0000000000000000
> [    0.009999]  ffff880033841f40 ffffffff81b53cf3 0000000000000001 0000000000000000
> [    0.009999]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
> [    0.009999] Call Trace:
> [    0.009999]  [<ffffffff81b53cf3>] kernel_init+0x92/0x2b6
> [    0.009999]  [<ffffffff8100bae4>] kernel_thread_helper+0x4/0x10
> [    0.009999]  [<ffffffff8100aee3>] ? int_ret_from_sys_call+0x7/0x1b
> [    0.009999]  [<ffffffff81477edd>] ? retint_restore_args+0x5/0x6
> [    0.009999]  [<ffffffff8100bae0>] ? kernel_thread_helper+0x0/0x10
> [    0.009999] Code: ff 48 8b 15 25 b9 fd ff 31 ff 48 c7 c0 00 36 01 00 66 c7 84 10 c0 00 00 00 01 00 e8 3c 76 91 ff 31 ff e8 b2 f7 4a ff 85 c0 74 02 <0f> 0b 31 ff e8 a9 f5 4a ff 48 c7 c2 00 20 c3 81 b9 08 00 00 00 
> [    0.009999] RIP  [<ffffffff81b5839e>] xen_smp_prepare_cpus+0x41/0x107
> [    0.009999]  RSP <ffff880033841eb0>
> [    0.009999] ---[ end trace a7919e7f17c0a729 ]---
> [    0.010021] Kernel panic - not syncing: Attempted to kill init!
> [    0.010999] Pid: 1, comm: swapper Tainted: G      D W   2.6.37-0.rc8.git3.1.fc15.x86_64 #1
> [    0.010999] Call Trace:
> [    0.010999]  [<ffffffff814759d5>] panic+0x91/0x1a4
> [    0.010999]  [<ffffffff810d6093>] ? perf_event_exit_task+0xb8/0x1c7
> [    0.010999]  [<ffffffff81053b89>] do_exit+0x7c/0x75d
> [    0.010999]  [<ffffffff8107d21f>] ? arch_local_irq_restore+0xb/0xd
> [    0.010999]  [<ffffffff8147795f>] ? _raw_spin_unlock_irqrestore+0x17/0x19
> [    0.010999]  [<ffffffff8100022a>] ? _stext+0x9a/0xe70
> [    0.010999]  [<ffffffff81478c8b>] oops_end+0xbf/0xc7
> [    0.010999]  [<ffffffff8100022a>] ? _stext+0x9a/0xe70
> [    0.010999]  [<ffffffff8100022a>] ? _stext+0x9a/0xe70
> [    0.010999]  [<ffffffff8100e6ec>] die+0x5a/0x66
> [    0.010999]  [<ffffffff81478518>] do_trap+0x121/0x130
> [    0.010999]  [<ffffffff8100c06d>] do_invalid_op+0x98/0xa1
> [    0.010999]  [<ffffffff81b5839e>] ? xen_smp_prepare_cpus+0x41/0x107
> [    0.010999]  [<ffffffff8107d246>] ? arch_local_irq_save+0x18/0x1e
> [    0.010999]  [<ffffffff8107d21f>] ? arch_local_irq_restore+0xb/0xd
> [    0.010999]  [<ffffffff8147795f>] ? _raw_spin_unlock_irqrestore+0x17/0x19
> [    0.010999]  [<ffffffff810ac90d>] ? __free_irq+0xaf/0x1ab
> [    0.010999]  [<ffffffff8100b95b>] invalid_op+0x1b/0x20
> [    0.010999]  [<ffffffff81b5839e>] ? xen_smp_prepare_cpus+0x41/0x107
> [    0.010999]  [<ffffffff81b53cf3>] kernel_init+0x92/0x2b6
> [    0.010999]  [<ffffffff8100bae4>] kernel_thread_helper+0x4/0x10
> [    0.010999]  [<ffffffff8100aee3>] ? int_ret_from_sys_call+0x7/0x1b
> [    0.010999]  [<ffffffff81477edd>] ? retint_restore_args+0x5/0x6
> [    0.010999]  [<ffffffff8100bae0>] ? kernel_thread_helper+0x0/0x10
> (XEN) Domain 0 crashed: rebooting machine in 5 seconds.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Crash on boot with 2.6.37-rc8-git3
@ 2011-01-04 22:01 M A Young
  2011-01-05 15:43 ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 31+ messages in thread
From: M A Young @ 2011-01-04 22:01 UTC (permalink / raw)
  To: xen-devel; +Cc: Konrad Rzeszutek Wilk

[-- Attachment #1: Type: TEXT/PLAIN, Size: 610 bytes --]

The latest Fedora based 2.6.37 kernels have stopped booting for me under 
xen. They stopped working around -rc7 but I think the trigger is that 
various debug options were turned off. My hardware won't let me get serial 
output, so I have tried booting it within kvm, and got the attached output 
- the behaviour was similar to bare metal, though I don't see enough to 
know if it is exactly the same crash. The kernel used has no additional 
xen patches, though I am seeing similar behaviour for kernels with patches 
from xen-next-2.6.37. The crash looks like it is something to do with irq.

 	Michael Young

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Type: TEXT/PLAIN; charset=UTF-8; name=boot.log, Size: 21200 bytes --]

 __  __            _  _    ___   _     __     __      _ ____  
 \ \/ /___ _ __   | || |  / _ \ / |   / /_   / _| ___/ | ___| 
  \  // _ \ '_ \  | || |_| | | || |__| '_ \ | |_ / __| |___ \ 
  /  \  __/ | | | |__   _| |_| || |__| (_) ||  _| (__| |___) |
 /_/\_\___|_| |_|    |_|(_)___(_)_|   \___(_)_|  \___|_|____/ 
                                                              
(XEN) Xen version 4.0.1 (mockbuild@(none)) (gcc version 4.5.1 20100924 (Red Hat 4.5.1-4) (GCC) ) Tue Oct 12 21:36:26 UTC 2010
(XEN) Latest ChangeSet: unavailable
(XEN) Bootloader: ISOLINUX 4.02 2010-07-21 
(XEN) Command line: console=com1
(XEN) Video information:
(XEN)  VGA is text mode 80x25, font 8x16
(XEN) Disc information:
(XEN)  Found 0 MBR signatures
(XEN)  Found 0 EDD information structures
(XEN) Xen-e820 RAM map:
(XEN)  0000000000000000 - 000000000009f400 (usable)
(XEN)  000000000009f400 - 00000000000a0000 (reserved)
(XEN)  00000000000f0000 - 0000000000100000 (reserved)
(XEN)  0000000000100000 - 000000003fffd000 (usable)
(XEN)  000000003fffd000 - 0000000040000000 (reserved)
(XEN)  00000000fffbc000 - 0000000100000000 (reserved)
(XEN) System RAM: 1023MB (1048176kB)
(XEN) ACPI: RSDP 000F7B50, 0014 (r0 BOCHS )
(XEN) ACPI: RSDT 3FFFDE30, 0034 (r1 BOCHS  BXPCRSDT        1 BXPC        1)
(XEN) ACPI: FACP 3FFFFE70, 0074 (r1 BOCHS  BXPCFACP        1 BXPC        1)
(XEN) ACPI: DSDT 3FFFDFD0, 1E22 (r1   BXPC   BXDSDT        1 INTL 20090123)
(XEN) ACPI: FACS 3FFFFE00, 0040
(XEN) ACPI: SSDT 3FFFDF90, 0037 (r1 BOCHS  BXPCSSDT        1 BXPC        1)
(XEN) ACPI: APIC 3FFFDEB0, 0072 (r1 BOCHS  BXPCAPIC        1 BXPC        1)
(XEN) ACPI: HPET 3FFFDE70, 0038 (r1 BOCHS  BXPCHPET        1 BXPC        1)
(XEN) Domain heap initialised
(XEN) Processor #0 6:2 APIC version 20
(XEN) IOAPIC[0]: apic_id 1, version 17, address 0xfec00000, GSI 0-23
(XEN) Enabling APIC mode:  Flat.  Using 1 I/O APICs
(XEN) Using scheduler: SMP Credit Scheduler (credit)
(XEN) Detected 2393.988 MHz processor.
(XEN) Initing memory sharing.
(XEN) I/O virtualisation disabled
(XEN) Total of 1 processors activated.
(XEN) ENABLING IO-APIC IRQs
(XEN)  -> Using new ACK method
(XEN) Platform timer is 100.000MHz HPET
(XEN) Allocated console ring of 16 KiB.
(XEN) Brought up 1 CPUs
(XEN) *** LOADING DOMAIN 0 ***
(XEN)  Xen  kernel: 64-bit, lsb, compat32
(XEN)  Dom0 kernel: 64-bit, PAE, lsb, paddr 0x1000000 -> 0x1ecf000
(XEN) PHYSICAL MEMORY ARRANGEMENT:
(XEN)  Dom0 alloc.:   0000000038000000->000000003c000000 (225506 pages to be allocated)
(XEN) VIRTUAL MEMORY ARRANGEMENT:
(XEN)  Loaded kernel: ffffffff81000000->ffffffff81ecf000
(XEN)  Init. ramdisk: ffffffff81ecf000->ffffffff83397000
(XEN)  Phys-Mach map: ffffffff83397000->ffffffff8356f710
(XEN)  Start info:    ffffffff83570000->ffffffff835704b4
(XEN)  Page tables:   ffffffff83571000->ffffffff83590000
(XEN)  Boot stack:    ffffffff83590000->ffffffff83591000
(XEN)  TOTAL:         ffffffff80000000->ffffffff83800000
(XEN)  ENTRY ADDRESS: ffffffff81b53200
(XEN) Dom0 has maximum 1 VCPUs
(XEN) Scrubbing Free RAM: done.
(XEN) Xen trace buffers: disabled
(XEN) Std. Loglevel: Errors and warnings
(XEN) Guest Loglevel: Nothing (Rate-limited: Errors and warnings)
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen)
(XEN) Freed 172kB init memory.
mapping kernel into physical memory
Xen: setup ISA identity maps
about to get started...
[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Linux version 2.6.37-0.rc8.git3.1.fc15.x86_64 (mockbuild@x86-14.phx2.fedoraproject.org) (gcc version 4.5.1 20101130 (Red Hat 4.5.1-6) (GCC) ) #1 SMP Mon Jan 3 16:15:26 UTC 2011
[    0.000000] Command line: root=live:CDLABEL=livecd.ks-x86_64-201101042112 rootfstype=auto ro liveimg console=hvc0 rd_NO_LUKS rd_NO_MD rd_NO_DM
[    0.000000] released 0 pages of unused memory
[    0.000000] BIOS-provided physical RAM map:
[    0.000000]  Xen: 0000000000000000 - 000000000009f400 (usable)
[    0.000000]  Xen: 000000000009f400 - 0000000000100000 (reserved)
[    0.000000]  Xen: 0000000000100000 - 000000003b0e2000 (usable)
[    0.000000]  Xen: 000000003fffd000 - 0000000040000000 (reserved)
[    0.000000]  Xen: 00000000fec00000 - 00000000fec01000 (reserved)
[    0.000000]  Xen: 00000000fee00000 - 00000000fee01000 (reserved)
[    0.000000]  Xen: 00000000fffbc000 - 0000000100000000 (reserved)
[    0.000000]  Xen: 0000000100000000 - 0000000104f1b000 (usable)
[    0.000000] NX (Execute Disable) protection: active
[    0.000000] DMI 2.4 present.
[    0.000000] No AGP bridge found
[    0.000000] last_pfn = 0x104f1b max_arch_pfn = 0x400000000
[    0.000000] last_pfn = 0x3b0e2 max_arch_pfn = 0x400000000
[    0.000000] found SMP MP-table at [ffff8800000f7ba0] f7ba0
[    0.000000] init_memory_mapping: 0000000000000000-000000003b0e2000
[    0.000000] init_memory_mapping: 0000000100000000-0000000104f1b000
[    0.000000] RAMDISK: 01ecf000 - 03397000
[    0.000000] ACPI: RSDP 00000000000f7b50 00014 (v00 BOCHS )
[    0.000000] ACPI: RSDT 000000003fffde30 00034 (v01 BOCHS  BXPCRSDT 00000001 BXPC 00000001)
[    0.000000] ACPI: FACP 000000003ffffe70 00074 (v01 BOCHS  BXPCFACP 00000001 BXPC 00000001)
[    0.000000] ACPI: DSDT 000000003fffdfd0 01E22 (v01   BXPC   BXDSDT 00000001 INTL 20090123)
[    0.000000] ACPI: FACS 000000003ffffe00 00040
[    0.000000] ACPI: SSDT 000000003fffdf90 00037 (v01 BOCHS  BXPCSSDT 00000001 BXPC 00000001)
[    0.000000] ACPI: APIC 000000003fffdeb0 00072 (v01 BOCHS  BXPCAPIC 00000001 BXPC 00000001)
[    0.000000] ACPI: HPET 000000003fffde70 00038 (v01 BOCHS  BXPCHPET 00000001 BXPC 00000001)
[    0.000000] No NUMA configuration found
[    0.000000] Faking a node at 0000000000000000-0000000104f1b000
[    0.000000] Initmem setup node 0 0000000000000000-0000000104f1b000
[    0.000000]   NODE_DATA [000000003b0ce000 - 000000003b0e1fff]
[    0.000000] Zone PFN ranges:
[    0.000000]   DMA      0x00000010 -> 0x00001000
[    0.000000]   DMA32    0x00001000 -> 0x00100000
[    0.000000]   Normal   0x00100000 -> 0x00104f1b
[    0.000000] Movable zone start PFN for each node
[    0.000000] early_node_map[3] active PFN ranges
[    0.000000]     0: 0x00000010 -> 0x0000009f
[    0.000000]     0: 0x00000100 -> 0x0003b0e2
[    0.000000]     0: 0x00100000 -> 0x00104f1b
[    0.000000] ACPI: PM-Timer IO Port: 0xb008
[    0.000000] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
[    0.000000] BIOS bug, APIC version is 0 for CPU#0! fixing up to 0x10. (tell your hw vendor)
[    0.000000] ACPI: IOAPIC (id[0x01] address[0xfec00000] gsi_base[0])
[    0.000000] IOAPIC[0]: apic_id 1, version 255, address 0xfec00000, GSI 0-255
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 high level)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 high level)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 high level)
[    0.000000] Using ACPI (MADT) for SMP configuration information
[    0.000000] ACPI: HPET id: 0x8086a201 base: 0xfed00000
[    0.000000] SMP: Allowing 1 CPUs, 0 hotplug CPUs
[    0.000000] PM: Registered nosave memory: 000000000009f000 - 00000000000a0000
[    0.000000] PM: Registered nosave memory: 00000000000a0000 - 0000000000100000
[    0.000000] PM: Registered nosave memory: 000000003b0e2000 - 000000003fffd000
[    0.000000] PM: Registered nosave memory: 000000003fffd000 - 0000000040000000
[    0.000000] PM: Registered nosave memory: 0000000040000000 - 00000000fec00000
[    0.000000] PM: Registered nosave memory: 00000000fec00000 - 00000000fec01000
[    0.000000] PM: Registered nosave memory: 00000000fec01000 - 00000000fee00000
[    0.000000] PM: Registered nosave memory: 00000000fee00000 - 00000000fee01000
[    0.000000] PM: Registered nosave memory: 00000000fee01000 - 00000000fffbc000
[    0.000000] PM: Registered nosave memory: 00000000fffbc000 - 0000000100000000
[    0.000000] Allocating PCI resources starting at 40000000 (gap: 40000000:bec00000)
[    0.000000] Booting paravirtualized kernel on Xen
[    0.000000] Xen version: 4.0.1 (preserve-AD)
[    0.000000] setup_percpu: NR_CPUS:256 nr_cpumask_bits:256 nr_cpu_ids:1 nr_node_ids:1
[    0.000000] PERCPU: Embedded 28 pages/cpu @ffff88003b063000 s83008 r8192 d23488 u114688
[    0.000000] Built 1 zonelists in Node order, mobility grouping on.  Total pages: 247409
[    0.000000] Policy zone: Normal
[    0.000000] Kernel command line: root=live:CDLABEL=livecd.ks-x86_64-201101042112 rootfstype=auto ro liveimg console=hvc0 rd_NO_LUKS rd_NO_MD rd_NO_DM
[    0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes)
[    0.000000] Placing 64MB software IO TLB between ffff880035200000 - ffff880039200000
[    0.000000] software IO TLB at phys 0x35200000 - 0x39200000
[    0.000000] Memory: 845176k/4275308k available (4608k kernel code, 3227196k absent, 202936k reserved, 6898k data, 924k init)
[    0.000000] SLUB: Genslabs=15, HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[    0.000000] Hierarchical RCU implementation.
[    0.000000] 	RCU dyntick-idle grace-period acceleration is enabled.
[    0.000000] 	RCU-based detection of stalled CPUs is disabled.
[    0.000000] NR_IRQS:16640 nr_irqs:256 16
[    0.000000] xen: sci override: global_irq=9 trigger=0 polarity=0
[    0.000000] xen: acpi sci 9
[    0.000000] xen_map_pirq_gsi: returning irq 9 for gsi 9
[    0.000000] Console: colour dummy device 80x25
[    0.000000] console [hvc0] enabled
[    0.000000] allocated 11796480 bytes of page_cgroup
[    0.000000] please try 'cgroup_disable=memory' option if you don't want memory cgroups
[    0.000000] installing Xen timer for CPU 0
[    0.000000] Detected 2393.988 MHz processor.
[    0.000999] Calibrating delay loop (skipped), value calculated using timer frequency.. 4787.97 BogoMIPS (lpj=2393988)
[    0.000999] pid_max: default: 32768 minimum: 301
[    0.000999] Security Framework initialized
[    0.000999] SELinux:  Initializing.
[    0.001608] Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes)
[    0.002088] Inode-cache hash table entries: 65536 (order: 7, 524288 bytes)
[    0.002999] Mount-cache hash table entries: 256
[    0.003270] Initializing cgroup subsys ns
[    0.003999] ns_cgroup deprecated: consider using the 'clone_children' flag without the ns_cgroup.
[    0.003999] Initializing cgroup subsys cpuacct
[    0.003999] Initializing cgroup subsys memory
[    0.004034] Initializing cgroup subsys devices
[    0.004999] Initializing cgroup subsys freezer
[    0.004999] Initializing cgroup subsys net_cls
[    0.004999] Initializing cgroup subsys blkio
[    0.005288] Performance Events: unsupported p6 CPU model 2 no PMU driver, software events only.
[    0.005999] SMP alternatives: switching to UP code
[    0.005999] Freeing SMP alternatives: 12k freed
[    0.005999] ACPI: Core revision 20101013
[    0.007881] ftrace: allocating 24322 entries in 96 pages
[    0.008220] ------------[ cut here ]------------
[    0.008999] WARNING: at drivers/xen/events.c:432 find_unbound_irq+0x88/0x9f()
[    0.008999] Hardware name: Bochs
[    0.008999] Modules linked in:
[    0.008999] Pid: 1, comm: swapper Not tainted 2.6.37-0.rc8.git3.1.fc15.x86_64 #1
[    0.008999] Call Trace:
[    0.008999]  [<ffffffff810505d7>] warn_slowpath_common+0x85/0x9d
[    0.008999]  [<ffffffff81050609>] warn_slowpath_null+0x1a/0x1c
[    0.008999]  [<ffffffff812abfea>] find_unbound_irq+0x88/0x9f
[    0.008999]  [<ffffffff812ac90e>] bind_ipi_to_irqhandler+0x64/0x153
[    0.008999]  [<ffffffff81007979>] ? xen_reschedule_interrupt+0x0/0x18
[    0.008999]  [<ffffffff81234511>] ? kasprintf+0x38/0x3b
[    0.008999]  [<ffffffff81007b92>] xen_smp_intr_init+0x46/0x1f3
[    0.008999]  [<ffffffff81b5839a>] xen_smp_prepare_cpus+0x3d/0x107
[    0.008999]  [<ffffffff81b53cf3>] kernel_init+0x92/0x2b6
[    0.008999]  [<ffffffff8100bae4>] kernel_thread_helper+0x4/0x10
[    0.008999]  [<ffffffff8100aee3>] ? int_ret_from_sys_call+0x7/0x1b
[    0.008999]  [<ffffffff81477edd>] ? retint_restore_args+0x5/0x6
[    0.008999]  [<ffffffff8100bae0>] ? kernel_thread_helper+0x0/0x10
[    0.008999] ---[ end trace a7919e7f17c0a725 ]---
[    0.008999] ------------[ cut here ]------------
[    0.008999] WARNING: at kernel/irq/manage.c:904 __free_irq+0xa3/0x1ab()
[    0.008999] Hardware name: Bochs
[    0.008999] Trying to free already-free IRQ 0
[    0.008999] Modules linked in:
[    0.008999] Pid: 1, comm: swapper Tainted: G        W   2.6.37-0.rc8.git3.1.fc15.x86_64 #1
[    0.008999] Call Trace:
[    0.008999]  [<ffffffff810505d7>] warn_slowpath_common+0x85/0x9d
[    0.008999]  [<ffffffff81050692>] warn_slowpath_fmt+0x46/0x48
[    0.008999]  [<ffffffff8107d246>] ? arch_local_irq_save+0x18/0x1e
[    0.008999]  [<ffffffff810ac901>] __free_irq+0xa3/0x1ab
[    0.008999]  [<ffffffff810aca41>] free_irq+0x38/0x50
[    0.008999]  [<ffffffff812abead>] unbind_from_irqhandler+0x15/0x20
[    0.008999]  [<ffffffff81007cce>] xen_smp_intr_init+0x182/0x1f3
[    0.008999]  [<ffffffff81b5839a>] xen_smp_prepare_cpus+0x3d/0x107
[    0.008999]  [<ffffffff81b53cf3>] kernel_init+0x92/0x2b6
[    0.008999]  [<ffffffff8100bae4>] kernel_thread_helper+0x4/0x10
[    0.008999]  [<ffffffff8100aee3>] ? int_ret_from_sys_call+0x7/0x1b
[    0.008999]  [<ffffffff81477edd>] ? retint_restore_args+0x5/0x6
[    0.008999]  [<ffffffff8100bae0>] ? kernel_thread_helper+0x0/0x10
[    0.008999] ---[ end trace a7919e7f17c0a726 ]---
[    0.008999] ------------[ cut here ]------------
[    0.008999] WARNING: at kernel/irq/manage.c:904 __free_irq+0xa3/0x1ab()
[    0.008999] Hardware name: Bochs
[    0.008999] Trying to free already-free IRQ 0
[    0.008999] Modules linked in:
[    0.008999] Pid: 1, comm: swapper Tainted: G        W   2.6.37-0.rc8.git3.1.fc15.x86_64 #1
[    0.008999] Call Trace:
[    0.008999]  [<ffffffff810505d7>] warn_slowpath_common+0x85/0x9d
[    0.008999]  [<ffffffff81050692>] warn_slowpath_fmt+0x46/0x48
[    0.008999]  [<ffffffff8107d246>] ? arch_local_irq_save+0x18/0x1e
[    0.008999]  [<ffffffff810ac901>] __free_irq+0xa3/0x1ab
[    0.008999]  [<ffffffff810aca41>] free_irq+0x38/0x50
[    0.008999]  [<ffffffff812abead>] unbind_from_irqhandler+0x15/0x20
[    0.008999]  [<ffffffff81007cf0>] xen_smp_intr_init+0x1a4/0x1f3
[    0.008999]  [<ffffffff81b5839a>] xen_smp_prepare_cpus+0x3d/0x107
[    0.008999]  [<ffffffff81b53cf3>] kernel_init+0x92/0x2b6
[    0.008999]  [<ffffffff8100bae4>] kernel_thread_helper+0x4/0x10
[    0.008999]  [<ffffffff8100aee3>] ? int_ret_from_sys_call+0x7/0x1b
[    0.008999]  [<ffffffff81477edd>] ? retint_restore_args+0x5/0x6
[    0.008999]  [<ffffffff8100bae0>] ? kernel_thread_helper+0x0/0x10
[    0.008999] ---[ end trace a7919e7f17c0a727 ]---
[    0.008999] ------------[ cut here ]------------
[    0.008999] WARNING: at kernel/irq/manage.c:904 __free_irq+0xa3/0x1ab()
[    0.008999] Hardware name: Bochs
[    0.008999] Trying to free already-free IRQ 0
[    0.008999] Modules linked in:
[    0.008999] Pid: 1, comm: swapper Tainted: G        W   2.6.37-0.rc8.git3.1.fc15.x86_64 #1
[    0.008999] Call Trace:
[    0.008999]  [<ffffffff810505d7>] warn_slowpath_common+0x85/0x9d
[    0.008999]  [<ffffffff81050692>] warn_slowpath_fmt+0x46/0x48
[    0.008999]  [<ffffffff8107d246>] ? arch_local_irq_save+0x18/0x1e
[    0.008999]  [<ffffffff810ac901>] __free_irq+0xa3/0x1ab
[    0.008999]  [<ffffffff810aca41>] free_irq+0x38/0x50
[    0.008999]  [<ffffffff812abead>] unbind_from_irqhandler+0x15/0x20
[    0.008999]  [<ffffffff81007d34>] xen_smp_intr_init+0x1e8/0x1f3
[    0.008999]  [<ffffffff81b5839a>] xen_smp_prepare_cpus+0x3d/0x107
[    0.008999]  [<ffffffff81b53cf3>] kernel_init+0x92/0x2b6
[    0.008999]  [<ffffffff8100bae4>] kernel_thread_helper+0x4/0x10
[    0.008999]  [<ffffffff8100aee3>] ? int_ret_from_sys_call+0x7/0x1b
[    0.008999]  [<ffffffff81477edd>] ? retint_restore_args+0x5/0x6
[    0.008999]  [<ffffffff8100bae0>] ? kernel_thread_helper+0x0/0x10
[    0.008999] ---[ end trace a7919e7f17c0a728 ]---
[    0.009018] ------------[ cut here ]------------
[    0.009999] kernel BUG at arch/x86/xen/smp.c:217!
[    0.009999] invalid opcode: 0000 [#1] SMP 
[    0.009999] last sysfs file: 
[    0.009999] CPU 0 
[    0.009999] Modules linked in:
[    0.009999] 
[    0.009999] Pid: 1, comm: swapper Tainted: G        W   2.6.37-0.rc8.git3.1.fc15.x86_64 #1 /Bochs
[    0.009999] RIP: e030:[<ffffffff81b5839e>]  [<ffffffff81b5839e>] xen_smp_prepare_cpus+0x41/0x107
[    0.009999] RSP: e02b:ffff880033841eb0  EFLAGS: 00010286
[    0.009999] RAX: 00000000ffffffff RBX: ffffffff81c1c7b0 RCX: 0000000000000100
[    0.009999] RDX: ffff88003a410000 RSI: 0000000000000000 RDI: ffffffff81d64d50
[    0.009999] RBP: ffff880033841ed0 R08: 0000000000000002 R09: 00000000fffffffe
[    0.009999] R10: ffff880033841e50 R11: 0000000000000000 R12: 0000000000000100
[    0.009999] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[    0.009999] FS:  0000000000000000(0000) GS:ffff88003b063000(0000) knlGS:0000000000000000
[    0.009999] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[    0.009999] CR2: 0000000000000000 CR3: 0000000001a03000 CR4: 0000000000000660
[    0.009999] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    0.009999] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[    0.009999] Process swapper (pid: 1, threadinfo ffff880033840000, task ffff880033838000)
[    0.009999] Stack:
[    0.009999]  ffff880033838000 ffffffff81c1c7b0 0000000000000000 0000000000000000
[    0.009999]  ffff880033841f40 ffffffff81b53cf3 0000000000000001 0000000000000000
[    0.009999]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
[    0.009999] Call Trace:
[    0.009999]  [<ffffffff81b53cf3>] kernel_init+0x92/0x2b6
[    0.009999]  [<ffffffff8100bae4>] kernel_thread_helper+0x4/0x10
[    0.009999]  [<ffffffff8100aee3>] ? int_ret_from_sys_call+0x7/0x1b
[    0.009999]  [<ffffffff81477edd>] ? retint_restore_args+0x5/0x6
[    0.009999]  [<ffffffff8100bae0>] ? kernel_thread_helper+0x0/0x10
[    0.009999] Code: ff 48 8b 15 25 b9 fd ff 31 ff 48 c7 c0 00 36 01 00 66 c7 84 10 c0 00 00 00 01 00 e8 3c 76 91 ff 31 ff e8 b2 f7 4a ff 85 c0 74 02 <0f> 0b 31 ff e8 a9 f5 4a ff 48 c7 c2 00 20 c3 81 b9 08 00 00 00 
[    0.009999] RIP  [<ffffffff81b5839e>] xen_smp_prepare_cpus+0x41/0x107
[    0.009999]  RSP <ffff880033841eb0>
[    0.009999] ---[ end trace a7919e7f17c0a729 ]---
[    0.010021] Kernel panic - not syncing: Attempted to kill init!
[    0.010999] Pid: 1, comm: swapper Tainted: G      D W   2.6.37-0.rc8.git3.1.fc15.x86_64 #1
[    0.010999] Call Trace:
[    0.010999]  [<ffffffff814759d5>] panic+0x91/0x1a4
[    0.010999]  [<ffffffff810d6093>] ? perf_event_exit_task+0xb8/0x1c7
[    0.010999]  [<ffffffff81053b89>] do_exit+0x7c/0x75d
[    0.010999]  [<ffffffff8107d21f>] ? arch_local_irq_restore+0xb/0xd
[    0.010999]  [<ffffffff8147795f>] ? _raw_spin_unlock_irqrestore+0x17/0x19
[    0.010999]  [<ffffffff8100022a>] ? _stext+0x9a/0xe70
[    0.010999]  [<ffffffff81478c8b>] oops_end+0xbf/0xc7
[    0.010999]  [<ffffffff8100022a>] ? _stext+0x9a/0xe70
[    0.010999]  [<ffffffff8100022a>] ? _stext+0x9a/0xe70
[    0.010999]  [<ffffffff8100e6ec>] die+0x5a/0x66
[    0.010999]  [<ffffffff81478518>] do_trap+0x121/0x130
[    0.010999]  [<ffffffff8100c06d>] do_invalid_op+0x98/0xa1
[    0.010999]  [<ffffffff81b5839e>] ? xen_smp_prepare_cpus+0x41/0x107
[    0.010999]  [<ffffffff8107d246>] ? arch_local_irq_save+0x18/0x1e
[    0.010999]  [<ffffffff8107d21f>] ? arch_local_irq_restore+0xb/0xd
[    0.010999]  [<ffffffff8147795f>] ? _raw_spin_unlock_irqrestore+0x17/0x19
[    0.010999]  [<ffffffff810ac90d>] ? __free_irq+0xaf/0x1ab
[    0.010999]  [<ffffffff8100b95b>] invalid_op+0x1b/0x20
[    0.010999]  [<ffffffff81b5839e>] ? xen_smp_prepare_cpus+0x41/0x107
[    0.010999]  [<ffffffff81b53cf3>] kernel_init+0x92/0x2b6
[    0.010999]  [<ffffffff8100bae4>] kernel_thread_helper+0x4/0x10
[    0.010999]  [<ffffffff8100aee3>] ? int_ret_from_sys_call+0x7/0x1b
[    0.010999]  [<ffffffff81477edd>] ? retint_restore_args+0x5/0x6
[    0.010999]  [<ffffffff8100bae0>] ? kernel_thread_helper+0x0/0x10
(XEN) Domain 0 crashed: rebooting machine in 5 seconds.



end of thread, other threads:[~2011-01-25 16:05 UTC | newest]

Thread overview: 31+ messages
2011-01-07 20:34 Crash on boot with 2.6.37-rc8-git3 M A Young
2011-01-07 21:23 ` Konrad Rzeszutek Wilk
2011-01-08  0:10   ` M A Young
2011-01-10 18:42     ` Konrad Rzeszutek Wilk
2011-01-10 21:43       ` M A Young
2011-01-16 20:48       ` M A Young
2011-01-16 20:56         ` Keir Fraser
2011-01-18  0:52       ` M A Young
2011-01-19 22:54         ` M A Young
2011-01-20 19:24           ` Konrad Rzeszutek Wilk
2011-01-20 22:39             ` M A Young
2011-01-21 15:27               ` Konrad Rzeszutek Wilk
2011-01-21 21:43                 ` M A Young
2011-01-24 14:14                   ` Konrad Rzeszutek Wilk
2011-01-24 23:12                     ` M A Young
2011-01-25 12:03                       ` Stefano Stabellini
2011-01-25 13:24                         ` Ian Campbell
2011-01-25 13:31                           ` Stefano Stabellini
2011-01-25 13:45                             ` Ian Campbell
2011-01-25 15:19                               ` Stefano Stabellini
2011-01-25 15:52                                 ` Konrad Rzeszutek Wilk
2011-01-25 15:56                                   ` Stefano Stabellini
2011-01-25 16:05                                     ` M A Young
2011-01-24 19:04                   ` Stefano Stabellini
2011-01-25  0:22                     ` M A Young
  -- strict thread matches above, loose matches on Subject: below --
2011-01-04 22:01 M A Young
2011-01-05 15:43 ` Konrad Rzeszutek Wilk
2011-01-05 23:11   ` M A Young
2011-01-06 14:56     ` Konrad Rzeszutek Wilk
2011-01-07  0:37       ` M A Young
2011-01-07 19:18         ` Konrad Rzeszutek Wilk
