* Re: BUG: 1bbbbe7 (x86: Exclude E820_RESERVED regions...) PANIC on boot
       [not found] ` <d556fc0f-da5d-4531-b331-6dc086461f34@blur>
@ 2012-10-21  0:17   ` Tom Rini
  2012-10-21  4:01     ` Yinghai Lu
  0 siblings, 1 reply; 32+ messages in thread
From: Tom Rini @ 2012-10-21  0:17 UTC (permalink / raw)
  To: Shin, Jacob; +Cc: linux-kernel, H. Peter Anvin, stable

On 10/20/12 17:11, Shin, Jacob wrote:
> Hi could you please attach the dmesg output? Before rc2 is fine as well.
> I would like to see the E820 table. Thank you,

dmesg is quite long so I've put it on pastebin: http://pastebin.com/4eSPEAvB

-- 
Tom

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: BUG: 1bbbbe7 (x86: Exclude E820_RESERVED regions...) PANIC on boot
  2012-10-21  0:17   ` BUG: 1bbbbe7 (x86: Exclude E820_RESERVED regions...) PANIC on boot Tom Rini
@ 2012-10-21  4:01     ` Yinghai Lu
  2012-10-21  4:18       ` Jacob Shin
  2012-10-21 17:52       ` Tom Rini
  0 siblings, 2 replies; 32+ messages in thread
From: Yinghai Lu @ 2012-10-21  4:01 UTC (permalink / raw)
  To: Tom Rini; +Cc: Shin, Jacob, linux-kernel, H. Peter Anvin, stable

On Sat, Oct 20, 2012 at 5:17 PM, Tom Rini <trini@ti.com> wrote:
> On 10/20/12 17:11, Shin, Jacob wrote:
>> Hi could you please attach the dmesg output? Before rc2 is fine as well.
>> I would like to see the E820 table. Thank you,
>
> dmesg is quite long so I've put it on pastebin: http://pastebin.com/4eSPEAvB
>
> --

[    0.000000] BIOS-e820: [mem 0x0000000100001000-0x000000042fffffff] usable

The pre-calculated page table size is too small, so it crashes.

Can you please try

	git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git
for-x86-mm

and post the boot log?

Thanks

Yinghai

* Re: BUG: 1bbbbe7 (x86: Exclude E820_RESERVED regions...) PANIC on boot
  2012-10-21  4:01     ` Yinghai Lu
@ 2012-10-21  4:18       ` Jacob Shin
  2012-10-21 17:51         ` Tom Rini
  2012-10-21 17:52       ` Tom Rini
  1 sibling, 1 reply; 32+ messages in thread
From: Jacob Shin @ 2012-10-21  4:18 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Tom Rini, linux-kernel, H. Peter Anvin, stable

On Sat, Oct 20, 2012 at 09:01:43PM -0700, Yinghai Lu wrote:
> On Sat, Oct 20, 2012 at 5:17 PM, Tom Rini <trini@ti.com> wrote:
> > On 10/20/12 17:11, Shin, Jacob wrote:
> >> Hi could you please attach the dmesg output? Before rc2 is fine as well.
> >> I would like to see the E820 table. Thank you,
> >
> > dmesg is quite long so I've put it on pastebin: http://pastebin.com/4eSPEAvB
> >
> > --
> 
> [    0.000000] BIOS-e820: [mem 0x0000000100001000-0x000000042fffffff] usable
> 
> pre-calculate table size is too small, so it crashes.

Right,

I think just this one patch 3/6 on top of -rc2 should work:

https://lkml.org/lkml/2012/8/29/223

That would be a simpler path for 3.7,

Thanks!

> 
> can you please try
> 
> 	git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git
> for-x86-mm
> 
> and post bootlog?
> 
> Thanks
> 
> Yinghai
> 


* Re: BUG: 1bbbbe7 (x86: Exclude E820_RESERVED regions...) PANIC on boot
  2012-10-21  4:18       ` Jacob Shin
@ 2012-10-21 17:51         ` Tom Rini
  2012-10-21 21:06           ` Jacob Shin
  0 siblings, 1 reply; 32+ messages in thread
From: Tom Rini @ 2012-10-21 17:51 UTC (permalink / raw)
  To: Jacob Shin; +Cc: Yinghai Lu, linux-kernel, H. Peter Anvin, stable

On 10/20/12 21:18, Jacob Shin wrote:
> On Sat, Oct 20, 2012 at 09:01:43PM -0700, Yinghai Lu wrote:
>> On Sat, Oct 20, 2012 at 5:17 PM, Tom Rini <trini@ti.com> wrote:
>>> On 10/20/12 17:11, Shin, Jacob wrote:
>>>> Hi could you please attach the dmesg output? Before rc2 is fine as well.
>>>> I would like to see the E820 table. Thank you,
>>>
>>> dmesg is quite long so I've put it on pastebin: http://pastebin.com/4eSPEAvB
>>>
>>> --
>>
>> [    0.000000] BIOS-e820: [mem 0x0000000100001000-0x000000042fffffff] usable
>>
>> pre-calculate table size is too small, so it crashes.
> 
> Right,
> 
> I think just this one patch 3/6 on top of -rc2 should work:
> 
> https://lkml.org/lkml/2012/8/29/223
> 
> That would be a simpler path for 3.7,

It doesn't apply easily (for me) on top of 3.7-rc2 however.  Happy to
test a patch on top of 3.7-rc2 when you're able to.

-- 
Tom

* Re: BUG: 1bbbbe7 (x86: Exclude E820_RESERVED regions...) PANIC on boot
  2012-10-21  4:01     ` Yinghai Lu
  2012-10-21  4:18       ` Jacob Shin
@ 2012-10-21 17:52       ` Tom Rini
  1 sibling, 0 replies; 32+ messages in thread
From: Tom Rini @ 2012-10-21 17:52 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Shin, Jacob, linux-kernel, H. Peter Anvin, stable

On 10/20/12 21:01, Yinghai Lu wrote:
> On Sat, Oct 20, 2012 at 5:17 PM, Tom Rini <trini@ti.com> wrote:
>> On 10/20/12 17:11, Shin, Jacob wrote:
>>> Hi could you please attach the dmesg output? Before rc2 is fine as well.
>>> I would like to see the E820 table. Thank you,
>>
>> dmesg is quite long so I've put it on pastebin: http://pastebin.com/4eSPEAvB
>>
>> --
> 
> [    0.000000] BIOS-e820: [mem 0x0000000100001000-0x000000042fffffff] usable
> 
> pre-calculate table size is too small, so it crashes.
> 
> can you please try
> 
> 	git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git
> for-x86-mm
> 
> and post bootlog?

This boots but I'm bisecting another failure later on and can't post the
boot log (just finished bisecting that issue now).

-- 
Tom

* Re: BUG: 1bbbbe7 (x86: Exclude E820_RESERVED regions...) PANIC on boot
  2012-10-21 17:51         ` Tom Rini
@ 2012-10-21 21:06           ` Jacob Shin
  2012-10-21 21:23             ` Tom Rini
  0 siblings, 1 reply; 32+ messages in thread
From: Jacob Shin @ 2012-10-21 21:06 UTC (permalink / raw)
  To: Tom Rini; +Cc: Yinghai Lu, linux-kernel, H. Peter Anvin, stable

On Sun, Oct 21, 2012 at 10:51:35AM -0700, Tom Rini wrote:
> On 10/20/12 21:18, Jacob Shin wrote:
> > On Sat, Oct 20, 2012 at 09:01:43PM -0700, Yinghai Lu wrote:
> >> On Sat, Oct 20, 2012 at 5:17 PM, Tom Rini <trini@ti.com> wrote:
> >>> On 10/20/12 17:11, Shin, Jacob wrote:
> >>>> Hi could you please attach the dmesg output? Before rc2 is fine as well.
> >>>> I would like to see the E820 table. Thank you,
> >>>
> >>> dmesg is quite long so I've put it on pastebin: http://pastebin.com/4eSPEAvB
> >>>
> >>> --
> >>
> >> [    0.000000] BIOS-e820: [mem 0x0000000100001000-0x000000042fffffff] usable
> >>
> >> pre-calculate table size is too small, so it crashes.
> > 
> > Right,
> > 
> > I think just this one patch 3/6 on top of -rc2 should work:
> > 
> > https://lkml.org/lkml/2012/8/29/223
> > 
> > That would be a simpler path for 3.7,
> 
> It doesn't apply easily (for me) on top of 3.7-rc2 however.  Happy to
> test a patch on top of 3.7-rc2 when you're able to.

Ah, sorry, this one should apply on top of 3.7-rc2:

https://lkml.org/lkml/2012/8/24/469

Could you try that? Just that single patch, not the whole patchset.

Thanks!

-Jacob

> 
> -- 
> Tom
> 


* Re: BUG: 1bbbbe7 (x86: Exclude E820_RESERVED regions...) PANIC on boot
  2012-10-21 21:06           ` Jacob Shin
@ 2012-10-21 21:23             ` Tom Rini
  2012-10-22 14:40               ` Jacob Shin
  0 siblings, 1 reply; 32+ messages in thread
From: Tom Rini @ 2012-10-21 21:23 UTC (permalink / raw)
  To: Jacob Shin; +Cc: Yinghai Lu, linux-kernel, H. Peter Anvin, stable

On 10/21/12 14:06, Jacob Shin wrote:
> On Sun, Oct 21, 2012 at 10:51:35AM -0700, Tom Rini wrote:
>> On 10/20/12 21:18, Jacob Shin wrote:
>>> On Sat, Oct 20, 2012 at 09:01:43PM -0700, Yinghai Lu wrote:
>>>> On Sat, Oct 20, 2012 at 5:17 PM, Tom Rini <trini@ti.com> wrote:
>>>>> On 10/20/12 17:11, Shin, Jacob wrote:
>>>>>> Hi could you please attach the dmesg output? Before rc2 is fine as well.
>>>>>> I would like to see the E820 table. Thank you,
>>>>>
>>>>> dmesg is quite long so I've put it on pastebin: http://pastebin.com/4eSPEAvB
>>>>>
>>>>> --
>>>>
>>>> [    0.000000] BIOS-e820: [mem 0x0000000100001000-0x000000042fffffff] usable
>>>>
>>>> pre-calculate table size is too small, so it crashes.
>>>
>>> Right,
>>>
>>> I think just this one patch 3/6 on top of -rc2 should work:
>>>
>>> https://lkml.org/lkml/2012/8/29/223
>>>
>>> That would be a simpler path for 3.7,
>>
>> It doesn't apply easily (for me) on top of 3.7-rc2 however.  Happy to
>> test a patch on top of 3.7-rc2 when you're able to.
> 
> Ah, sorry, this one should apply on top of 3.7-rc2:
> 
> https://lkml.org/lkml/2012/8/24/469
> 
> Could you try that? Just that single patch, not the whole patchset.

That fixes it, replied with a note and Tested-by, thanks!

-- 
Tom


* Re: BUG: 1bbbbe7 (x86: Exclude E820_RESERVED regions...) PANIC on boot
  2012-10-21 21:23             ` Tom Rini
@ 2012-10-22 14:40               ` Jacob Shin
  2012-10-22 18:05                 ` Yinghai Lu
  2012-10-28 20:48                 ` Tom Rini
  0 siblings, 2 replies; 32+ messages in thread
From: Jacob Shin @ 2012-10-22 14:40 UTC (permalink / raw)
  To: Tom Rini, hpa; +Cc: Yinghai Lu, linux-kernel, H. Peter Anvin, stable

On Sun, Oct 21, 2012 at 02:23:58PM -0700, Tom Rini wrote:
> On 10/21/12 14:06, Jacob Shin wrote:
> > Ah, sorry, this one should apply on top of 3.7-rc2:
> > 
> > https://lkml.org/lkml/2012/8/24/469
> > 
> > Could you try that? Just that single patch, not the whole patchset.
> 
> That fixes it, replied with a note and Tested-by, thanks!

Thanks for testing!

hpa, so sorry, but it looks like we need one more patch, "[PATCH 2/5] x86:
find_early_table_space based on memory ranges that are being mapped":

  https://lkml.org/lkml/2012/8/24/469

on top of this, because the find_early_table_space calculation does not come
out correctly for this particular E820 table that Tom has:

  http://pastebin.com/4eSPEAvB

The reason we hit this now, and never hit it before, is that previously the
start was hard-coded to 1UL<<32.

Thanks,

-Jacob

> 
> -- 
> Tom
> 
> 


* Re: BUG: 1bbbbe7 (x86: Exclude E820_RESERVED regions...) PANIC on boot
  2012-10-22 14:40               ` Jacob Shin
@ 2012-10-22 18:05                 ` Yinghai Lu
  2012-10-22 18:38                   ` Jacob Shin
  2012-10-28 20:48                 ` Tom Rini
  1 sibling, 1 reply; 32+ messages in thread
From: Yinghai Lu @ 2012-10-22 18:05 UTC (permalink / raw)
  To: Jacob Shin; +Cc: Tom Rini, hpa, linux-kernel, H. Peter Anvin, stable

On Mon, Oct 22, 2012 at 7:40 AM, Jacob Shin <jacob.shin@amd.com> wrote:
> On Sun, Oct 21, 2012 at 02:23:58PM -0700, Tom Rini wrote:
>> On 10/21/12 14:06, Jacob Shin wrote:
>> > Ah, sorry, this one should apply on top of 3.7-rc2:
>> >
>> > https://lkml.org/lkml/2012/8/24/469
>> >
>> > Could you try that? Just that single patch, not the whole patchset.
>>
>> That fixes it, replied with a note and Tested-by, thanks!
>
> Thanks for testing!
>
> hpa, so sorry, but it looks like we need one more patch [PATCH 2/5] x86:
> find_early_table_space based on memory ranges that are being mapped:
>
>   https://lkml.org/lkml/2012/8/24/469
>
> on top of this, because find_early_table_space calculation does not come out
> correctly for this particular E820 table that Tom has:
>
>   http://pastebin.com/4eSPEAvB
>
> The reason why we hit this now, and never hit it before is because before the
> start was hard coded to 1UL<<32.
>

I'm afraid we may need to add more patches to make v3.7 really
handle every corner case.

During testing, I found more problems:

1. E820_RAM and E820_RESERVED_KERN:
   EFI changes some E820_RAM ranges to E820_RESERVED_KERN to cover the
   EFI setup_data, and passes them via e820_saved to the next kexec-ed
   kernel. So when we loop over E820_RAM, we should still treat E820_RAM
   and E820_RESERVED_KERN as combined; otherwise we render the page
   table with small pages, or leave some partial ranges uncovered.
   So I changed the loop to for_each_mem_pfn_range(): we fill memblock
   with both E820_RAM and E820_RESERVED_KERN, memblock merges adjacent
   ranges, and the mapping can still use big page sizes.

2. Partial pages:
   The E820 map, or the user via memmap=, could pass ranges that are
   not page aligned. The old code was guarded by max_low_pfn and
   max_pfn, so an end partial page would be trimmed down and memblock
   could not use it, while a middle partial page would still be covered
   by the direct mapping and memblock could still use it. Now we do not
   map middle partial pages, but memblock still tries to use them, so
   we can get a panic when accessing those pages.

So I would suggest just reverting that temporary patch for now, and
later coming up with one complete patch for the stable kernels.

Thanks

Yinghai

* Re: BUG: 1bbbbe7 (x86: Exclude E820_RESERVED regions...) PANIC on boot
  2012-10-22 18:05                 ` Yinghai Lu
@ 2012-10-22 18:38                   ` Jacob Shin
  2012-10-22 19:46                     ` Yinghai Lu
  0 siblings, 1 reply; 32+ messages in thread
From: Jacob Shin @ 2012-10-22 18:38 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Tom Rini, hpa, linux-kernel, H. Peter Anvin, stable

On Mon, Oct 22, 2012 at 11:05:29AM -0700, Yinghai Lu wrote:
> On Mon, Oct 22, 2012 at 7:40 AM, Jacob Shin <jacob.shin@amd.com> wrote:
> > On Sun, Oct 21, 2012 at 02:23:58PM -0700, Tom Rini wrote:
> >> On 10/21/12 14:06, Jacob Shin wrote:
> >> > Ah, sorry, this one should apply on top of 3.7-rc2:
> >> >
> >> > https://lkml.org/lkml/2012/8/24/469
> >> >
> >> > Could you try that? Just that single patch, not the whole patchset.
> >>
> >> That fixes it, replied with a note and Tested-by, thanks!
> >
> > Thanks for testing!
> >
> > hpa, so sorry, but it looks like we need one more patch [PATCH 2/5] x86:
> > find_early_table_space based on memory ranges that are being mapped:
> >
> >   https://lkml.org/lkml/2012/8/24/469
> >
> > on top of this, because find_early_table_space calculation does not come out
> > correctly for this particular E820 table that Tom has:
> >
> >   http://pastebin.com/4eSPEAvB
> >
> > The reason why we hit this now, and never hit it before is because before the
> > start was hard coded to 1UL<<32.
> >
> 
> I'm afraid that  we may need add more patches to make v3.7 really
> handle every corner case.
> 
> During testing, I found more problem:
> 1. E820_RAM and E820_RESEVED_KERN
>    EFI change some E820_RAM to E820_RESREVED_KERN to cover
>    efi setup_data. and will pass to e820_saved, to next kexec-ed kernel.
>   So we can use E820_RAM to loop it, and should still E820_RAM and
> E820_RESERVED_KERN combined.
>   otherwise will render page table with small pages, or every some partial
>   is not covered.
>   So i change to for_each_mem_pfn_range(), we fill the memblock with
>   E820_RAM and E820_RESERVED_KERN, and memblock will merge
>   range together, that will make mapping still use big page size.

Does EFI do this to memory above 4G? All the EFI BIOSes we have in house
looked to be only touching memory under 4G.

> 
> 2. partial page:
>    E820 or user could pass memmap that  is not page aligned.
>    old cold will guarded by max_low_pfn and max_pfn. so the end partial
>    page will be trimmed down, and memblock can one use it.
>    middle partial page will still get covered by directly mapping, and
> memblock still can use them.
>    Now we will not map middle partial page and memblock still try to use it
> we could get panic when accessing those pages.
> 
> So I would suggest to just revert that temporary patch at this time,
> and later come out one complete patch for stable kernels.

Hm okay, I was hoping not, but if it has to be ..

> 
> Thanks
> 
> Yinghai
> 


* Re: BUG: 1bbbbe7 (x86: Exclude E820_RESERVED regions...) PANIC on boot
  2012-10-22 18:38                   ` Jacob Shin
@ 2012-10-22 19:46                     ` Yinghai Lu
  2012-10-22 20:26                       ` H. Peter Anvin
  0 siblings, 1 reply; 32+ messages in thread
From: Yinghai Lu @ 2012-10-22 19:46 UTC (permalink / raw)
  To: Jacob Shin; +Cc: Tom Rini, hpa, linux-kernel, H. Peter Anvin, stable

On Mon, Oct 22, 2012 at 11:38 AM, Jacob Shin <jacob.shin@amd.com> wrote:
>
> Does EFI do this on above 4G memory? All the EFI BIOSes we have in house looked
> to be only touching under 4G.

I have no idea about it.

>
>>
>> 2. partial page:
>>    E820 or user could pass memmap that  is not page aligned.
>>    old cold will guarded by max_low_pfn and max_pfn. so the end partial
>>    page will be trimmed down, and memblock can one use it.
>>    middle partial page will still get covered by directly mapping, and
>> memblock still can use them.
>>    Now we will not map middle partial page and memblock still try to use it
>> we could get panic when accessing those pages.
>>
>> So I would suggest to just revert that temporary patch at this time,
>> and later come out one complete patch for stable kernels.
>
> Hm okay, I was hoping not, but if it has to be ..

It's hpa's call.

* Re: BUG: 1bbbbe7 (x86: Exclude E820_RESERVED regions...) PANIC on boot
  2012-10-22 19:46                     ` Yinghai Lu
@ 2012-10-22 20:26                       ` H. Peter Anvin
  2012-10-22 20:50                         ` Yinghai Lu
  0 siblings, 1 reply; 32+ messages in thread
From: H. Peter Anvin @ 2012-10-22 20:26 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Jacob Shin, Tom Rini, hpa, linux-kernel, stable

On 10/22/2012 12:46 PM, Yinghai Lu wrote:
> On Mon, Oct 22, 2012 at 11:38 AM, Jacob Shin <jacob.shin@amd.com> wrote:
>>
>> Does EFI do this on above 4G memory? All the EFI BIOSes we have in house looked
>> to be only touching under 4G.
> 
> I have no idea about it.
> 

I don't think we can rely on what is happening right now anyway.

>>> 2. partial page:
>>>    E820 or user could pass memmap that  is not page aligned.
>>>    old cold will guarded by max_low_pfn and max_pfn. so the end partial
>>>    page will be trimmed down, and memblock can one use it.
>>>    middle partial page will still get covered by directly mapping, and
>>> memblock still can use them.
>>>    Now we will not map middle partial page and memblock still try to use it
>>> we could get panic when accessing those pages.
>>>
>>> So I would suggest to just revert that temporary patch at this time,
>>> and later come out one complete patch for stable kernels.
>>
>> Hm okay, I was hoping not, but if it has to be ..
> 
> It's hpa's call.

So the issue is that two E820 RAM ranges (or ACPI, or kernel-reserved)
are immediately adjacent on a non-page-aligned address?  Or is there a
gap in between and memblock is still expecting to use it?

We should not map a partial page at the end of RAM; it is functionally
lost.  Two immediately adjacent pages could be coalesced, but not a
partial page that abuts I/O space (and yes, such abortions can happen in
the real world.)

However, the issue obviously is that what we can realistically put in
3.7 or stable is limited at this point.

	-hpa




* Re: BUG: 1bbbbe7 (x86: Exclude E820_RESERVED regions...) PANIC on boot
  2012-10-22 20:26                       ` H. Peter Anvin
@ 2012-10-22 20:50                         ` Yinghai Lu
  2012-10-22 20:52                           ` H. Peter Anvin
  2012-10-22 21:00                           ` BUG: 1bbbbe7 (x86: Exclude E820_RESERVED regions...) PANIC on boot H. Peter Anvin
  0 siblings, 2 replies; 32+ messages in thread
From: Yinghai Lu @ 2012-10-22 20:50 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Jacob Shin, Tom Rini, hpa, linux-kernel, stable

On Mon, Oct 22, 2012 at 1:26 PM, H. Peter Anvin <hpa@linux.intel.com> wrote:
>>>> 2. partial page:
>>>>    E820 or user could pass memmap that  is not page aligned.
>>>>    old cold will guarded by max_low_pfn and max_pfn. so the end partial
>>>>    page will be trimmed down, and memblock can one use it.
>>>>    middle partial page will still get covered by directly mapping, and
>>>> memblock still can use them.
>>>>    Now we will not map middle partial page and memblock still try to use it
>>>> we could get panic when accessing those pages.
>>>>
>>>> So I would suggest to just revert that temporary patch at this time,
>>>> and later come out one complete patch for stable kernels.
>>>
>>> Hm okay, I was hoping not, but if it has to be ..
>>
>> It's hpa's call.
>
> So the issue is that two E820 RAM ranges (or ACPI, or kernel-reserved)
> are immediately adjacent on a non-page-aligned address?

Yes, or the user takes out a range that is not page aligned.

> Or is there a
> gap in between and memblock is still expecting to use it?

Yes, in the current implementation there is, and init_memory_mapping maps
those partial pages and holes.

>
> We should not map a partial page at the end of RAM; it is functionally
> lost.

We already do not: we have max_low_pfn and max_pfn to cap off the end partial page.

> Two immediately adjacent pages could be coalesced, but not a
> partial page that abuts I/O space (and yes, such abortions can happen in
> the real world.)
>
> However, the issue obviously is that what we can realistically put in
> 3.7 or stable is limited at this point.

OK, let's see whether we can actually hit this extreme corner case other
than by the user specifying a non-page-aligned "memmap=".

Thanks

Yinghai

* Re: BUG: 1bbbbe7 (x86: Exclude E820_RESERVED regions...) PANIC on boot
  2012-10-22 20:50                         ` Yinghai Lu
@ 2012-10-22 20:52                           ` H. Peter Anvin
  2012-10-22 21:25                             ` Yinghai Lu
  2012-10-22 21:00                           ` BUG: 1bbbbe7 (x86: Exclude E820_RESERVED regions...) PANIC on boot H. Peter Anvin
  1 sibling, 1 reply; 32+ messages in thread
From: H. Peter Anvin @ 2012-10-22 20:52 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: Jacob Shin, Tom Rini, hpa, linux-kernel, stable

On 10/22/2012 01:50 PM, Yinghai Lu wrote:
> ok, let's see if we can meet this extreme corner case except user
> specify not page aligned "memmap="

If it is *only* memmap= there is a very simple solution: if the memmap
is RAM then we round up the starting address and round down the end
address; if the memmap is not RAM then we round the other way (start
down, end up) instead...

	-hpa


* Re: BUG: 1bbbbe7 (x86: Exclude E820_RESERVED regions...) PANIC on boot
  2012-10-22 20:50                         ` Yinghai Lu
  2012-10-22 20:52                           ` H. Peter Anvin
@ 2012-10-22 21:00                           ` H. Peter Anvin
  2012-10-22 21:06                             ` Yinghai Lu
  1 sibling, 1 reply; 32+ messages in thread
From: H. Peter Anvin @ 2012-10-22 21:00 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: H. Peter Anvin, Jacob Shin, Tom Rini, linux-kernel, stable

On 10/22/2012 01:50 PM, Yinghai Lu wrote:
>>
>> We should not map a partial page at the end of RAM; it is functionally
>> lost.
> 
> Now we did not, we have max_low_pfn, and max_pfn to cap out end partial page.
> 

Well, it is not just end of RAM, which is where the entire current
implementation falls apart, obviously.

	-hpa


* Re: BUG: 1bbbbe7 (x86: Exclude E820_RESERVED regions...) PANIC on boot
  2012-10-22 21:00                           ` BUG: 1bbbbe7 (x86: Exclude E820_RESERVED regions...) PANIC on boot H. Peter Anvin
@ 2012-10-22 21:06                             ` Yinghai Lu
  0 siblings, 0 replies; 32+ messages in thread
From: Yinghai Lu @ 2012-10-22 21:06 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: H. Peter Anvin, Jacob Shin, Tom Rini, linux-kernel, stable

On Mon, Oct 22, 2012 at 2:00 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 10/22/2012 01:50 PM, Yinghai Lu wrote:
>>>
>>> We should not map a partial page at the end of RAM; it is functionally
>>> lost.
>>
>> Now we did not, we have max_low_pfn, and max_pfn to cap out end partial page.
>>
>
> Well, it is not just end of RAM, which is where the entire current
> implementation falls apart, obviously.
>

OK, I will fix that from memblock_x86_fill(): after we put E820_RAM and
E820_RESERVED_KERN into memblock, do one trim pass over memblock.memory.

Thanks

Yinghai Lu

* Re: BUG: 1bbbbe7 (x86: Exclude E820_RESERVED regions...) PANIC on boot
  2012-10-22 20:52                           ` H. Peter Anvin
@ 2012-10-22 21:25                             ` Yinghai Lu
  2012-10-22 21:27                               ` H. Peter Anvin
  0 siblings, 1 reply; 32+ messages in thread
From: Yinghai Lu @ 2012-10-22 21:25 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Jacob Shin, Tom Rini, hpa, linux-kernel, stable

On Mon, Oct 22, 2012 at 1:52 PM, H. Peter Anvin <hpa@linux.intel.com> wrote:
> On 10/22/2012 01:50 PM, Yinghai Lu wrote:
>> ok, let's see if we can meet this extreme corner case except user
>> specify not page aligned "memmap="
>
> If it is *only* memmap= there is a very simple solution: if the memmap
> is RAM then we round up the starting address and round down the end
> address; if the memmap is not RAM then we round up instead...

We never know that the BIOS guys won't produce a crazy e820 map.

* Re: BUG: 1bbbbe7 (x86: Exclude E820_RESERVED regions...) PANIC on boot
  2012-10-22 21:25                             ` Yinghai Lu
@ 2012-10-22 21:27                               ` H. Peter Anvin
  2012-10-22 23:35                                 ` Yinghai Lu
  0 siblings, 1 reply; 32+ messages in thread
From: H. Peter Anvin @ 2012-10-22 21:27 UTC (permalink / raw)
  To: Yinghai Lu; +Cc: H. Peter Anvin, Jacob Shin, Tom Rini, linux-kernel, stable

On 10/22/2012 02:25 PM, Yinghai Lu wrote:
> On Mon, Oct 22, 2012 at 1:52 PM, H. Peter Anvin <hpa@linux.intel.com> wrote:
>> On 10/22/2012 01:50 PM, Yinghai Lu wrote:
>>> ok, let's see if we can meet this extreme corner case except user
>>> specify not page aligned "memmap="
>>
>> If it is *only* memmap= there is a very simple solution: if the memmap
>> is RAM then we round up the starting address and round down the end
>> address; if the memmap is not RAM then we round up instead...
> 
> We never know that bios guys will not let bios produce crazy e820 map.
> 

Yeah, well, that just *will* happen... that's a given.

We can trim those ranges, though.  Who cares if we lose some RAM.

	-hpa



* Re: BUG: 1bbbbe7 (x86: Exclude E820_RESERVED regions...) PANIC on boot
  2012-10-22 21:27                               ` H. Peter Anvin
@ 2012-10-22 23:35                                 ` Yinghai Lu
  2012-10-24 16:48                                   ` Jacob Shin
                                                     ` (2 more replies)
  0 siblings, 3 replies; 32+ messages in thread
From: Yinghai Lu @ 2012-10-22 23:35 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: H. Peter Anvin, Jacob Shin, Tom Rini, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 504 bytes --]

On Mon, Oct 22, 2012 at 2:27 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>>
>> We never know that bios guys will not let bios produce crazy e820 map.
>>
>
> Yeah, well, that just *will* happen... that's a given.
>
> We can trim those ranges, though.  Who cares if we lose some RAM.
>

Please check the attached two patches that handle partial pages for 3.7.

And you still need the patch at
   https://lkml.org/lkml/2012/8/24/469

to address the early page table size calculation problem for Tom Rini.

Thanks

Yinghai

[-- Attachment #2: memblock_trim_memory.patch --]
[-- Type: application/octet-stream, Size: 2450 bytes --]

Subject: [PATCH] x86, mm: Trim memory in memblock to be page aligned

We will not map partial pages, so we need to make sure memblock does not
hand those bytes out.

Also, we will loop with for_each_mem_pfn_range() when mapping memory
ranges, to keep the two consistent.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 arch/x86/kernel/e820.c   |    3 +++
 include/linux/memblock.h |    1 +
 mm/memblock.c            |   24 ++++++++++++++++++++++++
 3 files changed, 28 insertions(+)

Index: linux-2.6/arch/x86/kernel/e820.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/e820.c
+++ linux-2.6/arch/x86/kernel/e820.c
@@ -1077,6 +1077,9 @@ void __init memblock_x86_fill(void)
 		memblock_add(ei->addr, ei->size);
 	}
 
+	/* throw away partial pages */
+	memblock_trim_memory(PAGE_SIZE);
+
 	memblock_dump_all();
 }
 
Index: linux-2.6/include/linux/memblock.h
===================================================================
--- linux-2.6.orig/include/linux/memblock.h
+++ linux-2.6/include/linux/memblock.h
@@ -57,6 +57,7 @@ int memblock_add(phys_addr_t base, phys_
 int memblock_remove(phys_addr_t base, phys_addr_t size);
 int memblock_free(phys_addr_t base, phys_addr_t size);
 int memblock_reserve(phys_addr_t base, phys_addr_t size);
+void memblock_trim_memory(phys_addr_t align);
 
 #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
Index: linux-2.6/mm/memblock.c
===================================================================
--- linux-2.6.orig/mm/memblock.c
+++ linux-2.6/mm/memblock.c
@@ -930,6 +930,30 @@ int __init_memblock memblock_is_region_r
 	return memblock_overlaps_region(&memblock.reserved, base, size) >= 0;
 }
 
+void __init_memblock memblock_trim_memory(phys_addr_t align)
+{
+	int i;
+	phys_addr_t start, end, orig_start, orig_end;
+	struct memblock_type *mem = &memblock.memory;
+
+	for (i = 0; i < mem->cnt; i++) {
+		orig_start = mem->regions[i].base;
+		orig_end = mem->regions[i].base + mem->regions[i].size;
+		start = round_up(orig_start, align);
+		end = round_down(orig_end, align);
+
+		if (start == orig_start && end == orig_end)
+			continue;
+
+		if (start < end) {
+			mem->regions[i].base = start;
+			mem->regions[i].size = end - start;
+		} else {
+			memblock_remove_region(mem, i);
+			i--;
+		}
+	}
+}
 
 void __init_memblock memblock_set_current_limit(phys_addr_t limit)
 {

[-- Attachment #3: use_for_each_mem_pfn_range_setup.patch --]
[-- Type: application/octet-stream, Size: 1325 bytes --]

Subject: [PATCH] x86, mm: use memblock memory loop instead of e820_RAM

We need to handle E820_RAM and E820_RESERVED_KERN at the same time.

Also, memblock holds page-aligned ranges for RAM, so we can avoid mapping
partial pages.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 arch/x86/kernel/setup.c |   15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

Index: linux-2.6/arch/x86/kernel/setup.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/setup.c
+++ linux-2.6/arch/x86/kernel/setup.c
@@ -921,18 +921,19 @@ void __init setup_arch(char **cmdline_p)
 #ifdef CONFIG_X86_64
 	if (max_pfn > max_low_pfn) {
 		int i;
-		for (i = 0; i < e820.nr_map; i++) {
-			struct e820entry *ei = &e820.map[i];
+		unsigned long start, end;
+		unsigned long start_pfn, end_pfn;
 
-			if (ei->addr + ei->size <= 1UL << 32)
-				continue;
+		for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn,
+							 NULL) {
 
-			if (ei->type == E820_RESERVED)
+			end = PFN_PHYS(end_pfn);
+			if (end <= (1UL<<32))
 				continue;
 
+			start = PFN_PHYS(start_pfn);
 			max_pfn_mapped = init_memory_mapping(
-				ei->addr < 1UL << 32 ? 1UL << 32 : ei->addr,
-				ei->addr + ei->size);
+						max((1UL<<32), start), end);
 		}
 
 		/* can we preseve max_low_pfn ?*/

* Re: BUG: 1bbbbe7 (x86: Exclude E820_RESERVED regions...) PANIC on boot
  2012-10-22 23:35                                 ` Yinghai Lu
@ 2012-10-24 16:48                                   ` Jacob Shin
  2012-10-24 18:53                                     ` H. Peter Anvin
  2012-10-24 19:01                                   ` [tip:x86/urgent] x86, mm: Trim memory in memblock to be page aligned tip-bot for Yinghai Lu
  2012-10-24 19:02                                   ` [tip:x86/urgent] x86, mm: Use memblock memory loop instead of e820_RAM tip-bot for Yinghai Lu
  2 siblings, 1 reply; 32+ messages in thread
From: Jacob Shin @ 2012-10-24 16:48 UTC (permalink / raw)
  To: Yinghai Lu, H. Peter Anvin; +Cc: H. Peter Anvin, Tom Rini, linux-kernel

On Mon, Oct 22, 2012 at 04:35:18PM -0700, Yinghai Lu wrote:
> On Mon, Oct 22, 2012 at 2:27 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> >>
> >> We never know that bios guys will not let bios produce crazy e820 map.
> >>
> >
> > Yeah, well, that just *will* happen... that's a given.
> >
> > We can trim those ranges, though.  Who cares if we lose some RAM.
> >
> 
> please check attached two patches that handle partial pages for 3.7.
> 
> and you still need patch in
>    https://lkml.org/lkml/2012/8/24/469
> 
> to address early page table size calculation problem for Tom Rini

Acked-by: Jacob Shin <jacob.shin@amd.com>

hpa, we need this patch: https://lkml.org/lkml/2012/8/24/469 and the above
2 from Yinghai to handle corner case E820 layouts.

I got an email from Greg KH that 1bbbbe779aabe1f0768c2bf8f8c0a5583679b54a is
queued for stable, so these need to go to stable as well.

Thanks,

-Jacob

> 
> Thanks
> 
> Yinghai



* Re: BUG: 1bbbbe7 (x86: Exclude E820_RESERVED regions...) PANIC on boot
  2012-10-24 16:48                                   ` Jacob Shin
@ 2012-10-24 18:53                                     ` H. Peter Anvin
  2012-10-24 19:53                                       ` Jacob Shin
  0 siblings, 1 reply; 32+ messages in thread
From: H. Peter Anvin @ 2012-10-24 18:53 UTC (permalink / raw)
  To: Jacob Shin; +Cc: Yinghai Lu, H. Peter Anvin, Tom Rini, linux-kernel

On 10/24/2012 09:48 AM, Jacob Shin wrote:
> 
> hpa, we need this patch: https://lkml.org/lkml/2012/8/24/469 and the above
> 2 from Yinghai to handle corner case E820 layouts.
> 

I can apply Yinghai's patches, but the above patch no longer applies.
Could you refresh it on top of tip:x86/u, please?

	-hpa



* [tip:x86/urgent] x86, mm: Trim memory in memblock to be page aligned
  2012-10-22 23:35                                 ` Yinghai Lu
  2012-10-24 16:48                                   ` Jacob Shin
@ 2012-10-24 19:01                                   ` tip-bot for Yinghai Lu
  2012-10-24 19:02                                   ` [tip:x86/urgent] x86, mm: Use memblock memory loop instead of e820_RAM tip-bot for Yinghai Lu
  2 siblings, 0 replies; 32+ messages in thread
From: tip-bot for Yinghai Lu @ 2012-10-24 19:01 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, yinghai, stable, jacob.shin, tglx, hpa

Commit-ID:  6ede1fd3cb404c0016de6ac529df46d561bd558b
Gitweb:     http://git.kernel.org/tip/6ede1fd3cb404c0016de6ac529df46d561bd558b
Author:     Yinghai Lu <yinghai@kernel.org>
AuthorDate: Mon, 22 Oct 2012 16:35:18 -0700
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Wed, 24 Oct 2012 11:52:21 -0700

x86, mm: Trim memory in memblock to be page aligned

We will not map partial pages, so need to make sure memblock
allocation will not allocate those bytes out.

Also we will use for_each_mem_pfn_range() to loop to map memory
range to keep them consistent.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/CAE9FiQVZirvaBMFYRfXMmWEcHbKSicQEHz4VAwUv0xFCk51ZNw@mail.gmail.com
Acked-by: Jacob Shin <jacob.shin@amd.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@vger.kernel.org>
---
 arch/x86/kernel/e820.c   |    3 +++
 include/linux/memblock.h |    1 +
 mm/memblock.c            |   24 ++++++++++++++++++++++++
 3 files changed, 28 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index ed858e9..df06ade 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -1077,6 +1077,9 @@ void __init memblock_x86_fill(void)
 		memblock_add(ei->addr, ei->size);
 	}
 
+	/* throw away partial pages */
+	memblock_trim_memory(PAGE_SIZE);
+
 	memblock_dump_all();
 }
 
diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 569d67d..d452ee1 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -57,6 +57,7 @@ int memblock_add(phys_addr_t base, phys_addr_t size);
 int memblock_remove(phys_addr_t base, phys_addr_t size);
 int memblock_free(phys_addr_t base, phys_addr_t size);
 int memblock_reserve(phys_addr_t base, phys_addr_t size);
+void memblock_trim_memory(phys_addr_t align);
 
 #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
diff --git a/mm/memblock.c b/mm/memblock.c
index 931eef1..6259055 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -930,6 +930,30 @@ int __init_memblock memblock_is_region_reserved(phys_addr_t base, phys_addr_t si
 	return memblock_overlaps_region(&memblock.reserved, base, size) >= 0;
 }
 
+void __init_memblock memblock_trim_memory(phys_addr_t align)
+{
+	int i;
+	phys_addr_t start, end, orig_start, orig_end;
+	struct memblock_type *mem = &memblock.memory;
+
+	for (i = 0; i < mem->cnt; i++) {
+		orig_start = mem->regions[i].base;
+		orig_end = mem->regions[i].base + mem->regions[i].size;
+		start = round_up(orig_start, align);
+		end = round_down(orig_end, align);
+
+		if (start == orig_start && end == orig_end)
+			continue;
+
+		if (start < end) {
+			mem->regions[i].base = start;
+			mem->regions[i].size = end - start;
+		} else {
+			memblock_remove_region(mem, i);
+			i--;
+		}
+	}
+}
 
 void __init_memblock memblock_set_current_limit(phys_addr_t limit)
 {


* [tip:x86/urgent] x86, mm: Use memblock memory loop instead of e820_RAM
  2012-10-22 23:35                                 ` Yinghai Lu
  2012-10-24 16:48                                   ` Jacob Shin
  2012-10-24 19:01                                   ` [tip:x86/urgent] x86, mm: Trim memory in memblock to be page aligned tip-bot for Yinghai Lu
@ 2012-10-24 19:02                                   ` tip-bot for Yinghai Lu
  2 siblings, 0 replies; 32+ messages in thread
From: tip-bot for Yinghai Lu @ 2012-10-24 19:02 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, yinghai, stable, jacob.shin, tglx, hpa

Commit-ID:  1f2ff682ac951ed82cc043cf140d2851084512df
Gitweb:     http://git.kernel.org/tip/1f2ff682ac951ed82cc043cf140d2851084512df
Author:     Yinghai Lu <yinghai@kernel.org>
AuthorDate: Mon, 22 Oct 2012 16:35:18 -0700
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Wed, 24 Oct 2012 11:52:36 -0700

x86, mm: Use memblock memory loop instead of e820_RAM

We need to handle E820_RAM and E820_RESERVED_KERNEL at the same time.

Also, memblock now has page-aligned ranges for RAM, so we can avoid mapping
partial pages.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/CAE9FiQVZirvaBMFYRfXMmWEcHbKSicQEHz4VAwUv0xFCk51ZNw@mail.gmail.com
Acked-by: Jacob Shin <jacob.shin@amd.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@vger.kernel.org>
---
 arch/x86/kernel/setup.c |   15 ++++++++-------
 1 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 468e98d..5d888af 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -921,18 +921,19 @@ void __init setup_arch(char **cmdline_p)
 #ifdef CONFIG_X86_64
 	if (max_pfn > max_low_pfn) {
 		int i;
-		for (i = 0; i < e820.nr_map; i++) {
-			struct e820entry *ei = &e820.map[i];
+		unsigned long start, end;
+		unsigned long start_pfn, end_pfn;
 
-			if (ei->addr + ei->size <= 1UL << 32)
-				continue;
+		for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn,
+							 NULL) {
 
-			if (ei->type == E820_RESERVED)
+			end = PFN_PHYS(end_pfn);
+			if (end <= (1UL<<32))
 				continue;
 
+			start = PFN_PHYS(start_pfn);
 			max_pfn_mapped = init_memory_mapping(
-				ei->addr < 1UL << 32 ? 1UL << 32 : ei->addr,
-				ei->addr + ei->size);
+						max((1UL<<32), start), end);
 		}
 
 		/* can we preseve max_low_pfn ?*/


* Re: BUG: 1bbbbe7 (x86: Exclude E820_RESERVED regions...) PANIC on boot
  2012-10-24 18:53                                     ` H. Peter Anvin
@ 2012-10-24 19:53                                       ` Jacob Shin
  2012-10-24 21:49                                         ` [tip:x86/urgent] x86, mm: Find_early_table_space based on ranges that are actually being mapped tip-bot for Jacob Shin
  0 siblings, 1 reply; 32+ messages in thread
From: Jacob Shin @ 2012-10-24 19:53 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Yinghai Lu, H. Peter Anvin, Tom Rini, linux-kernel

On Wed, Oct 24, 2012 at 11:53:16AM -0700, H. Peter Anvin wrote:
> On 10/24/2012 09:48 AM, Jacob Shin wrote:
> > 
> > hpa, we need this patch: https://lkml.org/lkml/2012/8/24/469 and the above
> > 2 from Yinghai to handle corner case E820 layouts.
> > 
> 
> I can apply Yinghai's patches, but the above patch no longer applies.
> Could you refresh it on top of tip:x86/u, please?

Sorry about that, it applied to Linus's 3.7-rc2 so I just assumed .. :-(

From 7d2a67f6b435ede202bdf5d1982f9b5af90cce34 Mon Sep 17 00:00:00 2001
From: Jacob Shin <jacob.shin@amd.com>
Date: Wed, 24 Oct 2012 14:24:44 -0500
Subject: [PATCH] x86/mm: find_early_table_space based on ranges that are
 actually being mapped

Current logic finds enough space for direct mapping page tables from 0
to end. Instead, we only need to find enough space to cover mr[0].start
to mr[nr_range].end -- the range that is actually being mapped by
init_memory_mapping()

This is needed after 1bbbbe779aabe1f0768c2bf8f8c0a5583679b54a, to address
the panic reported here:

  https://lkml.org/lkml/2012/10/20/160
  https://lkml.org/lkml/2012/10/21/157

Signed-off-by: Jacob Shin <jacob.shin@amd.com>
Tested-by: Tom Rini <trini@ti.com>

---
 arch/x86/mm/init.c |   70 ++++++++++++++++++++++++++++++----------------------
 1 file changed, 41 insertions(+), 29 deletions(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 8653b3a..bc287d6 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -29,36 +29,54 @@ int direct_gbpages
 #endif
 ;
 
-static void __init find_early_table_space(unsigned long end, int use_pse,
-					  int use_gbpages)
+struct map_range {
+	unsigned long start;
+	unsigned long end;
+	unsigned page_size_mask;
+};
+
+/*
+ * First calculate space needed for kernel direct mapping page tables to cover
+ * mr[0].start to mr[nr_range - 1].end, while accounting for possible 2M and 1GB
+ * pages. Then find enough contiguous space for those page tables.
+ */
+static void __init find_early_table_space(struct map_range *mr, int nr_range)
 {
-	unsigned long puds, pmds, ptes, tables, start = 0, good_end = end;
+	int i;
+	unsigned long puds = 0, pmds = 0, ptes = 0, tables;
+	unsigned long start = 0, good_end;
 	phys_addr_t base;
 
-	puds = (end + PUD_SIZE - 1) >> PUD_SHIFT;
-	tables = roundup(puds * sizeof(pud_t), PAGE_SIZE);
+	for (i = 0; i < nr_range; i++) {
+		unsigned long range, extra;
 
-	if (use_gbpages) {
-		unsigned long extra;
+		range = mr[i].end - mr[i].start;
+		puds += (range + PUD_SIZE - 1) >> PUD_SHIFT;
 
-		extra = end - ((end>>PUD_SHIFT) << PUD_SHIFT);
-		pmds = (extra + PMD_SIZE - 1) >> PMD_SHIFT;
-	} else
-		pmds = (end + PMD_SIZE - 1) >> PMD_SHIFT;
-
-	tables += roundup(pmds * sizeof(pmd_t), PAGE_SIZE);
+		if (mr[i].page_size_mask & (1 << PG_LEVEL_1G)) {
+			extra = range - ((range >> PUD_SHIFT) << PUD_SHIFT);
+			pmds += (extra + PMD_SIZE - 1) >> PMD_SHIFT;
+		} else {
+			pmds += (range + PMD_SIZE - 1) >> PMD_SHIFT;
+		}
 
-	if (use_pse) {
-		unsigned long extra;
-
-		extra = end - ((end>>PMD_SHIFT) << PMD_SHIFT);
+		if (mr[i].page_size_mask & (1 << PG_LEVEL_2M)) {
+			extra = range - ((range >> PMD_SHIFT) << PMD_SHIFT);
 #ifdef CONFIG_X86_32
-		extra += PMD_SIZE;
+			extra += PMD_SIZE;
 #endif
-		ptes = (extra + PAGE_SIZE - 1) >> PAGE_SHIFT;
-	} else
-		ptes = (end + PAGE_SIZE - 1) >> PAGE_SHIFT;
+			/* The first 2/4M doesn't use large pages. */
+			if (mr[i].start < PMD_SIZE)
+				extra += range;
+
+			ptes += (extra + PAGE_SIZE - 1) >> PAGE_SHIFT;
+		} else {
+			ptes += (range + PAGE_SIZE - 1) >> PAGE_SHIFT;
+		}
+	}
 
+	tables = roundup(puds * sizeof(pud_t), PAGE_SIZE);
+	tables += roundup(pmds * sizeof(pmd_t), PAGE_SIZE);
 	tables += roundup(ptes * sizeof(pte_t), PAGE_SIZE);
 
 #ifdef CONFIG_X86_32
@@ -76,7 +94,7 @@ static void __init find_early_table_space(unsigned long end, int use_pse,
 	pgt_buf_top = pgt_buf_start + (tables >> PAGE_SHIFT);
 
 	printk(KERN_DEBUG "kernel direct mapping tables up to %#lx @ [mem %#010lx-%#010lx]\n",
-		end - 1, pgt_buf_start << PAGE_SHIFT,
+		mr[nr_range - 1].end - 1, pgt_buf_start << PAGE_SHIFT,
 		(pgt_buf_top << PAGE_SHIFT) - 1);
 }
 
@@ -85,12 +103,6 @@ void __init native_pagetable_reserve(u64 start, u64 end)
 	memblock_reserve(start, end - start);
 }
 
-struct map_range {
-	unsigned long start;
-	unsigned long end;
-	unsigned page_size_mask;
-};
-
 #ifdef CONFIG_X86_32
 #define NR_RANGE_MR 3
 #else /* CONFIG_X86_64 */
@@ -263,7 +275,7 @@ unsigned long __init_refok init_memory_mapping(unsigned long start,
 	 * nodes are discovered.
 	 */
 	if (!after_bootmem)
-		find_early_table_space(end, use_pse, use_gbpages);
+		find_early_table_space(mr, nr_range);
 
 	for (i = 0; i < nr_range; i++)
 		ret = kernel_physical_mapping_init(mr[i].start, mr[i].end,
-- 
1.7.9.5




* [tip:x86/urgent] x86, mm: Find_early_table_space based on ranges that are actually being mapped
  2012-10-24 19:53                                       ` Jacob Shin
@ 2012-10-24 21:49                                         ` tip-bot for Jacob Shin
  2012-10-25  6:42                                           ` Yinghai Lu
  0 siblings, 1 reply; 32+ messages in thread
From: tip-bot for Jacob Shin @ 2012-10-24 21:49 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, jacob.shin, tglx, hpa, trini

Commit-ID:  844ab6f993b1d32eb40512503d35ff6ad0c57030
Gitweb:     http://git.kernel.org/tip/844ab6f993b1d32eb40512503d35ff6ad0c57030
Author:     Jacob Shin <jacob.shin@amd.com>
AuthorDate: Wed, 24 Oct 2012 14:24:44 -0500
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Wed, 24 Oct 2012 13:37:04 -0700

x86, mm: Find_early_table_space based on ranges that are actually being mapped

Current logic finds enough space for direct mapping page tables from 0
to end. Instead, we only need to find enough space to cover mr[0].start
to mr[nr_range - 1].end -- the range that is actually being mapped by
init_memory_mapping()

This is needed after 1bbbbe779aabe1f0768c2bf8f8c0a5583679b54a, to address
the panic reported here:

  https://lkml.org/lkml/2012/10/20/160
  https://lkml.org/lkml/2012/10/21/157

Signed-off-by: Jacob Shin <jacob.shin@amd.com>
Link: http://lkml.kernel.org/r/20121024195311.GB11779@jshin-Toonie
Tested-by: Tom Rini <trini@ti.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
---
 arch/x86/mm/init.c |   70 ++++++++++++++++++++++++++++++---------------------
 1 files changed, 41 insertions(+), 29 deletions(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 8653b3a..bc287d6 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -29,36 +29,54 @@ int direct_gbpages
 #endif
 ;
 
-static void __init find_early_table_space(unsigned long end, int use_pse,
-					  int use_gbpages)
+struct map_range {
+	unsigned long start;
+	unsigned long end;
+	unsigned page_size_mask;
+};
+
+/*
+ * First calculate space needed for kernel direct mapping page tables to cover
+ * mr[0].start to mr[nr_range - 1].end, while accounting for possible 2M and 1GB
+ * pages. Then find enough contiguous space for those page tables.
+ */
+static void __init find_early_table_space(struct map_range *mr, int nr_range)
 {
-	unsigned long puds, pmds, ptes, tables, start = 0, good_end = end;
+	int i;
+	unsigned long puds = 0, pmds = 0, ptes = 0, tables;
+	unsigned long start = 0, good_end;
 	phys_addr_t base;
 
-	puds = (end + PUD_SIZE - 1) >> PUD_SHIFT;
-	tables = roundup(puds * sizeof(pud_t), PAGE_SIZE);
+	for (i = 0; i < nr_range; i++) {
+		unsigned long range, extra;
 
-	if (use_gbpages) {
-		unsigned long extra;
+		range = mr[i].end - mr[i].start;
+		puds += (range + PUD_SIZE - 1) >> PUD_SHIFT;
 
-		extra = end - ((end>>PUD_SHIFT) << PUD_SHIFT);
-		pmds = (extra + PMD_SIZE - 1) >> PMD_SHIFT;
-	} else
-		pmds = (end + PMD_SIZE - 1) >> PMD_SHIFT;
-
-	tables += roundup(pmds * sizeof(pmd_t), PAGE_SIZE);
+		if (mr[i].page_size_mask & (1 << PG_LEVEL_1G)) {
+			extra = range - ((range >> PUD_SHIFT) << PUD_SHIFT);
+			pmds += (extra + PMD_SIZE - 1) >> PMD_SHIFT;
+		} else {
+			pmds += (range + PMD_SIZE - 1) >> PMD_SHIFT;
+		}
 
-	if (use_pse) {
-		unsigned long extra;
-
-		extra = end - ((end>>PMD_SHIFT) << PMD_SHIFT);
+		if (mr[i].page_size_mask & (1 << PG_LEVEL_2M)) {
+			extra = range - ((range >> PMD_SHIFT) << PMD_SHIFT);
 #ifdef CONFIG_X86_32
-		extra += PMD_SIZE;
+			extra += PMD_SIZE;
 #endif
-		ptes = (extra + PAGE_SIZE - 1) >> PAGE_SHIFT;
-	} else
-		ptes = (end + PAGE_SIZE - 1) >> PAGE_SHIFT;
+			/* The first 2/4M doesn't use large pages. */
+			if (mr[i].start < PMD_SIZE)
+				extra += range;
+
+			ptes += (extra + PAGE_SIZE - 1) >> PAGE_SHIFT;
+		} else {
+			ptes += (range + PAGE_SIZE - 1) >> PAGE_SHIFT;
+		}
+	}
 
+	tables = roundup(puds * sizeof(pud_t), PAGE_SIZE);
+	tables += roundup(pmds * sizeof(pmd_t), PAGE_SIZE);
 	tables += roundup(ptes * sizeof(pte_t), PAGE_SIZE);
 
 #ifdef CONFIG_X86_32
@@ -76,7 +94,7 @@ static void __init find_early_table_space(unsigned long end, int use_pse,
 	pgt_buf_top = pgt_buf_start + (tables >> PAGE_SHIFT);
 
 	printk(KERN_DEBUG "kernel direct mapping tables up to %#lx @ [mem %#010lx-%#010lx]\n",
-		end - 1, pgt_buf_start << PAGE_SHIFT,
+		mr[nr_range - 1].end - 1, pgt_buf_start << PAGE_SHIFT,
 		(pgt_buf_top << PAGE_SHIFT) - 1);
 }
 
@@ -85,12 +103,6 @@ void __init native_pagetable_reserve(u64 start, u64 end)
 	memblock_reserve(start, end - start);
 }
 
-struct map_range {
-	unsigned long start;
-	unsigned long end;
-	unsigned page_size_mask;
-};
-
 #ifdef CONFIG_X86_32
 #define NR_RANGE_MR 3
 #else /* CONFIG_X86_64 */
@@ -263,7 +275,7 @@ unsigned long __init_refok init_memory_mapping(unsigned long start,
 	 * nodes are discovered.
 	 */
 	if (!after_bootmem)
-		find_early_table_space(end, use_pse, use_gbpages);
+		find_early_table_space(mr, nr_range);
 
 	for (i = 0; i < nr_range; i++)
 		ret = kernel_physical_mapping_init(mr[i].start, mr[i].end,


* Re: [tip:x86/urgent] x86, mm: Find_early_table_space based on ranges that are actually being mapped
  2012-10-24 21:49                                         ` [tip:x86/urgent] x86, mm: Find_early_table_space based on ranges that are actually being mapped tip-bot for Jacob Shin
@ 2012-10-25  6:42                                           ` Yinghai Lu
  2012-10-25  7:55                                             ` Ingo Molnar
  0 siblings, 1 reply; 32+ messages in thread
From: Yinghai Lu @ 2012-10-25  6:42 UTC (permalink / raw)
  To: mingo, hpa, linux-kernel, jacob.shin, tglx, trini, hpa; +Cc: linux-tip-commits

On Wed, Oct 24, 2012 at 2:49 PM, tip-bot for Jacob Shin
<jacob.shin@amd.com> wrote:
> Commit-ID:  844ab6f993b1d32eb40512503d35ff6ad0c57030
> Gitweb:     http://git.kernel.org/tip/844ab6f993b1d32eb40512503d35ff6ad0c57030
> Author:     Jacob Shin <jacob.shin@amd.com>
> AuthorDate: Wed, 24 Oct 2012 14:24:44 -0500
> Committer:  H. Peter Anvin <hpa@linux.intel.com>
> CommitDate: Wed, 24 Oct 2012 13:37:04 -0700
>
> x86, mm: Find_early_table_space based on ranges that are actually being mapped
>
> Current logic finds enough space for direct mapping page tables from 0
> to end. Instead, we only need to find enough space to cover mr[0].start
> to mr[nr_range].end -- the range that is actually being mapped by
> init_memory_mapping()
>
> This is needed after 1bbbbe779aabe1f0768c2bf8f8c0a5583679b54a, to address
> the panic reported here:
>
>   https://lkml.org/lkml/2012/10/20/160
>   https://lkml.org/lkml/2012/10/21/157
>
> Signed-off-by: Jacob Shin <jacob.shin@amd.com>
> Link: http://lkml.kernel.org/r/20121024195311.GB11779@jshin-Toonie
> Tested-by: Tom Rini <trini@ti.com>
> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
> ---
>  arch/x86/mm/init.c |   70 ++++++++++++++++++++++++++++++---------------------
>  1 files changed, 41 insertions(+), 29 deletions(-)
>
> diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
> index 8653b3a..bc287d6 100644
> --- a/arch/x86/mm/init.c
> +++ b/arch/x86/mm/init.c
> @@ -29,36 +29,54 @@ int direct_gbpages
>  #endif
>  ;
>
> -static void __init find_early_table_space(unsigned long end, int use_pse,
> -                                         int use_gbpages)
> +struct map_range {
> +       unsigned long start;
> +       unsigned long end;
> +       unsigned page_size_mask;
> +};
> +
> +/*
> + * First calculate space needed for kernel direct mapping page tables to cover
> + * mr[0].start to mr[nr_range - 1].end, while accounting for possible 2M and 1GB
> + * pages. Then find enough contiguous space for those page tables.
> + */
> +static void __init find_early_table_space(struct map_range *mr, int nr_range)
>  {
> -       unsigned long puds, pmds, ptes, tables, start = 0, good_end = end;
> +       int i;
> +       unsigned long puds = 0, pmds = 0, ptes = 0, tables;
> +       unsigned long start = 0, good_end;
>         phys_addr_t base;
>
> -       puds = (end + PUD_SIZE - 1) >> PUD_SHIFT;
> -       tables = roundup(puds * sizeof(pud_t), PAGE_SIZE);
> +       for (i = 0; i < nr_range; i++) {
> +               unsigned long range, extra;
>
> -       if (use_gbpages) {
> -               unsigned long extra;
> +               range = mr[i].end - mr[i].start;
> +               puds += (range + PUD_SIZE - 1) >> PUD_SHIFT;
>
> -               extra = end - ((end>>PUD_SHIFT) << PUD_SHIFT);
> -               pmds = (extra + PMD_SIZE - 1) >> PMD_SHIFT;
> -       } else
> -               pmds = (end + PMD_SIZE - 1) >> PMD_SHIFT;
> -
> -       tables += roundup(pmds * sizeof(pmd_t), PAGE_SIZE);
> +               if (mr[i].page_size_mask & (1 << PG_LEVEL_1G)) {
> +                       extra = range - ((range >> PUD_SHIFT) << PUD_SHIFT);
> +                       pmds += (extra + PMD_SIZE - 1) >> PMD_SHIFT;
> +               } else {
> +                       pmds += (range + PMD_SIZE - 1) >> PMD_SHIFT;
> +               }
>
> -       if (use_pse) {
> -               unsigned long extra;
> -
> -               extra = end - ((end>>PMD_SHIFT) << PMD_SHIFT);
> +               if (mr[i].page_size_mask & (1 << PG_LEVEL_2M)) {
> +                       extra = range - ((range >> PMD_SHIFT) << PMD_SHIFT);
>  #ifdef CONFIG_X86_32
> -               extra += PMD_SIZE;
> +                       extra += PMD_SIZE;
>  #endif
> -               ptes = (extra + PAGE_SIZE - 1) >> PAGE_SHIFT;
> -       } else
> -               ptes = (end + PAGE_SIZE - 1) >> PAGE_SHIFT;
> +                       /* The first 2/4M doesn't use large pages. */
> +                       if (mr[i].start < PMD_SIZE)
> +                               extra += range;

those three lines should be added back.

it just get reverted in 7b16bbf9

    Revert "x86/mm: Fix the size calculation of mapping tables"


> +
> +                       ptes += (extra + PAGE_SIZE - 1) >> PAGE_SHIFT;
> +               } else {
> +                       ptes += (range + PAGE_SIZE - 1) >> PAGE_SHIFT;
> +               }
> +       }
>
> +       tables = roundup(puds * sizeof(pud_t), PAGE_SIZE);
> +       tables += roundup(pmds * sizeof(pmd_t), PAGE_SIZE);
>         tables += roundup(ptes * sizeof(pte_t), PAGE_SIZE);
>
>  #ifdef CONFIG_X86_32
> @@ -76,7 +94,7 @@ static void __init find_early_table_space(unsigned long end, int use_pse,
>         pgt_buf_top = pgt_buf_start + (tables >> PAGE_SHIFT);
>
>         printk(KERN_DEBUG "kernel direct mapping tables up to %#lx @ [mem %#010lx-%#010lx]\n",
> -               end - 1, pgt_buf_start << PAGE_SHIFT,
> +               mr[nr_range - 1].end - 1, pgt_buf_start << PAGE_SHIFT,
>                 (pgt_buf_top << PAGE_SHIFT) - 1);
>  }
>
> @@ -85,12 +103,6 @@ void __init native_pagetable_reserve(u64 start, u64 end)
>         memblock_reserve(start, end - start);
>  }
>
> -struct map_range {
> -       unsigned long start;
> -       unsigned long end;
> -       unsigned page_size_mask;
> -};
> -
>  #ifdef CONFIG_X86_32
>  #define NR_RANGE_MR 3
>  #else /* CONFIG_X86_64 */
> @@ -263,7 +275,7 @@ unsigned long __init_refok init_memory_mapping(unsigned long start,
>          * nodes are discovered.
>          */
>         if (!after_bootmem)
> -               find_early_table_space(end, use_pse, use_gbpages);
> +               find_early_table_space(mr, nr_range);
>
>         for (i = 0; i < nr_range; i++)
>                 ret = kernel_physical_mapping_init(mr[i].start, mr[i].end,
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/


* Re: [tip:x86/urgent] x86, mm: Find_early_table_space based on ranges that are actually being mapped
  2012-10-25  6:42                                           ` Yinghai Lu
@ 2012-10-25  7:55                                             ` Ingo Molnar
  2012-10-25 14:33                                               ` Yinghai Lu
  0 siblings, 1 reply; 32+ messages in thread
From: Ingo Molnar @ 2012-10-25  7:55 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: hpa, linux-kernel, jacob.shin, tglx, trini, hpa, linux-tip-commits


* Yinghai Lu <yinghai@kernel.org> wrote:

> On Wed, Oct 24, 2012 at 2:49 PM, tip-bot for Jacob Shin
> <jacob.shin@amd.com> wrote:
> > Commit-ID:  844ab6f993b1d32eb40512503d35ff6ad0c57030
> > Gitweb:     http://git.kernel.org/tip/844ab6f993b1d32eb40512503d35ff6ad0c57030
> > Author:     Jacob Shin <jacob.shin@amd.com>
> > AuthorDate: Wed, 24 Oct 2012 14:24:44 -0500
> > Committer:  H. Peter Anvin <hpa@linux.intel.com>
> > CommitDate: Wed, 24 Oct 2012 13:37:04 -0700
> >
> > x86, mm: Find_early_table_space based on ranges that are actually being mapped
> >
> > Current logic finds enough space for direct mapping page tables from 0
> > to end. Instead, we only need to find enough space to cover mr[0].start
> > to mr[nr_range].end -- the range that is actually being mapped by
> > init_memory_mapping()
> >
> > This is needed after 1bbbbe779aabe1f0768c2bf8f8c0a5583679b54a, to address
> > the panic reported here:
> >
> >   https://lkml.org/lkml/2012/10/20/160
> >   https://lkml.org/lkml/2012/10/21/157
> >
> > Signed-off-by: Jacob Shin <jacob.shin@amd.com>
> > Link: http://lkml.kernel.org/r/20121024195311.GB11779@jshin-Toonie
> > Tested-by: Tom Rini <trini@ti.com>
> > Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
> > ---
> >  arch/x86/mm/init.c |   70 ++++++++++++++++++++++++++++++---------------------
> >  1 files changed, 41 insertions(+), 29 deletions(-)
> >
> > diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
> > index 8653b3a..bc287d6 100644
> > --- a/arch/x86/mm/init.c
> > +++ b/arch/x86/mm/init.c
> > @@ -29,36 +29,54 @@ int direct_gbpages
> >  #endif
> >  ;
> >
> > -static void __init find_early_table_space(unsigned long end, int use_pse,
> > -                                         int use_gbpages)
> > +struct map_range {
> > +       unsigned long start;
> > +       unsigned long end;
> > +       unsigned page_size_mask;
> > +};
> > +
> > +/*
> > + * First calculate space needed for kernel direct mapping page tables to cover
> > + * mr[0].start to mr[nr_range - 1].end, while accounting for possible 2M and 1GB
> > + * pages. Then find enough contiguous space for those page tables.
> > + */
> > +static void __init find_early_table_space(struct map_range *mr, int nr_range)
> >  {
> > -       unsigned long puds, pmds, ptes, tables, start = 0, good_end = end;
> > +       int i;
> > +       unsigned long puds = 0, pmds = 0, ptes = 0, tables;
> > +       unsigned long start = 0, good_end;
> >         phys_addr_t base;
> >
> > -       puds = (end + PUD_SIZE - 1) >> PUD_SHIFT;
> > -       tables = roundup(puds * sizeof(pud_t), PAGE_SIZE);
> > +       for (i = 0; i < nr_range; i++) {
> > +               unsigned long range, extra;
> >
> > -       if (use_gbpages) {
> > -               unsigned long extra;
> > +               range = mr[i].end - mr[i].start;
> > +               puds += (range + PUD_SIZE - 1) >> PUD_SHIFT;
> >
> > -               extra = end - ((end>>PUD_SHIFT) << PUD_SHIFT);
> > -               pmds = (extra + PMD_SIZE - 1) >> PMD_SHIFT;
> > -       } else
> > -               pmds = (end + PMD_SIZE - 1) >> PMD_SHIFT;
> > -
> > -       tables += roundup(pmds * sizeof(pmd_t), PAGE_SIZE);
> > +               if (mr[i].page_size_mask & (1 << PG_LEVEL_1G)) {
> > +                       extra = range - ((range >> PUD_SHIFT) << PUD_SHIFT);
> > +                       pmds += (extra + PMD_SIZE - 1) >> PMD_SHIFT;
> > +               } else {
> > +                       pmds += (range + PMD_SIZE - 1) >> PMD_SHIFT;
> > +               }
> >
> > -       if (use_pse) {
> > -               unsigned long extra;
> > -
> > -               extra = end - ((end>>PMD_SHIFT) << PMD_SHIFT);
> > +               if (mr[i].page_size_mask & (1 << PG_LEVEL_2M)) {
> > +                       extra = range - ((range >> PMD_SHIFT) << PMD_SHIFT);
> >  #ifdef CONFIG_X86_32
> > -               extra += PMD_SIZE;
> > +                       extra += PMD_SIZE;
> >  #endif
> > -               ptes = (extra + PAGE_SIZE - 1) >> PAGE_SHIFT;
> > -       } else
> > -               ptes = (end + PAGE_SIZE - 1) >> PAGE_SHIFT;
> > +                       /* The first 2/4M doesn't use large pages. */
> > +                       if (mr[i].start < PMD_SIZE)
> > +                               extra += range;
> 
> those three lines should be added back.
> 
> it just get reverted in 7b16bbf9

Could you please send a delta patch against tip:x86/urgent?

Thanks,

	Ingo


* Re: [tip:x86/urgent] x86, mm: Find_early_table_space based on ranges that are actually being mapped
  2012-10-25  7:55                                             ` Ingo Molnar
@ 2012-10-25 14:33                                               ` Yinghai Lu
  2012-10-25 22:23                                                 ` Jacob Shin
  2012-10-25 23:31                                                 ` [tip:x86/urgent] x86, mm: Undo incorrect revert in arch/x86/mm/ init.c tip-bot for Yinghai Lu
  0 siblings, 2 replies; 32+ messages in thread
From: Yinghai Lu @ 2012-10-25 14:33 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: hpa, linux-kernel, jacob.shin, tglx, trini, hpa, linux-tip-commits

[-- Attachment #1: Type: text/plain, Size: 535 bytes --]

On Thu, Oct 25, 2012 at 12:55 AM, Ingo Molnar <mingo@kernel.org> wrote:
>> > -               ptes = (end + PAGE_SIZE - 1) >> PAGE_SHIFT;
>> > +                       /* The first 2/4M doesn't use large pages. */
>> > +                       if (mr[i].start < PMD_SIZE)
>> > +                               extra += range;
>>
>> those three lines should be added back.

missed "not" ...

>>
>> it just get reverted in 7b16bbf9
>
> Could you please send a delta patch against tip:x86/urgent?

please check attached one.

Thanks

Yinghai

[-- Attachment #2: remove_wrong_addback.patch --]
[-- Type: application/octet-stream, Size: 859 bytes --]

Subject: [PATCH] x86, mm: Remove wrong add back

commit 844ab6f9
    x86, mm: Find_early_table_space based on ranges that are actually being mapped
wrongly added back some lines that had been reverted in commit 7b16bbf97
    Revert "x86/mm: Fix the size calculation of mapping tables"

Remove them again.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index bc287d6..d7aea41 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -65,10 +65,6 @@ static void __init find_early_table_space(struct map_range *mr, int nr_range)
 #ifdef CONFIG_X86_32
 			extra += PMD_SIZE;
 #endif
-			/* The first 2/4M doesn't use large pages. */
-			if (mr[i].start < PMD_SIZE)
-				extra += range;
-
 			ptes += (extra + PAGE_SIZE - 1) >> PAGE_SHIFT;
 		} else {
 			ptes += (range + PAGE_SIZE - 1) >> PAGE_SHIFT;

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [tip:x86/urgent] x86, mm: Find_early_table_space based on ranges that are actually being mapped
  2012-10-25 14:33                                               ` Yinghai Lu
@ 2012-10-25 22:23                                                 ` Jacob Shin
  2012-10-25 23:31                                                 ` [tip:x86/urgent] x86, mm: Undo incorrect revert in arch/x86/mm/ init.c tip-bot for Yinghai Lu
  1 sibling, 0 replies; 32+ messages in thread
From: Jacob Shin @ 2012-10-25 22:23 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Ingo Molnar, hpa, linux-kernel, tglx, trini, hpa, linux-tip-commits

On Thu, Oct 25, 2012 at 07:33:32AM -0700, Yinghai Lu wrote:
> On Thu, Oct 25, 2012 at 12:55 AM, Ingo Molnar <mingo@kernel.org> wrote:
> >> > -               ptes = (end + PAGE_SIZE - 1) >> PAGE_SHIFT;
> >> > +                       /* The first 2/4M doesn't use large pages. */
> >> > +                       if (mr[i].start < PMD_SIZE)
> >> > +                               extra += range;
> >>
> >> those three lines should be added back.
> 
> missed "not" ...
> 
> >>
> >> it just get reverted in 7b16bbf9
> >
> > Could you please send a delta patch against tip:x86/urgent?
> 
> please check attached one.

Acked-by: Jacob Shin <jacob.shin@amd.com>

Sorry about that, I just retrofitted the patch and didn't notice that those
lines had been reverted out.

Thanks!

> 
> Thanks
> 
> Yinghai




^ permalink raw reply	[flat|nested] 32+ messages in thread

* [tip:x86/urgent] x86, mm: Undo incorrect revert in arch/x86/mm/ init.c
  2012-10-25 14:33                                               ` Yinghai Lu
  2012-10-25 22:23                                                 ` Jacob Shin
@ 2012-10-25 23:31                                                 ` tip-bot for Yinghai Lu
  1 sibling, 0 replies; 32+ messages in thread
From: tip-bot for Yinghai Lu @ 2012-10-25 23:31 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, yinghai, jacob.shin, tglx, hpa

Commit-ID:  f82f64dd9f485e13f29f369772d4a0e868e5633a
Gitweb:     http://git.kernel.org/tip/f82f64dd9f485e13f29f369772d4a0e868e5633a
Author:     Yinghai Lu <yinghai@kernel.org>
AuthorDate: Thu, 25 Oct 2012 15:45:26 -0700
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Thu, 25 Oct 2012 15:45:45 -0700

x86, mm: Undo incorrect revert in arch/x86/mm/init.c

Commit

    844ab6f9 x86, mm: Find_early_table_space based on ranges that are actually being mapped

wrongly added back some lines that had been removed in commit

    7b16bbf97 Revert "x86/mm: Fix the size calculation of mapping tables"

Remove them again.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/CAE9FiQW_vuaYQbmagVnxT2DGsYc=9tNeAbdBq53sYkitPOwxSQ@mail.gmail.com
Acked-by: Jacob Shin <jacob.shin@amd.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
---
 arch/x86/mm/init.c |    4 ----
 1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index bc287d6..d7aea41 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -65,10 +65,6 @@ static void __init find_early_table_space(struct map_range *mr, int nr_range)
 #ifdef CONFIG_X86_32
 			extra += PMD_SIZE;
 #endif
-			/* The first 2/4M doesn't use large pages. */
-			if (mr[i].start < PMD_SIZE)
-				extra += range;
-
 			ptes += (extra + PAGE_SIZE - 1) >> PAGE_SHIFT;
 		} else {
 			ptes += (range + PAGE_SIZE - 1) >> PAGE_SHIFT;

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: BUG: 1bbbbe7 (x86: Exclude E820_RESERVED regions...) PANIC on boot
  2012-10-22 14:40               ` Jacob Shin
  2012-10-22 18:05                 ` Yinghai Lu
@ 2012-10-28 20:48                 ` Tom Rini
  1 sibling, 0 replies; 32+ messages in thread
From: Tom Rini @ 2012-10-28 20:48 UTC (permalink / raw)
  To: Jacob Shin; +Cc: hpa, Yinghai Lu, linux-kernel, H. Peter Anvin, stable

On 10/22/12 07:40, Jacob Shin wrote:
> On Sun, Oct 21, 2012 at 02:23:58PM -0700, Tom Rini wrote:
>> On 10/21/12 14:06, Jacob Shin wrote:
>>> Ah, sorry, this one should apply on top of 3.7-rc2:
>>>
>>> https://lkml.org/lkml/2012/8/24/469
>>>
>>> Could you try that? Just that single patch, not the whole patchset.
>>
>> That fixes it, replied with a note and Tested-by, thanks!
> 
> Thanks for testing!
> 
> hpa, so sorry, but it looks like we need one more patch [PATCH 2/5] x86:
> find_early_table_space based on memory ranges that are being mapped:
> 
>   https://lkml.org/lkml/2012/8/24/469
> 
> on top of this, because find_early_table_space calculation does not come out
> correctly for this particular E820 table that Tom has:
> 
>   http://pastebin.com/4eSPEAvB
> 
> The reason we hit this now, and never hit it before, is that previously
> the start was hard coded to 1UL<<32.

As a final follow-up, v3.7-rc3 does not have the problem I reported
previously.

-- 
Tom


^ permalink raw reply	[flat|nested] 32+ messages in thread

* BUG: 1bbbbe7 (x86: Exclude E820_RESERVED regions...) PANIC on boot
@ 2012-10-21  0:06 Tom Rini
  0 siblings, 0 replies; 32+ messages in thread
From: Tom Rini @ 2012-10-21  0:06 UTC (permalink / raw)
  To: Jacob Shin, linux-kernel, H. Peter Anvin; +Cc: stable

Hello all,

I grabbed 3.7-rc2 and found the following on boot:
PANIC: early exception 08 rip 246:10 error 81441d7f cr2 0

A git bisect says that this problem came from:
1bbbbe779aabe1f0768c2bf8f8c0a5583679b54a is the first bad commit
commit 1bbbbe779aabe1f0768c2bf8f8c0a5583679b54a
Author: Jacob Shin <jacob.shin@amd.com>
Date:   Thu Oct 20 16:15:26 2011 -0500

    x86: Exclude E820_RESERVED regions and memory holes above 4 GB from direct mapping.
    
    On systems with very large memory (1 TB in our case), BIOS may report a
    reserved region or a hole in the E820 map, even above the 4 GB range. Exclude
    these from the direct mapping.


The box in question is an Asus motherboard with AMD Phenom(tm) II X6
1100T and 16GB memory.  Happy to provide any other information required.

-- 
Tom

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2012-10-28 20:48 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <903a3ead-98b5-4afa-88a4-3dc723895e82@blur>
     [not found] ` <d556fc0f-da5d-4531-b331-6dc086461f34@blur>
2012-10-21  0:17   ` BUG: 1bbbbe7 (x86: Exclude E820_RESERVED regions...) PANIC on boot Tom Rini
2012-10-21  4:01     ` Yinghai Lu
2012-10-21  4:18       ` Jacob Shin
2012-10-21 17:51         ` Tom Rini
2012-10-21 21:06           ` Jacob Shin
2012-10-21 21:23             ` Tom Rini
2012-10-22 14:40               ` Jacob Shin
2012-10-22 18:05                 ` Yinghai Lu
2012-10-22 18:38                   ` Jacob Shin
2012-10-22 19:46                     ` Yinghai Lu
2012-10-22 20:26                       ` H. Peter Anvin
2012-10-22 20:50                         ` Yinghai Lu
2012-10-22 20:52                           ` H. Peter Anvin
2012-10-22 21:25                             ` Yinghai Lu
2012-10-22 21:27                               ` H. Peter Anvin
2012-10-22 23:35                                 ` Yinghai Lu
2012-10-24 16:48                                   ` Jacob Shin
2012-10-24 18:53                                     ` H. Peter Anvin
2012-10-24 19:53                                       ` Jacob Shin
2012-10-24 21:49                                         ` [tip:x86/urgent] x86, mm: Find_early_table_space based on ranges that are actually being mapped tip-bot for Jacob Shin
2012-10-25  6:42                                           ` Yinghai Lu
2012-10-25  7:55                                             ` Ingo Molnar
2012-10-25 14:33                                               ` Yinghai Lu
2012-10-25 22:23                                                 ` Jacob Shin
2012-10-25 23:31                                                 ` [tip:x86/urgent] x86, mm: Undo incorrect revert in arch/x86/mm/ init.c tip-bot for Yinghai Lu
2012-10-24 19:01                                   ` [tip:x86/urgent] x86, mm: Trim memory in memblock to be page aligned tip-bot for Yinghai Lu
2012-10-24 19:02                                   ` [tip:x86/urgent] x86, mm: Use memblock memory loop instead of e820_RAM tip-bot for Yinghai Lu
2012-10-22 21:00                           ` BUG: 1bbbbe7 (x86: Exclude E820_RESERVED regions...) PANIC on boot H. Peter Anvin
2012-10-22 21:06                             ` Yinghai Lu
2012-10-28 20:48                 ` Tom Rini
2012-10-21 17:52       ` Tom Rini
2012-10-21  0:06 Tom Rini
