* Re: Bug#584846: Detects only 64MB and fails to boot on Intel Green City board if e820 hooked by GRUB2
       [not found] <20100612060322.29053.94187.reportbug@feather>
@ 2010-06-12 13:58 ` Ben Hutchings
  2010-06-12 18:28   ` H. Peter Anvin
  0 siblings, 1 reply; 15+ messages in thread
From: Ben Hutchings @ 2010-06-12 13:58 UTC (permalink / raw)
  To: H. Peter Anvin, x86; +Cc: Josh Triplett, 584846, LKML

Josh Triplett reported this problem with memory sizing:

On Fri, 2010-06-11 at 23:03 -0700, Josh Triplett wrote:
> Package: linux-2.6
> Severity: normal
> 
> I managed to reproduce the problem using stock upstream kernels and
> defconfig, and with defconfig (and no initramfs) the kernel managed to
> use little enough memory that it booted successfully with <64MB of RAM.
> 
> Investigating, I found that Linux decided not to use e820, and instead
> decided to use the older BIOS function 0x88, which cannot report more
> than 64MB of RAM.
> 
> With some investigation and bisection, I managed to track the problem
> down to the following commit:
> 
> commit c549e71d073a6e9a4847497344db28a784061455
> Author: H. Peter Anvin <hpa@zytor.com>
> Date:   Sat Mar 28 13:53:26 2009 -0700
> 
>     x86, setup: ACPI 3, BIOS workaround for E820-probing code
> 
>     Impact: ACPI 3 spec compliance, BIOS bug workaround
> 
>     The ACPI 3 spec added another field to the E820 buffer -- which is
>     backwards incompatible, since it contains a validity bit.
>     Furthermore, there has been at least one report of a BIOS which
>     assumes that the buffer it is pointed at is the same buffer as for the
>     previous E820 call.  Therefore, read the data into a temporary buffer
>     and copy the standard part of it if and only if the valid bit is set.
> 
>     Signed-off-by: H. Peter Anvin <hpa@zytor.com>
> 
> 
> A kernel built from c549e71d073a6e9a4847497344db28a784061455 finds <64MB
> of RAM; a kernel built from c549e71d073a6e9a4847497344db28a784061455^
> successfully finds all 4GB of RAM.
> 
> Also note that newer upstream kernels, including v2.6.35-rc3, fail as
> well.  Since later kernels revert part of the above commit, the issue
> must lie with the parts of the commit not reverted.
> 
> And, again, I can reproduce this using the stock upstream GRUB2 1.98
> release built from source, by booting it from a USB key, and then
> booting the disk MBR via:
> 
> set root=(hd1)
> drivemap (hd1) (hd0)
> chainloader +1
> boot
> 
> 
> Nothing special about drivemap here; anything that uses grub's mmap
> module to reserve memory via e820 (GRUB_MACHINE_MEMORY_RESERVED) will
> cause grub to hook e820 and trigger this bug.  However, in stock grub,
> only drivemap does this.
> 
> - Josh Triplett
> 
> 
> 

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.


* Re: Bug#584846: Detects only 64MB and fails to boot on Intel Green City board if e820 hooked by GRUB2
  2010-06-12 13:58 ` Bug#584846: Detects only 64MB and fails to boot on Intel Green City board if e820 hooked by GRUB2 Ben Hutchings
@ 2010-06-12 18:28   ` H. Peter Anvin
  2010-06-12 18:55     ` Josh Triplett
  0 siblings, 1 reply; 15+ messages in thread
From: H. Peter Anvin @ 2010-06-12 18:28 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: x86, Josh Triplett, 584846, LKML

On 06/12/2010 06:58 AM, Ben Hutchings wrote:
> Josh Triplett reported this problem with memory sizing:
> 
>>
>> A kernel built from c549e71d073a6e9a4847497344db28a784061455 finds <64MB
>> of RAM; a kernel built from c549e71d073a6e9a4847497344db28a784061455^
>> successfully finds all 4GB of RAM.
>>
>> Also note that newer upstream kernels, including v2.6.35-rc3, fail as
>> well.  Since later kernels revert part of the above commit, the issue
>> must lie with the parts of the commit not reverted.
>>
>> And, again, I can reproduce this using the stock upstream GRUB2 1.98
>> release built from source, by booting it from a USB key, and then
>> booting the disk MBR via:
>>
>> set root=(hd1)
>> drivemap (hd1) (hd0)
>> chainloader +1
>> boot
>>
>> Nothing special about drivemap here; anything that uses grub's mmap
>> module to reserve memory via e820 (GRUB_MACHINE_MEMORY_RESERVED) will
>> cause grub to hook e820 and trigger this bug.  However, in stock grub,
>> only drivemap does this.
>>

It's kind of hard to know what is involved, since clearly it relates to
Grub2, which -- how do I say this politely -- seems to excel at doing
things in the most inferior way possible.  This is a great example of that.

The most likely reason it fails is that Grub2 uses ACPI 3-style reads
of the board memory map, gets wrong results for the same reasons the
kernel does, and then passes them downstream to the kernel.  As such, there
is absolutely nothing the kernel can do about it.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



* Re: Bug#584846: Detects only 64MB and fails to boot on Intel Green City board if e820 hooked by GRUB2
  2010-06-12 18:28   ` H. Peter Anvin
@ 2010-06-12 18:55     ` Josh Triplett
  2010-06-12 20:41       ` H. Peter Anvin
  0 siblings, 1 reply; 15+ messages in thread
From: Josh Triplett @ 2010-06-12 18:55 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Ben Hutchings, x86, 584846, LKML

On Sat, Jun 12, 2010 at 11:28:55AM -0700, H. Peter Anvin wrote:
> On 06/12/2010 06:58 AM, Ben Hutchings wrote:
> > Josh Triplett reported this problem with memory sizing:
> > 
> >>
> >> A kernel built from c549e71d073a6e9a4847497344db28a784061455 finds <64MB
> >> of RAM; a kernel built from c549e71d073a6e9a4847497344db28a784061455^
> >> successfully finds all 4GB of RAM.
> >>
> >> Also note that newer upstream kernels, including v2.6.35-rc3, fail as
> >> well.  Since later kernels revert part of the above commit, the issue
> >> must lie with the parts of the commit not reverted.
> >>
> >> And, again, I can reproduce this using the stock upstream GRUB2 1.98
> >> release built from source, by booting it from a USB key, and then
> >> booting the disk MBR via:
> >>
> >> set root=(hd1)
> >> drivemap (hd1) (hd0)
> >> chainloader +1
> >> boot
> >>
> >> Nothing special about drivemap here; anything that uses grub's mmap
> >> module to reserve memory via e820 (GRUB_MACHINE_MEMORY_RESERVED) will
> >> cause grub to hook e820 and trigger this bug.  However, in stock grub,
> >> only drivemap does this.
> >>
> 
> It's kind of hard to know what is involved, since clearly it relates to
> Grub2, which -- how do I say this politely -- seems to excel at doing
> things in the most inferior way possible.  This is a great example of that.
> 
> The most likely reason it fails is that Grub2 uses ACPI 3-style reads
> of the board memory map, gets wrong results for the same reasons the
> kernel does, and then passes them downstream to the kernel.  As such, there
> is absolutely nothing the kernel can do about it.

grub2 doesn't do ACPI 3 reads; it always asks for 20 bytes, not 24.

Also, note that it works with older Linux kernels (before the commit in
question) and fails with newer ones.  That doesn't rule out the
possibility of a grub bug instead of a Linux bug, but since older Linux
somehow coped with the situation, it seems like a regression that newer
Linux cannot cope.

- Josh Triplett


* Re: Bug#584846: Detects only 64MB and fails to boot on Intel Green City board if e820 hooked by GRUB2
  2010-06-12 18:55     ` Josh Triplett
@ 2010-06-12 20:41       ` H. Peter Anvin
  2010-06-12 21:45         ` Josh Triplett
       [not found]         ` <20100612222634.GA1785@feather>
  0 siblings, 2 replies; 15+ messages in thread
From: H. Peter Anvin @ 2010-06-12 20:41 UTC (permalink / raw)
  To: Josh Triplett; +Cc: Ben Hutchings, x86, 584846, LKML

On 06/12/2010 11:55 AM, Josh Triplett wrote:
>>
>> It's kind of hard to know what is involved, since clearly it relates to
>> Grub2, which -- how do I say this politely -- seems to excel at doing
>> things in the most inferior way possible.  This is a great example of that.
>>
>> The most likely reason it fails is that Grub2 uses ACPI 3-style reads
>> of the board memory map, gets wrong results for the same reasons the
>> kernel does, and then passes them downstream to the kernel.  As such, there
>> is absolutely nothing the kernel can do about it.
> 
> grub2 doesn't do ACPI 3 reads; it always asks for 20 bytes, not 24.
> 
> Also, note that it works with older Linux kernels (before the commit in
> question) and fails with newer ones.  That doesn't rule out the
> possibility of a grub bug instead of a Linux bug, but since older Linux
> somehow coped with the situation, it seems like a regression that newer
> Linux cannot cope.
> 

It's a regression of sorts, sure; but the new Linux code also boots on
real hardware which it didn't boot before.  Since this requires Grub2
plus specific hardware, it is hard for me to track down what the problem
might be, but a good step on the way might be to use the Grub2 boot
procedure (with the drive remapping) to chainboot Syslinux, and run
meminfo.c32 which is a memory report debugging tool; it might be able to
give some answers at least.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



* Re: Bug#584846: Detects only 64MB and fails to boot on Intel Green City board if e820 hooked by GRUB2
  2010-06-12 20:41       ` H. Peter Anvin
@ 2010-06-12 21:45         ` Josh Triplett
       [not found]         ` <20100612222634.GA1785@feather>
  1 sibling, 0 replies; 15+ messages in thread
From: Josh Triplett @ 2010-06-12 21:45 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Ben Hutchings, x86, 584846, LKML

On Sat, Jun 12, 2010 at 01:41:38PM -0700, H. Peter Anvin wrote:
> On 06/12/2010 11:55 AM, Josh Triplett wrote:
> >>
> >> It's kind of hard to know what is involved, since clearly it relates to
> >> Grub2, which -- how do I say this politely -- seems to excel at doing
> >> things in the most inferior way possible.  This is a great example of that.
> >>
> >> The most likely reason it fails is that Grub2 uses ACPI 3-style reads
> >> of the board memory map, gets wrong results for the same reasons the
> >> kernel does, and then passes them downstream to the kernel.  As such, there
> >> is absolutely nothing the kernel can do about it.
> > 
> > grub2 doesn't do ACPI 3 reads; it always asks for 20 bytes, not 24.
> > 
> > Also, note that it works with older Linux kernels (before the commit in
> > question) and fails with newer ones.  That doesn't rule out the
> > possibility of a grub bug instead of a Linux bug, but since older Linux
> > somehow coped with the situation, it seems like a regression that newer
> > Linux cannot cope.
> > 
> 
> It's a regression of sorts, sure; but the new Linux code also boots on
> real hardware which it didn't boot before.  Since this requires Grub2
> plus specific hardware, it is hard for me to track down what the problem
> might be, but a good step on the way might be to use the Grub2 boot
> procedure (with the drive remapping) to chainboot Syslinux, and run
> meminfo.c32 which is a memory report debugging tool; it might be able to
> give some answers at least.

Will do.

- Josh Triplett


* Re: Bug#584846: Detects only 64MB and fails to boot on Intel Green City board if e820 hooked by GRUB2
       [not found]         ` <20100612222634.GA1785@feather>
@ 2010-06-12 23:01           ` H. Peter Anvin
  2010-06-12 23:02           ` H. Peter Anvin
  1 sibling, 0 replies; 15+ messages in thread
From: H. Peter Anvin @ 2010-06-12 23:01 UTC (permalink / raw)
  To: Josh Triplett; +Cc: Ben Hutchings, x86, 584846, LKML

On 06/12/2010 03:26 PM, Josh Triplett wrote:
> 
> Done.  I've attached the output of meminfo with the e820 hook as
> meminfo-grub-hooked.jpg, and without the e820 hook as
> meminfo-unhooked.jpg.
> 
> Everything looks identical except for the region GRUB hooked right below
> the first reserved region; the unhooked version has available memory
> from 0-0x9cbf0, and the hooked version has available memory from
> 0-0x9cba0, then reserved from 0x9cba0-0x9cbec, then 4 bytes of available
> memory, and then the same reserved region as before.
> 

The new reserved area that Grub hooks is located inside an FBM ("DOS
RAM") reserved area, so Grub is somehow using memory that someone else
has already reserved!  The normal thing is to reserve something in FBM
and not in INT 15h if it is to be reserved only until the protected-mode
operating system starts, but in this case Grub puts something in there
which is a real-mode hook.  It *will* have overwritten something at this
point, the question is just what (and I have no idea how to find that out.)

Note that the unmodified entry conditions (unhooked) have FBM quite a bit
lower than the reserved area.  Something else that really confuses me is
that although the memory map has changed, the INT 15h vector itself is
still the same, so I'm really confused about how the actual hooking
happens in the first place...

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



* Re: Bug#584846: Detects only 64MB and fails to boot on Intel Green City board if e820 hooked by GRUB2
       [not found]         ` <20100612222634.GA1785@feather>
  2010-06-12 23:01           ` H. Peter Anvin
@ 2010-06-12 23:02           ` H. Peter Anvin
  2010-06-13  0:07             ` Josh Triplett
  1 sibling, 1 reply; 15+ messages in thread
From: H. Peter Anvin @ 2010-06-12 23:02 UTC (permalink / raw)
  To: Josh Triplett; +Cc: Ben Hutchings, x86, 584846, LKML

On 06/12/2010 03:26 PM, Josh Triplett wrote:
> 
> Everything looks identical except for the region GRUB hooked right below
> the first reserved region; the unhooked version has available memory
> from 0-0x9cbf0, and the hooked version has available memory from
> 0-0x9cba0, then reserved from 0x9cba0-0x9cbec, then 4 bytes of available
> memory, and then the same reserved region as before.
> 

Actually... are both these done by chainloading Grub (with and without
mapping), or is the unhooked done without chainloading Grub at all?

To me it looks like something is chaining INT 15h even in the
"unchained" case...

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



* Re: Bug#584846: Detects only 64MB and fails to boot on Intel Green City board if e820 hooked by GRUB2
  2010-06-12 23:02           ` H. Peter Anvin
@ 2010-06-13  0:07             ` Josh Triplett
  2010-06-13  0:16               ` H. Peter Anvin
  0 siblings, 1 reply; 15+ messages in thread
From: Josh Triplett @ 2010-06-13  0:07 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Ben Hutchings, x86, 584846, LKML

On Sat, Jun 12, 2010 at 04:02:44PM -0700, H. Peter Anvin wrote:
> On 06/12/2010 03:26 PM, Josh Triplett wrote:
> > 
> > Everything looks identical except for the region GRUB hooked right below
> > the first reserved region; the unhooked version has available memory
> > from 0-0x9cbf0, and the hooked version has available memory from
> > 0-0x9cba0, then reserved from 0x9cba0-0x9cbec, then 4 bytes of available
> > memory, and then the same reserved region as before.
> 
> Actually... are both these done by chainloading Grub (with and without
> mapping), or is the unhooked done without chainloading Grub at all?
> 
> To me it looks like something is chaining INT 15h even in the
> "unchained" case...

The "unhooked" case still chainloaded from GRUB, just without calling
drivemap and thus without hooking anything.  I can test without
chainloading from GRUB, though to the best of my knowledge GRUB doesn't
hook int 15 unless it needs to intercept e820 (and e801 and 88).

- Josh Triplett


* Re: Bug#584846: Detects only 64MB and fails to boot on Intel Green City board if e820 hooked by GRUB2
  2010-06-13  0:07             ` Josh Triplett
@ 2010-06-13  0:16               ` H. Peter Anvin
       [not found]                 ` <20100622052236.GA9130@feather>
  0 siblings, 1 reply; 15+ messages in thread
From: H. Peter Anvin @ 2010-06-13  0:16 UTC (permalink / raw)
  To: Josh Triplett; +Cc: Ben Hutchings, x86, 584846, LKML

On 06/12/2010 05:07 PM, Josh Triplett wrote:
> On Sat, Jun 12, 2010 at 04:02:44PM -0700, H. Peter Anvin wrote:
>> On 06/12/2010 03:26 PM, Josh Triplett wrote:
>>>
>>> Everything looks identical except for the region GRUB hooked right below
>>> the first reserved region; the unhooked version has available memory
>>> from 0-0x9cbf0, and the hooked version has available memory from
>>> 0-0x9cba0, then reserved from 0x9cba0-0x9cbec, then 4 bytes of available
>>> memory, and then the same reserved region as before.
>>
>> Actually... are both these done by chainloading Grub (with and without
>> mapping), or is the unhooked done without chainloading Grub at all?
>>
>> To me it looks like something is chaining INT 15h even in the
>> "unchained" case...
> 
> The "unhooked" case still chainloaded from GRUB, just without calling
> drivemap and thus without hooking anything.  I can test without
> chainloading from GRUB, though to the best of my knowledge GRUB doesn't
> hook int 15 unless it needs to intercept e820 (and e801 and 88).
> 
> - Josh Triplett

Well *something* is... and it might not be Grub but one of the expansion
ROMs.  If so, the problem is probably Grub stepping on the expansion ROM
by not honoring FBM.

	-hpa


-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



* Re: Bug#584846: Detects only 64MB and fails to boot on Intel Green City board if e820 hooked by GRUB2
       [not found]                 ` <20100622052236.GA9130@feather>
@ 2010-06-22  6:07                   ` H. Peter Anvin
  2010-06-22 16:07                     ` Josh Triplett
  2010-06-24  7:27                     ` Josh Triplett
  0 siblings, 2 replies; 15+ messages in thread
From: H. Peter Anvin @ 2010-06-22  6:07 UTC (permalink / raw)
  To: Josh Triplett; +Cc: Ben Hutchings, x86, 584846, LKML

On 06/21/2010 10:22 PM, Josh Triplett wrote:
> 
> How might I diagnose this further?  What might cause Linux to refuse to
> use the e820 and e801 results provided by GRUB, but accept the ones
> provided by the BIOS?
> 

This is interesting... you apparently have an ACPI 3-style e820 BIOS as
evidenced by the [1] markers, but Grub presents it as legacy style.
Now, the kernel shouldn't care, but this at least gives a clue.

Something that might be worthwhile is to add printf's to the kernel's
e820-parsing routine (in arch/x86/boot/e820.c) and figure out why it
doesn't like the output.  It's a bit strange that meminfo would produce
sensible-looking output (well, legal, at least; presenting a two-byte
range is rather beyond crazy, and so forth) and the kernel wouldn't
accept it, as the code is intentionally very similar.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



* Re: Bug#584846: Detects only 64MB and fails to boot on Intel Green City board if e820 hooked by GRUB2
  2010-06-22  6:07                   ` H. Peter Anvin
@ 2010-06-22 16:07                     ` Josh Triplett
  2010-06-24  7:27                     ` Josh Triplett
  1 sibling, 0 replies; 15+ messages in thread
From: Josh Triplett @ 2010-06-22 16:07 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Ben Hutchings, x86, 584846, LKML

On Mon, Jun 21, 2010 at 11:07:31PM -0700, H. Peter Anvin wrote:
> On 06/21/2010 10:22 PM, Josh Triplett wrote:
> > 
> > How might I diagnose this further?  What might cause Linux to refuse to
> > use the e820 and e801 results provided by GRUB, but accept the ones
> > provided by the BIOS?
> > 
> 
> This is interesting... you apparently have an ACPI 3-style e820 BIOS as
> evidenced by the [1] markers, but Grub presents it as legacy style.

Interesting!  I didn't know until now that that system supported the
ACPI 3-style e820.  (Looks like it doesn't actually use that mechanism
to disable any regions, though, fortunately.)

And yes, GRUB always provides 20-byte e820 entries, never larger,
regardless of what the caller asks for.

> Now, the kernel shouldn't care, but this at least gives a clue.
> 
> Something that might be worthwhile is to add printf's to the kernel's
> e820-parsing routine (in arch/x86/boot/e820.c) and figure out why it
> doesn't like the output.  It's a bit strange that meminfo would produce
> sensible-looking output (well, legal, at least; presenting a two-byte
> range is rather beyond crazy, and so forth) and the kernel wouldn't
> accept it, as the code is intentionally very similar.

Do you mean arch/x86/kernel/e820.c ?  OK, will do.

In the failure case Linux prints the memory map as "BIOS-88".  It looks
like that can only happen if the call to append_e820_map fails.  That in
turn looks like it can only happen if the number of map entries becomes
less than 2 (and since it shouldn't start that way that would have to
happen in sanitize_e820_map), or if 64-bit overflow occurred in the
memory map when adding the start and size fields.
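
The two rejection conditions just described can be sketched in C as
follows.  This is a simplified stand-in for illustration, not the
kernel's source: the struct and function names are hypothetical, and
only the "fewer than 2 entries" and "start + size wraps past 64 bits"
checks mentioned above are modelled.

```c
#include <stdint.h>

/* Hypothetical stand-in for one 20-byte e820 entry (not the kernel's struct). */
struct sketch_e820_entry {
    uint64_t addr; /* start of the region */
    uint64_t size; /* length of the region in bytes */
    uint32_t type; /* 1 = usable RAM, 2 = reserved, ... */
};

/* Returns 0 if the map passes both checks, -1 if it would be rejected. */
static int sketch_map_acceptable(const struct sketch_e820_entry *map, int nr_map)
{
    /* A map with fewer than two entries is treated as unusable. */
    if (nr_map < 2)
        return -1;

    for (int i = 0; i < nr_map; i++) {
        /* Unsigned addition wraps modulo 2^64. */
        uint64_t end = map[i].addr + map[i].size;

        /* start > end can only happen if addr + size overflowed. */
        if (map[i].addr > end)
            return -1;
    }
    return 0;
}
```

If either check fails, the whole e820 map is discarded, which matches
the observed fallback to the "BIOS-88" map.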

I'll investigate further and try to figure out exactly what caused Linux
to refuse to use the e820 map.

- Josh Triplett


* Re: Bug#584846: Detects only 64MB and fails to boot on Intel Green City board if e820 hooked by GRUB2
  2010-06-22  6:07                   ` H. Peter Anvin
  2010-06-22 16:07                     ` Josh Triplett
@ 2010-06-24  7:27                     ` Josh Triplett
  2010-06-24 14:18                       ` H. Peter Anvin
  1 sibling, 1 reply; 15+ messages in thread
From: Josh Triplett @ 2010-06-24  7:27 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Ben Hutchings, x86, 584846, LKML

On Mon, Jun 21, 2010 at 11:07:31PM -0700, H. Peter Anvin wrote:
> On 06/21/2010 10:22 PM, Josh Triplett wrote:
> > 
> > How might I diagnose this further?  What might cause Linux to refuse to
> > use the e820 and e801 results provided by GRUB, but accept the ones
> > provided by the BIOS?
> > 
> 
> This is interesting... you apparently have an ACPI 3-style e820 BIOS as
> evidenced by the [1] markers, but Grub presents it as legacy style.
> Now, the kernel shouldn't care, but this at least gives a clue.
> 
> Something that might be worthwhile is to add printf's to the kernel's
> e820-parsing routine (in arch/x86/boot/e820.c) and figure out why it
> doesn't like the output.  It's a bit strange that meminfo would produce
> sensible-looking output (well, legal, at least; presenting a two-byte
> range is rather beyond crazy, and so forth) and the kernel wouldn't
> accept it, as the code is intentionally very similar.

OK, I managed to track down the problem to a bug in GRUB's int15 hook
code, which older Linux kernels didn't run into.

GRUB's int15 hook, when it returned, would stc or clc as appropriate,
and then iret, replacing the carry flag it set with the original flags
set on entry to int15.  More recent Linux kernels had CF=1 on entry to
the int15 hook, so GRUB's iret left CF=1, and detect_memory_e820 would
treat that as the end of the e820 map.

This same problem applies to the e801 and 88 handlers, likely triggering
the error case in detect_memory_e801 as well.  detect_memory_88 doesn't
actually check CF, though.
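
To make the flag handling concrete, here is a minimal sketch of the
effect as a small C model rather than real-mode code.  The names are
hypothetical and the model tracks only CF and IF; it assumes the usual
behaviour that INT pushes FLAGS and then clears IF, and that iret pops
the pushed FLAGS back.

```c
#include <stdint.h>

#define SKETCH_CF 0x0001u /* carry flag, bit 0 of FLAGS */
#define SKETCH_IF 0x0200u /* interrupt-enable flag, bit 9 of FLAGS */

/*
 * Model of a handler that ends in "clc ; iret" or "stc ; iret".
 * flags_on_entry is the caller's FLAGS at the INT instruction;
 * handler_carry is the CF value the handler tried to report.
 * Returns the FLAGS the caller sees after iret.
 */
static uint16_t sketch_flags_after_plain_iret(uint16_t flags_on_entry,
                                              int handler_carry)
{
    uint16_t saved = flags_on_entry;                /* INT pushes FLAGS */
    uint16_t live = (uint16_t)(saved & ~SKETCH_IF); /* INT clears IF */

    /* stc or clc only changes the live FLAGS register. */
    if (handler_carry)
        live = (uint16_t)(live | SKETCH_CF);
    else
        live = (uint16_t)(live & ~SKETCH_CF);
    (void)live; /* thrown away: iret reloads FLAGS from the stack */

    return saved;
}

/* Hypothetical helper: would a caller that checks CF see an error? */
static int sketch_caller_sees_error(uint16_t flags_after)
{
    return (flags_after & SKETCH_CF) != 0;
}
```

In this model the handler's clc or stc never reaches the caller: with
CF=1 on entry a successful call still looks like an error, and with
CF=0 on entry a failing call looks like success.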

(Fun debugging trick: in detect_memory_e820, since I couldn't call
printk and I wanted to print something that would get preserved in
dmesg, I stashed debug values in boot_params._pad9 and then printk'd
them from default_machine_specific_memory_setup.)

The following patch fixes GRUB; with this patch, I can reserve memory
(such as with drivemap), boot 2.6.35-rc3 successfully, and it detects
all of my RAM.


=== modified file 'mmap/i386/pc/mmap_helper.S'
--- mmap/i386/pc/mmap_helper.S	2010-03-26 23:04:14 +0000
+++ mmap/i386/pc/mmap_helper.S	2010-06-24 06:54:54 +0000
@@ -59,7 +59,7 @@
 	movw %bx, %dx
 	pop %ds
 	clc
-	iret
+	lret $2
 
 LOCAL (h88):
 	popf
@@ -69,7 +69,7 @@
 	movw DS (LOCAL (kbin16mb)), %ax
 	pop %ds
 	clc
-	iret
+	lret $2
 
 LOCAL (e820):
 	popf
@@ -101,13 +101,13 @@
 	mov $0x534d4150, %eax
 	pop %ds
 	clc
-	iret
+	lret $2
 LOCAL (errexit):
 	mov $0x534d4150, %eax
 	pop %ds
+	xor %bx, %bx
 	stc
-	xor %bx, %bx
-	iret
+	lret $2
 
 VARIABLE(grub_machine_mmaphook_mmap_num)
 LOCAL (mmap_num):


I don't see any trivial way Linux could work around this bug.  If the
e820 call left CF=0 on entry, then the error case would get incorrectly
treated as a valid e820 entry (albeit a final one, since bx=0).

- Josh Triplett


* Re: Bug#584846: Detects only 64MB and fails to boot on Intel Green City board if e820 hooked by GRUB2
  2010-06-24  7:27                     ` Josh Triplett
@ 2010-06-24 14:18                       ` H. Peter Anvin
  2010-06-24 19:01                         ` Josh Triplett
  0 siblings, 1 reply; 15+ messages in thread
From: H. Peter Anvin @ 2010-06-24 14:18 UTC (permalink / raw)
  To: Josh Triplett; +Cc: Ben Hutchings, x86, 584846, LKML

On 06/24/2010 12:27 AM, Josh Triplett wrote:
> 
> The following patch fixes GRUB; with this patch, I can reserve memory
> (such as with drivemap), boot 2.6.35-rc3 successfully, and it detects
> all of my RAM.
> 

Congratulations!  You have just committed the single most common BIOS
implementation bug.  (Sorry for the sarcasm, but this seems to be a bug
that almost everyone who tries to implement BIOS makes at one point or
another... even the original IBM BIOS had it in at least one place.)

You *must not* use "lret $2" to return to the caller, because the INT
instruction will have cleared IF after pushing the registers to the
stack.  You have to restore the original IF, which "lret $2" will not do.

The best way to do this is to clobber the low byte of the flags register
on the stack.  Since CF is bit 0, and the low byte only contains
arithmetic flags anyway, you can simply overwrite the low byte with 0
for CF=0 and 1 for CF=1.  This will zero SF, ZF, AF and PF as a side
effect, which is OK for almost all uses (including e820/e801/88.)

If you don't already have a pointer to the stack, you have to make one,
since it is not possible in 16-bit mode to access the stack directly.
One option is to replace each iret with a jump to the following common code:

carry_cf_iret:
	pushw	%bp
	movw	%sp, %bp
	setc	6(%bp)		/* Set CF on stack based on EFLAGS */
	popw	%bp
	iret
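
For readers wondering where the offset 6 comes from, the stack frame
can be modelled byte by byte in C.  This is only an illustrative
sketch with hypothetical names: it assumes a 16-bit INT frame (FLAGS,
CS, IP pushed in that order) plus the %bp pushed by the stub, and it
returns the raw stack image without modelling the CPU's fixed reserved
FLAGS bits.

```c
#include <stdint.h>

/* Byte offsets from %bp after "pushw %bp ; movw %sp, %bp" in a 16-bit
 * interrupt handler: the saved %bp sits below the INT return frame. */
enum {
    SKETCH_OFF_BP    = 0, /* pushed by the stub itself */
    SKETCH_OFF_IP    = 2, /* pushed by INT */
    SKETCH_OFF_CS    = 4, /* pushed by INT */
    SKETCH_OFF_FLAGS = 6, /* pushed by INT, low byte first */
    SKETCH_FRAME_LEN = 8
};

/* Model of "setc 6(%bp)": write 1 or 0 over the low byte of saved FLAGS. */
static void sketch_setc_saved_flags(uint8_t frame[SKETCH_FRAME_LEN],
                                    int live_carry)
{
    frame[SKETCH_OFF_FLAGS] = live_carry ? 1 : 0;
}

/* Model of the FLAGS word that the final iret pops from the frame. */
static uint16_t sketch_flags_popped_by_iret(const uint8_t frame[SKETCH_FRAME_LEN])
{
    return (uint16_t)(frame[SKETCH_OFF_FLAGS] |
                      (frame[SKETCH_OFF_FLAGS + 1] << 8));
}
```

Because only the low byte is overwritten, the high byte of the saved
FLAGS, which holds IF, is returned to the caller unchanged.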

> 
> I don't see any trivial way Linux could work around this bug.  If the
> e820 call left CF=0 on entry, then the error case would get incorrectly
> treated as a valid e820 entry (albeit a final one, since bx=0).
> 

More importantly, it makes the error direction the wrong one.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



* Re: Bug#584846: Detects only 64MB and fails to boot on Intel Green City board if e820 hooked by GRUB2
  2010-06-24 14:18                       ` H. Peter Anvin
@ 2010-06-24 19:01                         ` Josh Triplett
  2010-06-24 20:58                           ` H. Peter Anvin
  0 siblings, 1 reply; 15+ messages in thread
From: Josh Triplett @ 2010-06-24 19:01 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Ben Hutchings, x86, 584846, LKML

On Thu, Jun 24, 2010 at 07:18:34AM -0700, H. Peter Anvin wrote:
> On 06/24/2010 12:27 AM, Josh Triplett wrote:
> > The following patch fixes GRUB; with this patch, I can reserve memory
> > (such as with drivemap), boot 2.6.35-rc3 successfully, and it detects
> > all of my RAM.
> 
> Congratulations!  You have just committed the single most common BIOS
> implementation bug.  (Sorry for the sarcasm, but this seems to be a bug
> that almost everyone who tries to implement BIOS makes at one point or
> another... even the original IBM BIOS had it in at least one place.)

And a rather large amount of sample interrupt code found on the web,
including the e820 hook from the version of gPXE/Etherboot that I used
as an example. :)  Given that I just tested against Linux, which very
carefully works around that particular BIOS bug, I didn't run into any
issue.

So, how high does GRUB's bug ("stc ; iret"/"clc ; iret") rank on the
list of common BIOS implementation bugs?

> You *must not* use "lret $2" to return to the caller, because the INT
> instruction will have cleared IF after pushing the registers to the
> stack.  You have to restore the original IF, which "lret $2" will not do.

The thought had crossed my mind to preserve the caller's flags, but I'd
ignored it because I'd figured that the interrupt handler could safely
trash the caller's flags as long as it set or cleared carry
appropriately.  I'd forgotten that IF lives there too. :)

> The best way to do this is to clobber the low byte of the flags register
> on the stack.  Since CF is bit 0, and the low byte only contains
> arithmetic flags anyway, you can simply overwrite the low byte with 0
> for CF=0 and 1 for CF=1.  This will zero SF, ZF, AF and PF as a side
> effect, which is OK for almost all uses (including e820/e801/88.)
> 
> If you don't already have a pointer to the stack, you have to make one,
> since it is not possible in 16-bit mode to access the stack directly.
> One option is to replace each iret with a jump to the following common code:
> 
> carry_cf_iret:
> 	pushw	%bp
> 	movw	%sp, %bp
> 	setc	6(%bp)		/* Set CF on stack based on EFLAGS */
> 	popw	%bp
> 	iret

Nice.  Cleaner than the andw/orw solution I'd thought of using (and
actually written before dropping it in favor of "lret $2") to
specifically clear/set CF on the stack, since it doesn't require
separate exit paths for success and failure.

New patch:

--- mmap/i386/pc/mmap_helper.S	2010-03-26 23:04:14 +0000
+++ mmap/i386/pc/mmap_helper.S	2010-06-24 18:52:56 +0000
@@ -59,7 +59,7 @@
 	movw %bx, %dx
 	pop %ds
 	clc
-	iret
+	jmp LOCAL (iret_cf)
 
 LOCAL (h88):
 	popf
@@ -69,7 +69,7 @@
 	movw DS (LOCAL (kbin16mb)), %ax
 	pop %ds
 	clc
-	iret
+	jmp LOCAL (iret_cf)
 
 LOCAL (e820):
 	popf
@@ -101,12 +101,19 @@
 	mov $0x534d4150, %eax
 	pop %ds
 	clc
-	iret
+	jmp LOCAL (iret_cf)
 LOCAL (errexit):
 	mov $0x534d4150, %eax
 	pop %ds
+	xor %bx, %bx
 	stc
-	xor %bx, %bx
+	jmp LOCAL (iret_cf)
+
+LOCAL (iret_cf):
+	pushw %bp
+	movw %sp, %bp
+	setc 6(%bp)
+	popw %bp
 	iret
 
 VARIABLE(grub_machine_mmaphook_mmap_num)


- Josh Triplett


* Re: Bug#584846: Detects only 64MB and fails to boot on Intel Green City board if e820 hooked by GRUB2
  2010-06-24 19:01                         ` Josh Triplett
@ 2010-06-24 20:58                           ` H. Peter Anvin
  0 siblings, 0 replies; 15+ messages in thread
From: H. Peter Anvin @ 2010-06-24 20:58 UTC (permalink / raw)
  To: Josh Triplett; +Cc: Ben Hutchings, x86, 584846, LKML

On 06/24/2010 12:01 PM, Josh Triplett wrote:
> On Thu, Jun 24, 2010 at 07:18:34AM -0700, H. Peter Anvin wrote:
>> On 06/24/2010 12:27 AM, Josh Triplett wrote:
>>> The following patch fixes GRUB; with this patch, I can reserve memory
>>> (such as with drivemap), boot 2.6.35-rc3 successfully, and it detects
>>> all of my RAM.
>>
>> Congratulations!  You have just committed the single most common BIOS
>> implementation bug.  (Sorry for the sarcasm, but this seems to be a bug
>> that almost everyone who tries to implement BIOS makes at one point or
>> another... even the original IBM BIOS had it in at least one place.)
> 
> And a rather large amount of sample interrupt code found on the web,
> including the e820 hook from the version of gPXE/Etherboot that I used
> as an example. :)  Given that I just tested against Linux, which very
> carefully works around that particular BIOS bug, I didn't run into any
> issue.
> 
> So, how high does GRUB's bug ("stc ; iret"/"clc ; iret") rank on the
> list of common BIOS implementation bugs?
> 

Less common, since that one is apparently more obvious to people (you
only have to think one step ahead instead of two steps ahead.)

There is a reason Linux works around this and similar bugs... it truly
is extremely common (and does cause real problems in real code.)

	-hpa


