* Why does memblock only refer to E820 table and not EFI Memory Map?
@ 2019-07-20 22:52 ` Sai Praneeth Prakhya
  0 siblings, 0 replies; 7+ messages in thread
From: Sai Praneeth Prakhya @ 2019-07-20 22:52 UTC (permalink / raw)
  To: linux-mm, linux-efi; +Cc: mingo, bp, peterz, ard.biesheuvel, rppt, pj

Hi All,

Disclaimer:
1. Please note that this discussion is x86 specific.
2. The statements below reflect my understanding of the kernel and I could have
missed some things, so please let me know if I have misunderstood something.
3. I have focused only on memblock here because, if I understand correctly,
memblock is the base that feeds the other memory management subsystems in the
kernel (like the buddy allocator).

On x86 platforms, there are two sources through which the kernel learns about
physical memory in the system, namely the E820 table and the EFI Memory Map.
Each table describes which regions of system memory are usable by the kernel
and which regions should be preserved (i.e. reserved regions that typically
hold BIOS code/data) so that no other component in the system reads or writes
these regions. I think they duplicate the same information, and hence I have a
couple of questions about them:

1. I see that only the E820 table is consumed by the kernel [1] (i.e. by the
memblock subsystem) to distinguish between "usable" and "reserved" regions.
Assume someone has called memblock_alloc(): the memblock subsystem services
the caller by allocating memory from "usable" regions, and it knows about these
*only* from the E820 table [2] (it does not check whether the EFI Memory Map
also says that the region is usable). So, why isn't the kernel taking the EFI
Memory Map into consideration? (I see that it does so only when the
"add_efi_memmap" kernel command line argument is passed, i.e. passing this
argument updates the E820 table based on the EFI Memory Map) [3]. The problem I
see with memblock not taking the EFI Memory Map into consideration is that we
are ignoring the main purpose for which the EFI Memory Map exists.

2. Why doesn't the kernel enable "add_efi_memmap" by default? From the commit
"200001eb140e: x86 boot: only pick up additional EFI memmap if add_efi_memmap
flag", I couldn't understand why this decision was made. Shouldn't we prefer
the EFI Memory Map over the E820 table, since EFI is the newer interface and
E820 is legacy?

3. Why isn't the kernel checking that the two tables, E820 and the EFI Memory
Map, are in sync? That is, is there any *possibility* that a buggy BIOS could
report a region as usable in the E820 table and as reserved in the EFI Memory
Map?
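
A minimal sketch of the behaviour described in question 1, written as a
user-space Python model rather than kernel code. The names E820Entry,
memblock_setup and the first-fit memblock_alloc below are simplified stand-ins
(assumptions for illustration); the real memblock allocator has a different
search order and API. The point is only that the allocator sees the regions
that the E820 table marks as usable and nothing else:

```python
from dataclasses import dataclass

# E820 type codes for the two cases modelled here.
E820_TYPE_RAM = 1
E820_TYPE_RESERVED = 2


@dataclass(frozen=True)
class E820Entry:
    addr: int   # physical start address
    size: int   # length in bytes
    type: int   # one of the E820_TYPE_* codes


def memblock_setup(e820_table):
    """Return the half-open (start, end) ranges registered as allocatable.

    Only RAM-typed entries are registered; every other entry is invisible
    to the allocator. No second table is consulted in this model.
    """
    return [(e.addr, e.addr + e.size)
            for e in e820_table if e.type == E820_TYPE_RAM]


def memblock_alloc(memory, reserved, size):
    """Hypothetical first-fit allocator over the registered ranges.

    `reserved` holds ranges already handed out and is updated in place.
    Returns the start address, or None when nothing fits.
    """
    for start, end in sorted(memory):
        candidate = start
        for r_start, r_end in sorted(reserved):
            # Move past any earlier allocation that overlaps the candidate.
            if r_start < candidate + size and candidate < r_end:
                candidate = r_end
        if candidate + size <= end:
            reserved.append((candidate, candidate + size))
            return candidate
    return None
```

With a table that has a reserved hole between two RAM entries, allocations
made through this model can never land inside the hole, whatever a second
memory map might say about it.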

[1] 
https://elixir.bootlin.com/linux/latest/source/arch/x86/kernel/setup.c#L1106
[2] 
https://elixir.bootlin.com/linux/latest/source/arch/x86/kernel/e820.c#L1265
[3] 
https://elixir.bootlin.com/linux/latest/source/arch/x86/platform/efi/efi.c#L129

Regards,
Sai




* Re: Why does memblock only refer to E820 table and not EFI Memory Map?
  2019-07-20 22:52 ` Sai Praneeth Prakhya
  (?)
@ 2019-07-23  8:09 ` Dave Young
  2019-07-23 17:11   ` Prakhya, Sai Praneeth
  -1 siblings, 1 reply; 7+ messages in thread
From: Dave Young @ 2019-07-23  8:09 UTC (permalink / raw)
  To: Sai Praneeth Prakhya
  Cc: linux-mm, linux-efi, mingo, bp, peterz, ard.biesheuvel, rppt, pj

Hi,
On 07/20/19 at 03:52pm, Sai Praneeth Prakhya wrote:
> Hi All,
> 
> Disclaimer:
> 1. Please note that this discussion is x86 specific.
> 2. The statements below reflect my understanding of the kernel and I could have
> missed some things, so please let me know if I have misunderstood something.
> 3. I have focused only on memblock here because, if I understand correctly,
> memblock is the base that feeds the other memory management subsystems in the
> kernel (like the buddy allocator).
> 
> On x86 platforms, there are two sources through which the kernel learns about
> physical memory in the system, namely the E820 table and the EFI Memory Map.
> Each table describes which regions of system memory are usable by the kernel
> and which regions should be preserved (i.e. reserved regions that typically
> hold BIOS code/data) so that no other component in the system reads or writes
> these regions. I think they duplicate the same information, and hence I have a
> couple of questions about them:
> 
> 1. I see that only the E820 table is consumed by the kernel [1] (i.e. by the
> memblock subsystem) to distinguish between "usable" and "reserved" regions.
> Assume someone has called memblock_alloc(): the memblock subsystem services
> the caller by allocating memory from "usable" regions, and it knows about these
> *only* from the E820 table [2] (it does not check whether the EFI Memory Map
> also says that the region is usable). So, why isn't the kernel taking the EFI
> Memory Map into consideration? (I see that it does so only when the
> "add_efi_memmap" kernel command line argument is passed, i.e. passing this
> argument updates the E820 table based on the EFI Memory Map) [3]. The problem I
> see with memblock not taking the EFI Memory Map into consideration is that we
> are ignoring the main purpose for which the EFI Memory Map exists.

https://blog.fpmurphy.com/2012/08/uefi-memory-v-e820-memory.html
The blog post above probably explains some of the background.

> 
> 2. Why doesn't the kernel enable "add_efi_memmap" by default? From the commit
> "200001eb140e: x86 boot: only pick up additional EFI memmap if add_efi_memmap
> flag", I couldn't understand why this decision was made. Shouldn't we prefer
> the EFI Memory Map over the E820 table, since EFI is the newer interface and
> E820 is legacy?
> 
> 3. Why isn't the kernel checking that the two tables, E820 and the EFI Memory
> Map, are in sync? That is, is there any *possibility* that a buggy BIOS could
> report a region as usable in the E820 table and as reserved in the EFI Memory
> Map?
> 
> [1] 
> https://elixir.bootlin.com/linux/latest/source/arch/x86/kernel/setup.c#L1106
> [2] 
> https://elixir.bootlin.com/linux/latest/source/arch/x86/kernel/e820.c#L1265
> [3] 
> https://elixir.bootlin.com/linux/latest/source/arch/x86/platform/efi/efi.c#L129
> 
> Regards,
> Sai
> 

Thanks
Dave


* RE: Why does memblock only refer to E820 table and not EFI Memory Map?
  2019-07-23  8:09 ` Dave Young
@ 2019-07-23 17:11   ` Prakhya, Sai Praneeth
  0 siblings, 0 replies; 7+ messages in thread
From: Prakhya, Sai Praneeth @ 2019-07-23 17:11 UTC (permalink / raw)
  To: Dave Young
  Cc: linux-mm, linux-efi, mingo, bp, peterz, ard.biesheuvel, rppt, pj

> > On x86 platforms, there are two sources through which the kernel learns
> > about physical memory in the system, namely the E820 table and the EFI
> > Memory Map. Each table describes which regions of system memory are usable
> > by the kernel and which regions should be preserved (i.e. reserved regions
> > that typically hold BIOS code/data) so that no other component in the
> > system reads or writes these regions. I think they duplicate the same
> > information, and hence I have a couple of questions about them:
> >
> > 1. I see that only the E820 table is consumed by the kernel [1] (i.e. by
> > the memblock subsystem) to distinguish between "usable" and "reserved"
> > regions. Assume someone has called memblock_alloc(): the memblock
> > subsystem services the caller by allocating memory from "usable" regions,
> > and it knows about these *only* from the E820 table [2] (it does not check
> > whether the EFI Memory Map also says that the region is usable). So, why
> > isn't the kernel taking the EFI Memory Map into consideration? (I see that
> > it does so only when the "add_efi_memmap" kernel command line argument is
> > passed, i.e. passing this argument updates the E820 table based on the EFI
> > Memory Map) [3]. The problem I see with memblock not taking the EFI Memory
> > Map into consideration is that we are ignoring the main purpose for which
> > the EFI Memory Map exists.
> 
> https://blog.fpmurphy.com/2012/08/uefi-memory-v-e820-memory.html
> The blog post above probably explains some of the background.

Thanks a lot, Dave! The link was helpful. It explained that Linus and HPA were
not very happy with EFI, that things were going well with E820, and hence that
E820 was given preference over EFI.

But sadly, I am not 100% convinced yet :( (just my thoughts)
This decision was made a decade ago when EFI wasn't stable. Now that UEFI is the
de facto standard on most x86 platforms (and since I believe UEFI is getting
better), I still find it hard to accept that the kernel throws away the EFI
Memory Map (unless explicitly asked to use it via "add_efi_memmap").

Regards,
Sai


* Re: Why does memblock only refer to E820 table and not EFI Memory Map?
  2019-07-20 22:52 ` Sai Praneeth Prakhya
  (?)
  (?)
@ 2019-07-23 21:38 ` Ricardo Neri
  2019-07-23 22:01     ` Sai Praneeth Prakhya
  -1 siblings, 1 reply; 7+ messages in thread
From: Ricardo Neri @ 2019-07-23 21:38 UTC (permalink / raw)
  To: Sai Praneeth Prakhya
  Cc: linux-mm, linux-efi, mingo, bp, peterz, ard.biesheuvel, rppt, pj

On Sat, Jul 20, 2019 at 03:52:04PM -0700, Sai Praneeth Prakhya wrote:
> Hi All,
> 
> Disclaimer:
> 1. Please note that this discussion is x86 specific.
> 2. The statements below reflect my understanding of the kernel and I could have
> missed some things, so please let me know if I have misunderstood something.
> 3. I have focused only on memblock here because, if I understand correctly,
> memblock is the base that feeds the other memory management subsystems in the
> kernel (like the buddy allocator).
> 
> On x86 platforms, there are two sources through which the kernel learns about
> physical memory in the system, namely the E820 table and the EFI Memory Map.
> Each table describes which regions of system memory are usable by the kernel
> and which regions should be preserved (i.e. reserved regions that typically
> hold BIOS code/data) so that no other component in the system reads or writes
> these regions. I think they duplicate the same information, and hence I have a
> couple of questions about them:

But isn't it true that on x86 systems the E820 table is populated from the EFI
memory map? At least on systems with EFI firmware and a Linux that understands
EFI. If booting from the EFI stub, the stub takes the EFI memory map and
assembles the E820 table passed as part of the boot params [4]. It also handles
the case where there are more than 128 entries in the table [5]. Thus, if
booting as an EFI application, the kernel will definitely use the EFI memory
map. If Linux's EFI entry point is not used, the bootloader should do the same.
For instance, grub also reads the EFI memory map to assemble the E820 memory
map [6], [7], [8].
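
The conversion described above can be sketched roughly as follows. This is a
user-space Python model, not the stub's code: the type mapping is reproduced
from memory and should be checked against [4], and efi_to_e820 and its tuple
layout are hypothetical names chosen for illustration. It shows the two points
made above: EFI descriptors are translated into E820 types (adjacent ranges of
the same type are merged in this sketch), and entries beyond the 128 that fit
in the boot params are handed over separately.

```python
EFI_PAGE_SIZE = 4096  # EFI descriptors count 4 KiB pages

# EFI memory type name -> E820 type label.
# Assumption: this mirrors the stub's mapping; verify against the source.
EFI_TO_E820 = {
    "EFI_RESERVED_TYPE": "reserved",
    "EFI_RUNTIME_SERVICES_CODE": "reserved",
    "EFI_RUNTIME_SERVICES_DATA": "reserved",
    "EFI_MEMORY_MAPPED_IO": "reserved",
    "EFI_MEMORY_MAPPED_IO_PORT_SPACE": "reserved",
    "EFI_PAL_CODE": "reserved",
    "EFI_UNUSABLE_MEMORY": "unusable",
    "EFI_ACPI_RECLAIM_MEMORY": "ACPI data",
    "EFI_ACPI_MEMORY_NVS": "ACPI NVS",
    "EFI_PERSISTENT_MEMORY": "persistent",
    "EFI_LOADER_CODE": "usable",
    "EFI_LOADER_DATA": "usable",
    "EFI_BOOT_SERVICES_CODE": "usable",
    "EFI_BOOT_SERVICES_DATA": "usable",
    "EFI_CONVENTIONAL_MEMORY": "usable",
}


def efi_to_e820(efi_map, max_entries=128):
    """Convert (phys_addr, num_pages, efi_type) descriptors to E820 entries.

    Returns (table, overflow): `table` holds at most `max_entries`
    (addr, size, type) tuples, `overflow` holds whatever did not fit.
    """
    entries = []
    for phys_addr, num_pages, efi_type in efi_map:
        e820_type = EFI_TO_E820.get(efi_type)
        if e820_type is None:
            continue  # types without a mapping are skipped in this sketch
        size = num_pages * EFI_PAGE_SIZE
        if entries:
            prev_addr, prev_size, prev_type = entries[-1]
            # Merge with the previous entry when contiguous and same type.
            if prev_type == e820_type and prev_addr + prev_size == phys_addr:
                entries[-1] = (prev_addr, prev_size + size, e820_type)
                continue
        entries.append((phys_addr, size, e820_type))
    return entries[:max_entries], entries[max_entries:]
```

Note that boot services memory ends up as usable in this mapping, so the
resulting E820 table is a lossy view of the EFI memory map rather than a copy
of it.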

> 
> 1. I see that only the E820 table is consumed by the kernel [1] (i.e. by the
> memblock subsystem) to distinguish between "usable" and "reserved" regions.
> Assume someone has called memblock_alloc(): the memblock subsystem services
> the caller by allocating memory from "usable" regions, and it knows about these
> *only* from the E820 table [2] (it does not check whether the EFI Memory Map
> also says that the region is usable). So, why isn't the kernel taking the EFI
> Memory Map into consideration? (I see that it does so only when the
> "add_efi_memmap" kernel command line argument is passed, i.e. passing this
> argument updates the E820 table based on the EFI Memory Map) [3]. The problem I
> see with memblock not taking the EFI Memory Map into consideration is that we
> are ignoring the main purpose for which the EFI Memory Map exists.
> 
> 2. Why doesn't the kernel enable "add_efi_memmap" by default? From the commit
> "200001eb140e: x86 boot: only pick up additional EFI memmap if add_efi_memmap
> flag", I couldn't understand why this decision was made. Shouldn't we prefer
> the EFI Memory Map over the E820 table, since EFI is the newer interface and
> E820 is legacy?

I did a quick experiment with and without add_efi_memmap. The e820
table looked exactly the same. I guess this shows that what I wrote
above makes sense ;) . Have you observed any difference?
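
One way to make such a comparison repeatable is a small Python sketch that
parses two captured logs and reports the entries present in only one of them.
It assumes each table was saved as text lines of the form
"[mem 0xSTART-0xEND] type", which is how the kernel prints E820 entries at
boot; parse_e820 and diff_e820 are hypothetical helper names. Which log (if
any) reflects the table after add_efi_memmap has been applied is not covered
here and would need checking separately.

```python
import re

# Matches the "[mem 0x...-0x...] type" part of a line, whatever precedes it
# (timestamp, "BIOS-e820:" prefix, and so on).
E820_LINE = re.compile(r"\[mem 0x([0-9a-fA-F]+)-0x([0-9a-fA-F]+)\]\s+(\S.*)$")


def parse_e820(log_text):
    """Return a list of (start, end, type) tuples; end is inclusive, as printed."""
    entries = []
    for line in log_text.splitlines():
        match = E820_LINE.search(line)
        if match:
            start = int(match.group(1), 16)
            end = int(match.group(2), 16)
            entries.append((start, end, match.group(3).strip()))
    return entries


def diff_e820(log_a, log_b):
    """Return (only_in_a, only_in_b) as sorted lists of entries."""
    a = set(parse_e820(log_a))
    b = set(parse_e820(log_b))
    return sorted(a - b), sorted(b - a)
```

Two empty lists mean the captured tables are identical, which matches the
result of the experiment described above.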

Thanks and BR,
Ricardo

[4]. https://elixir.bootlin.com/linux/latest/source/arch/x86/boot/compressed/eboot.c#L516
[5]. https://elixir.bootlin.com/linux/latest/source/arch/x86/boot/compressed/eboot.c#L493
[6]. http://git.savannah.gnu.org/cgit/grub.git/tree/grub-core/loader/i386/linux.c#n573
[7]. http://git.savannah.gnu.org/cgit/grub.git/tree/grub-core/mmap/mmap.c#n110
[8]. http://git.savannah.gnu.org/cgit/grub.git/tree/grub-core/mmap/efi/mmap.c#n139
 


* Re: Why does memblock only refer to E820 table and not EFI Memory Map?
  2019-07-23 21:38 ` Ricardo Neri
@ 2019-07-23 22:01     ` Sai Praneeth Prakhya
  0 siblings, 0 replies; 7+ messages in thread
From: Sai Praneeth Prakhya @ 2019-07-23 22:01 UTC (permalink / raw)
  To: Ricardo Neri
  Cc: linux-mm, linux-efi, mingo, bp, peterz, ard.biesheuvel, rppt, pj


> > On x86 platforms, there are two sources through which the kernel learns
> > about physical memory in the system, namely the E820 table and the EFI
> > Memory Map. Each table describes which regions of system memory are usable
> > by the kernel and which regions should be preserved (i.e. reserved regions
> > that typically hold BIOS code/data) so that no other component in the
> > system reads or writes these regions. I think they duplicate the same
> > information, and hence I have a couple of questions about them:
> 
> But isn't it true that on x86 systems the E820 table is populated from the
> EFI memory map?

I didn't know that this happens.. :(

> At least on systems with EFI firmware and a Linux that understands
> EFI. If booting from the EFI stub, the stub takes the EFI memory map and
> assembles the E820 table passed as part of the boot params [4]. It also
> handles the case where there are more than 128 entries in the table [5].
> Thus, if booting as an EFI application, the kernel will definitely use the
> EFI memory map. If Linux's EFI entry point is not used, the bootloader should
> do the same. For instance, grub also reads the EFI memory map to assemble the
> E820 memory map [6], [7], [8].

Thanks a lot for the pointers, Ricardo :)
I haven't looked at the EFI stub and Grub code and hence didn't know this was
happening. It does make me feel better that the EFI Memory Map is indeed being
used to generate e820 in the EFI stub case, so at least it's getting consumed
indirectly.

> > 1. I see that only the E820 table is consumed by the kernel [1] (i.e. by
> > the memblock subsystem) to distinguish between "usable" and "reserved"
> > regions. Assume someone has called memblock_alloc(): the memblock
> > subsystem services the caller by allocating memory from "usable" regions,
> > and it knows about these *only* from the E820 table [2] (it does not check
> > whether the EFI Memory Map also says that the region is usable). So, why
> > isn't the kernel taking the EFI Memory Map into consideration? (I see that
> > it does so only when the "add_efi_memmap" kernel command line argument is
> > passed, i.e. passing this argument updates the E820 table based on the EFI
> > Memory Map) [3]. The problem I see with memblock not taking the EFI Memory
> > Map into consideration is that we are ignoring the main purpose for which
> > the EFI Memory Map exists.
> > 
> > 2. Why doesn't the kernel enable "add_efi_memmap" by default? From the
> > commit "200001eb140e: x86 boot: only pick up additional EFI memmap if
> > add_efi_memmap flag", I couldn't understand why this decision was made.
> > Shouldn't we prefer the EFI Memory Map over the E820 table, since EFI is
> > the newer interface and E820 is legacy?
> 
> I did a quick experiment with and without add_efi_memmap. The e820
> table looked exactly the same. I guess this shows that what I wrote
> above makes sense ;) . Have you observed any difference?

When I did a quick test, I didn't notice any difference (with and without
add_efi_memmap) because both e820 and the EFI Memory Map were reporting regions
in sync. So, "add_efi_memmap" didn't have to add any new regions into e820.
Hence my last question: what if the two tables (EFI Memory Map and e820) are
out of sync? That shouldn't happen with Grub and the EFI stub because they
generate e820 from the EFI Memory Map, as you pointed out.
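
A minimal sketch of what such a sync check could look like, as a user-space
Python model. It assumes both maps have already been reduced to lists of
half-open (start, end) ranges; find_conflicts is a hypothetical helper, not an
existing kernel function. It reports every range that E820 calls usable while
the EFI Memory Map calls it reserved:

```python
def find_conflicts(e820_usable, efi_reserved):
    """Return the overlaps between two lists of half-open (start, end) ranges.

    Each result is a range that appears as usable in the first list and as
    reserved in the second, i.e. a place where the two maps disagree.
    """
    conflicts = []
    for u_start, u_end in e820_usable:
        for r_start, r_end in efi_reserved:
            lo = max(u_start, r_start)
            hi = min(u_end, r_end)
            if lo < hi:  # non-empty intersection
                conflicts.append((lo, hi))
    return sorted(conflicts)
```

An empty result corresponds to the "in sync" case seen in the quick test
above; anything else is a region where a buggy firmware or bootloader
produced contradictory descriptions.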

Regards,
Sai




end of thread, other threads:[~2019-07-23 22:05 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-20 22:52 Why does memblock only refer to E820 table and not EFI Memory Map? Sai Praneeth Prakhya
2019-07-20 22:52 ` Sai Praneeth Prakhya
2019-07-23  8:09 ` Dave Young
2019-07-23 17:11   ` Prakhya, Sai Praneeth
2019-07-23 21:38 ` Ricardo Neri
2019-07-23 22:01   ` Sai Praneeth Prakhya
2019-07-23 22:01     ` Sai Praneeth Prakhya
