linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 1/2] x86/efi: don't allocate memmap through memblock after mm_init()
@ 2016-12-22 10:23 Nicolai Stange
  2016-12-22 10:23 ` [PATCH v2 2/2] efi: efi_mem_reserve(): don't reserve " Nicolai Stange
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Nicolai Stange @ 2016-12-22 10:23 UTC (permalink / raw)
  To: Matt Fleming
  Cc: Ard Biesheuvel, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	x86, linux-efi, linux-kernel, Mika Penttilä,
	Nicolai Stange

With commit 4bc9f92e64c8 ("x86/efi-bgrt: Use efi_mem_reserve() to avoid
copying image data"), efi_bgrt_init() calls into the memblock allocator
through efi_mem_reserve() => efi_arch_mem_reserve() *after* mm_init()
has been called.

Indeed, KASAN reports a bad read access later on in
efi_free_boot_services():

  BUG: KASAN: use-after-free in efi_free_boot_services+0xae/0x24c
            at addr ffff88022de12740
  Read of size 4 by task swapper/0/0
  page:ffffea0008b78480 count:0 mapcount:-127
  mapping:          (null) index:0x1 flags: 0x5fff8000000000()
  [...]
  Call Trace:
   dump_stack+0x68/0x9f
   kasan_report_error+0x4c8/0x500
   kasan_report+0x58/0x60
   __asan_load4+0x61/0x80
   efi_free_boot_services+0xae/0x24c
   start_kernel+0x527/0x562
   x86_64_start_reservations+0x24/0x26
   x86_64_start_kernel+0x157/0x17a
   start_cpu+0x5/0x14

The instruction at the given address is the first read from the memmap's
memory, i.e. the read of md->type in efi_free_boot_services().

Note that the writes earlier in efi_arch_mem_reserve() don't splat because
they're done through early_memremap()ed addresses.

So, after memblock is gone, allocations should be done through the "normal"
page allocator. Introduce a helper, efi_memmap_alloc() for this. Use
it from efi_arch_mem_reserve() and from efi_free_boot_services() as well.

Fixes: 4bc9f92e64c8 ("x86/efi-bgrt: Use efi_mem_reserve() to avoid copying image data")
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
---
 arch/x86/platform/efi/quirks.c |  4 ++--
 drivers/firmware/efi/memmap.c  | 38 ++++++++++++++++++++++++++++++++++++++
 include/linux/efi.h            |  1 +
 3 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
index 10aca63a50d7..30031d5293c4 100644
--- a/arch/x86/platform/efi/quirks.c
+++ b/arch/x86/platform/efi/quirks.c
@@ -214,7 +214,7 @@ void __init efi_arch_mem_reserve(phys_addr_t addr, u64 size)
 
 	new_size = efi.memmap.desc_size * num_entries;
 
-	new_phys = memblock_alloc(new_size, 0);
+	new_phys = efi_memmap_alloc(num_entries);
 	if (!new_phys) {
 		pr_err("Could not allocate boot services memmap\n");
 		return;
@@ -355,7 +355,7 @@ void __init efi_free_boot_services(void)
 	}
 
 	new_size = efi.memmap.desc_size * num_entries;
-	new_phys = memblock_alloc(new_size, 0);
+	new_phys = efi_memmap_alloc(num_entries);
 	if (!new_phys) {
 		pr_err("Failed to allocate new EFI memmap\n");
 		return;
diff --git a/drivers/firmware/efi/memmap.c b/drivers/firmware/efi/memmap.c
index f03ddecd232b..78686443cb37 100644
--- a/drivers/firmware/efi/memmap.c
+++ b/drivers/firmware/efi/memmap.c
@@ -9,6 +9,44 @@
 #include <linux/efi.h>
 #include <linux/io.h>
 #include <asm/early_ioremap.h>
+#include <linux/memblock.h>
+#include <linux/slab.h>
+
+static phys_addr_t __init __efi_memmap_alloc_early(unsigned long size)
+{
+	return memblock_alloc(size, 0);
+}
+
+static phys_addr_t __init __efi_memmap_alloc_late(unsigned long size)
+{
+	unsigned int order = get_order(size);
+	struct page *p = alloc_pages(GFP_KERNEL, order);
+
+	if (!p)
+		return 0;
+
+	return PFN_PHYS(page_to_pfn(p));
+}
+
+/**
+ * efi_memmap_alloc - Allocate memory for the EFI memory map
+ * @num_entries: Number of entries in the allocated map.
+ *
+ * Depending on whether mm_init() has already been invoked or not,
+ * either memblock or "normal" page allocation is used.
+ *
+ * Returns the physical address of the allocated memory map on
+ * success, zero on failure.
+ */
+phys_addr_t __init efi_memmap_alloc(unsigned int num_entries)
+{
+	unsigned long size = num_entries * efi.memmap.desc_size;
+
+	if (slab_is_available())
+		return __efi_memmap_alloc_late(size);
+
+	return __efi_memmap_alloc_early(size);
+}
 
 /**
  * __efi_memmap_init - Common code for mapping the EFI memory map
diff --git a/include/linux/efi.h b/include/linux/efi.h
index a07a476178cd..0c5420208c40 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -950,6 +950,7 @@ static inline efi_status_t efi_query_variable_store(u32 attributes,
 #endif
 extern void __iomem *efi_lookup_mapped_addr(u64 phys_addr);
 
+extern phys_addr_t __init efi_memmap_alloc(unsigned int num_entries);
 extern int __init efi_memmap_init_early(struct efi_memory_map_data *data);
 extern int __init efi_memmap_init_late(phys_addr_t addr, unsigned long size);
 extern void __init efi_memmap_unmap(void);
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v2 2/2] efi: efi_mem_reserve(): don't reserve through memblock after mm_init()
  2016-12-22 10:23 [PATCH v2 1/2] x86/efi: don't allocate memmap through memblock after mm_init() Nicolai Stange
@ 2016-12-22 10:23 ` Nicolai Stange
  2017-01-05  9:12   ` Dave Young
  2016-12-23 14:52 ` [PATCH v2 1/2] x86/efi: don't allocate memmap " Matt Fleming
  2017-01-04 18:40 ` Dan Williams
  2 siblings, 1 reply; 19+ messages in thread
From: Nicolai Stange @ 2016-12-22 10:23 UTC (permalink / raw)
  To: Matt Fleming
  Cc: Ard Biesheuvel, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	x86, linux-efi, linux-kernel, Mika Penttilä,
	Nicolai Stange

Before invoking the arch specific handler, efi_mem_reserve() reserves
the given memory region through memblock.

efi_mem_reserve() can get called after mm_init() though -- through
efi_bgrt_init(), for example. After mm_init(), memblock is dead and should
not be used anymore.

Let efi_mem_reserve() check whether memblock is dead and not do the
reservation if so. Emit a warning from the generic efi_arch mem_reserve()
in this case: if the architecture doesn't provide any other means of
registering the region as reserved, the operation would be a nop.

Fixes: 4bc9f92e64c8 ("x86/efi-bgrt: Use efi_mem_reserve() to avoid copying image data")
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
---
 Changes to v1:
 Change the if condition from slab_is_available() to !slab_is_available
 as pointed out by Mika Penttilä at
 http://lkml.kernel.org/r/c7bf34ba-56f0-8346-36d1-7069f2115dcf@nextfour.com

 drivers/firmware/efi/efi.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index 92914801e388..158a8df2f4af 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -403,7 +403,10 @@ u64 __init efi_mem_desc_end(efi_memory_desc_t *md)
 	return end;
 }
 
-void __init __weak efi_arch_mem_reserve(phys_addr_t addr, u64 size) {}
+void __init __weak efi_arch_mem_reserve(phys_addr_t addr, u64 size)
+{
+	WARN(slab_is_available(), "efi_mem_reserve() has no effect");
+}
 
 /**
  * efi_mem_reserve - Reserve an EFI memory region
@@ -419,7 +422,7 @@ void __init __weak efi_arch_mem_reserve(phys_addr_t addr, u64 size) {}
  */
 void __init efi_mem_reserve(phys_addr_t addr, u64 size)
 {
-	if (!memblock_is_region_reserved(addr, size))
+	if (!slab_is_available() && !memblock_is_region_reserved(addr, size))
 		memblock_reserve(addr, size);
 
 	/*
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/2] x86/efi: don't allocate memmap through memblock after mm_init()
  2016-12-22 10:23 [PATCH v2 1/2] x86/efi: don't allocate memmap through memblock after mm_init() Nicolai Stange
  2016-12-22 10:23 ` [PATCH v2 2/2] efi: efi_mem_reserve(): don't reserve " Nicolai Stange
@ 2016-12-23 14:52 ` Matt Fleming
  2016-12-23 21:12   ` Nicolai Stange
  2017-01-04 18:40 ` Dan Williams
  2 siblings, 1 reply; 19+ messages in thread
From: Matt Fleming @ 2016-12-23 14:52 UTC (permalink / raw)
  To: Nicolai Stange
  Cc: Ard Biesheuvel, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	x86, linux-efi, linux-kernel, Mika Penttilä

On Thu, 22 Dec, at 11:23:39AM, Nicolai Stange wrote:
> With commit 4bc9f92e64c8 ("x86/efi-bgrt: Use efi_mem_reserve() to avoid
> copying image data"), efi_bgrt_init() calls into the memblock allocator
> through efi_mem_reserve() => efi_arch_mem_reserve() *after* mm_init()
> has been called.
> 
> Indeed, KASAN reports a bad read access later on in
> efi_free_boot_services():
> 
>   BUG: KASAN: use-after-free in efi_free_boot_services+0xae/0x24c
>             at addr ffff88022de12740
>   Read of size 4 by task swapper/0/0
>   page:ffffea0008b78480 count:0 mapcount:-127
>   mapping:          (null) index:0x1 flags: 0x5fff8000000000()
>   [...]
>   Call Trace:
>    dump_stack+0x68/0x9f
>    kasan_report_error+0x4c8/0x500
>    kasan_report+0x58/0x60
>    __asan_load4+0x61/0x80
>    efi_free_boot_services+0xae/0x24c
>    start_kernel+0x527/0x562
>    x86_64_start_reservations+0x24/0x26
>    x86_64_start_kernel+0x157/0x17a
>    start_cpu+0x5/0x14
> 
> The instruction at the given address is the first read from the memmap's
> memory, i.e. the read of md->type in efi_free_boot_services().
> 
> Note that the writes earlier in efi_arch_mem_reserve() don't splat because
> they're done through early_memremap()ed addresses.
> 
> So, after memblock is gone, allocations should be done through the "normal"
> page allocator. Introduce a helper, efi_memmap_alloc() for this. Use
> it from efi_arch_mem_reserve() and from efi_free_boot_services() as well.
> 
> Fixes: 4bc9f92e64c8 ("x86/efi-bgrt: Use efi_mem_reserve() to avoid copying image data")
> Signed-off-by: Nicolai Stange <nicstange@gmail.com>
> ---
>  arch/x86/platform/efi/quirks.c |  4 ++--
>  drivers/firmware/efi/memmap.c  | 38 ++++++++++++++++++++++++++++++++++++++
>  include/linux/efi.h            |  1 +
>  3 files changed, 41 insertions(+), 2 deletions(-)

Nice catch. Could you also modify efi_fake_memmap() to use your new
efi_memmap_alloc() function for consistency (note that all
memblock_alloc()s should probably be PAGE_SIZE aligned like the
fakemem code)?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/2] x86/efi: don't allocate memmap through memblock after mm_init()
  2016-12-23 14:52 ` [PATCH v2 1/2] x86/efi: don't allocate memmap " Matt Fleming
@ 2016-12-23 21:12   ` Nicolai Stange
  2017-01-05  7:42     ` Ingo Molnar
  0 siblings, 1 reply; 19+ messages in thread
From: Nicolai Stange @ 2016-12-23 21:12 UTC (permalink / raw)
  To: Matt Fleming
  Cc: Nicolai Stange, Ard Biesheuvel, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86, linux-efi, linux-kernel, Mika Penttilä

Matt Fleming <matt@codeblueprint.co.uk> writes:

> On Thu, 22 Dec, at 11:23:39AM, Nicolai Stange wrote:
>> So, after memblock is gone, allocations should be done through the "normal"
>> page allocator. Introduce a helper, efi_memmap_alloc() for this. Use
>> it from efi_arch_mem_reserve() and from efi_free_boot_services() as well.
>> 
>> Fixes: 4bc9f92e64c8 ("x86/efi-bgrt: Use efi_mem_reserve() to avoid copying image data")
>> Signed-off-by: Nicolai Stange <nicstange@gmail.com>

> Could you also modify efi_fake_memmap() to use your new
> efi_memmap_alloc() function for consistency

Sure.

I'm planning to submit another set of patches addressing the (bounded)
memmap leaking in anything calling efi_memmap_unmap() though. In the
course of doing so, the memmap allocation sites will get touched anyway:
I'll have to store some information about how the memmap's memory has
been obtained.

> (note that all memblock_alloc()s should probably be PAGE_SIZE aligned
> like the fakemem code)?

Ok, but I'd really like to understand why: I can't find anything in
neither the code nor in the UEFI spec requiring this. And up to now,
efi_arch_mem_reserve() as well as efi_free_boot_services() used to do
those unaligned allocations...

In light of this, is there really a necessity for using whole page
allocations after mm_init() or would kmalloc() suffice here?
Provided that the memremap bits get adjusted accordingly, of course.

So, I'm thinking of turning the ->late boolean into a tristate like the
following:

Memory allocated by | Memory mapped through
--------------------|----------------------
memblock            | early_memremap
memblock            | memremap
kmalloc             | -

Neglecting slub overhead, the use of kmalloc() over alloc_pages() would
save 4096 - 12*40 == 3616 Bytes on my system with its 12 entries under
/sys/firmware/efi/runtime-map/. Not really critical, but if it comes for
free, why not?


Thanks,

Nicolai

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/2] x86/efi: don't allocate memmap through memblock after mm_init()
  2016-12-22 10:23 [PATCH v2 1/2] x86/efi: don't allocate memmap through memblock after mm_init() Nicolai Stange
  2016-12-22 10:23 ` [PATCH v2 2/2] efi: efi_mem_reserve(): don't reserve " Nicolai Stange
  2016-12-23 14:52 ` [PATCH v2 1/2] x86/efi: don't allocate memmap " Matt Fleming
@ 2017-01-04 18:40 ` Dan Williams
  2 siblings, 0 replies; 19+ messages in thread
From: Dan Williams @ 2017-01-04 18:40 UTC (permalink / raw)
  To: Nicolai Stange
  Cc: Matt Fleming, Ard Biesheuvel, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, the arch/x86 maintainers, linux-efi,
	Linux Kernel Mailing List, Mika Penttilä

On Thu, Dec 22, 2016 at 2:23 AM, Nicolai Stange <nicstange@gmail.com> wrote:
> With commit 4bc9f92e64c8 ("x86/efi-bgrt: Use efi_mem_reserve() to avoid
> copying image data"), efi_bgrt_init() calls into the memblock allocator
> through efi_mem_reserve() => efi_arch_mem_reserve() *after* mm_init()
> has been called.
>
> Indeed, KASAN reports a bad read access later on in
> efi_free_boot_services():
>
>   BUG: KASAN: use-after-free in efi_free_boot_services+0xae/0x24c
>             at addr ffff88022de12740
>   Read of size 4 by task swapper/0/0
>   page:ffffea0008b78480 count:0 mapcount:-127
>   mapping:          (null) index:0x1 flags: 0x5fff8000000000()
>   [...]
>   Call Trace:
>    dump_stack+0x68/0x9f
>    kasan_report_error+0x4c8/0x500
>    kasan_report+0x58/0x60
>    __asan_load4+0x61/0x80
>    efi_free_boot_services+0xae/0x24c
>    start_kernel+0x527/0x562
>    x86_64_start_reservations+0x24/0x26
>    x86_64_start_kernel+0x157/0x17a
>    start_cpu+0x5/0x14
>
> The instruction at the given address is the first read from the memmap's
> memory, i.e. the read of md->type in efi_free_boot_services().
>
> Note that the writes earlier in efi_arch_mem_reserve() don't splat because
> they're done through early_memremap()ed addresses.
>
> So, after memblock is gone, allocations should be done through the "normal"
> page allocator. Introduce a helper, efi_memmap_alloc() for this. Use
> it from efi_arch_mem_reserve() and from efi_free_boot_services() as well.
>
> Fixes: 4bc9f92e64c8 ("x86/efi-bgrt: Use efi_mem_reserve() to avoid copying image data")
> Signed-off-by: Nicolai Stange <nicstange@gmail.com>

Tested-by: Dan Williams <dan.j.williams@intel.com>

This addresses the v4.9 boot regression I reported:

http://marc.info/?l=linux-kernel&m=148349836923836&w=2

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/2] x86/efi: don't allocate memmap through memblock after mm_init()
  2016-12-23 21:12   ` Nicolai Stange
@ 2017-01-05  7:42     ` Ingo Molnar
  2017-01-05  9:15       ` Dave Young
  2017-01-05  9:39       ` Ard Biesheuvel
  0 siblings, 2 replies; 19+ messages in thread
From: Ingo Molnar @ 2017-01-05  7:42 UTC (permalink / raw)
  To: Nicolai Stange
  Cc: Matt Fleming, Ard Biesheuvel, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86, linux-efi, linux-kernel, Mika Penttilä,
	Dan Williams


* Nicolai Stange <nicstange@gmail.com> wrote:

> Matt Fleming <matt@codeblueprint.co.uk> writes:
> 
> > On Thu, 22 Dec, at 11:23:39AM, Nicolai Stange wrote:
> >> So, after memblock is gone, allocations should be done through the "normal"
> >> page allocator. Introduce a helper, efi_memmap_alloc() for this. Use
> >> it from efi_arch_mem_reserve() and from efi_free_boot_services() as well.
> >> 
> >> Fixes: 4bc9f92e64c8 ("x86/efi-bgrt: Use efi_mem_reserve() to avoid copying image data")
> >> Signed-off-by: Nicolai Stange <nicstange@gmail.com>
> 
> > Could you also modify efi_fake_memmap() to use your new
> > efi_memmap_alloc() function for consistency
> 
> Sure.
> 
> I'm planning to submit another set of patches addressing the (bounded)
> memmap leaking in anything calling efi_memmap_unmap() though. In the
> course of doing so, the memmap allocation sites will get touched anyway:
> I'll have to store some information about how the memmap's memory has
> been obtained.

Will that patch be intrusive?

If yes then we'll need to keep this a separate urgent patch to fix the v4.9 
regression that Dan Williams reported. I can apply the fix to efi/urgent and get 
it to Linus straight away if you guys agree.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/2] efi: efi_mem_reserve(): don't reserve through memblock after mm_init()
  2016-12-22 10:23 ` [PATCH v2 2/2] efi: efi_mem_reserve(): don't reserve " Nicolai Stange
@ 2017-01-05  9:12   ` Dave Young
  2017-01-09 11:44     ` Matt Fleming
  0 siblings, 1 reply; 19+ messages in thread
From: Dave Young @ 2017-01-05  9:12 UTC (permalink / raw)
  To: Nicolai Stange
  Cc: Matt Fleming, Ard Biesheuvel, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86, linux-efi, linux-kernel, Mika Penttilä

On 12/22/16 at 11:23am, Nicolai Stange wrote:
> Before invoking the arch specific handler, efi_mem_reserve() reserves
> the given memory region through memblock.
> 
> efi_mem_reserve() can get called after mm_init() though -- through
> efi_bgrt_init(), for example. After mm_init(), memblock is dead and should
> not be used anymore.

It did not fail during previous test so we did not catch this bug, if memblock
can not be used after mm_init(), IMHO it should fail instead of silently succeed.

Matt, can we move the efi_mem_reserve to earlier code for example in
efi_memblock_x86_reserve_range just after reserving the memmap?

> 
> Let efi_mem_reserve() check whether memblock is dead and not do the
> reservation if so. Emit a warning from the generic efi_arch mem_reserve()
> in this case: if the architecture doesn't provide any other means of
> registering the region as reserved, the operation would be a nop.
> 
> Fixes: 4bc9f92e64c8 ("x86/efi-bgrt: Use efi_mem_reserve() to avoid copying image data")
> Signed-off-by: Nicolai Stange <nicstange@gmail.com>
> ---
>  Changes to v1:
>  Change the if condition from slab_is_available() to !slab_is_available
>  as pointed out by Mika Penttilä at
>  http://lkml.kernel.org/r/c7bf34ba-56f0-8346-36d1-7069f2115dcf@nextfour.com
> 
>  drivers/firmware/efi/efi.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
> index 92914801e388..158a8df2f4af 100644
> --- a/drivers/firmware/efi/efi.c
> +++ b/drivers/firmware/efi/efi.c
> @@ -403,7 +403,10 @@ u64 __init efi_mem_desc_end(efi_memory_desc_t *md)
>  	return end;
>  }
>  
> -void __init __weak efi_arch_mem_reserve(phys_addr_t addr, u64 size) {}
> +void __init __weak efi_arch_mem_reserve(phys_addr_t addr, u64 size)
> +{
> +	WARN(slab_is_available(), "efi_mem_reserve() has no effect");
> +}
>  
>  /**
>   * efi_mem_reserve - Reserve an EFI memory region
> @@ -419,7 +422,7 @@ void __init __weak efi_arch_mem_reserve(phys_addr_t addr, u64 size) {}
>   */
>  void __init efi_mem_reserve(phys_addr_t addr, u64 size)
>  {
> -	if (!memblock_is_region_reserved(addr, size))
> +	if (!slab_is_available() && !memblock_is_region_reserved(addr, size))
>  		memblock_reserve(addr, size);
>  
>  	/*
> -- 
> 2.11.0
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-efi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Thanks
Dave

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/2] x86/efi: don't allocate memmap through memblock after mm_init()
  2017-01-05  7:42     ` Ingo Molnar
@ 2017-01-05  9:15       ` Dave Young
  2017-01-05  9:39       ` Ard Biesheuvel
  1 sibling, 0 replies; 19+ messages in thread
From: Dave Young @ 2017-01-05  9:15 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Nicolai Stange, Matt Fleming, Ard Biesheuvel, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, x86, linux-efi, linux-kernel,
	Mika Penttilä,
	Dan Williams

On 01/05/17 at 08:42am, Ingo Molnar wrote:
> 
> * Nicolai Stange <nicstange@gmail.com> wrote:
> 
> > Matt Fleming <matt@codeblueprint.co.uk> writes:
> > 
> > > On Thu, 22 Dec, at 11:23:39AM, Nicolai Stange wrote:
> > >> So, after memblock is gone, allocations should be done through the "normal"
> > >> page allocator. Introduce a helper, efi_memmap_alloc() for this. Use
> > >> it from efi_arch_mem_reserve() and from efi_free_boot_services() as well.
> > >> 
> > >> Fixes: 4bc9f92e64c8 ("x86/efi-bgrt: Use efi_mem_reserve() to avoid copying image data")
> > >> Signed-off-by: Nicolai Stange <nicstange@gmail.com>
> > 
> > > Could you also modify efi_fake_memmap() to use your new
> > > efi_memmap_alloc() function for consistency
> > 
> > Sure.
> > 
> > I'm planning to submit another set of patches addressing the (bounded)
> > memmap leaking in anything calling efi_memmap_unmap() though. In the
> > course of doing so, the memmap allocation sites will get touched anyway:
> > I'll have to store some information about how the memmap's memory has
> > been obtained.
> 
> Will that patch be intrusive?
> 
> If yes then we'll need to keep this a separate urgent patch to fix the v4.9 
> regression that Dan Williams reported. I can apply the fix to efi/urgent and get 
> it to Linus straight away if you guys agree.

Ditto question to Matt as I asked in reply to patch 2/2, can we move the
efi_mem_reserve to early code so that memblock is still usable and
consider to improve it in other way later?

> 
> Thanks,
> 
> 	Ingo
> --
> To unsubscribe from this list: send the line "unsubscribe linux-efi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/2] x86/efi: don't allocate memmap through memblock after mm_init()
  2017-01-05  7:42     ` Ingo Molnar
  2017-01-05  9:15       ` Dave Young
@ 2017-01-05  9:39       ` Ard Biesheuvel
  2017-01-05 10:15         ` Nicolai Stange
  1 sibling, 1 reply; 19+ messages in thread
From: Ard Biesheuvel @ 2017-01-05  9:39 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Nicolai Stange, Matt Fleming, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86, linux-efi, linux-kernel, Mika Penttilä,
	Dan Williams

On 5 January 2017 at 07:42, Ingo Molnar <mingo@kernel.org> wrote:
>
> * Nicolai Stange <nicstange@gmail.com> wrote:
>
>> Matt Fleming <matt@codeblueprint.co.uk> writes:
>>
>> > On Thu, 22 Dec, at 11:23:39AM, Nicolai Stange wrote:
>> >> So, after memblock is gone, allocations should be done through the "normal"
>> >> page allocator. Introduce a helper, efi_memmap_alloc() for this. Use
>> >> it from efi_arch_mem_reserve() and from efi_free_boot_services() as well.
>> >>
>> >> Fixes: 4bc9f92e64c8 ("x86/efi-bgrt: Use efi_mem_reserve() to avoid copying image data")
>> >> Signed-off-by: Nicolai Stange <nicstange@gmail.com>
>>
>> > Could you also modify efi_fake_memmap() to use your new
>> > efi_memmap_alloc() function for consistency
>>
>> Sure.
>>
>> I'm planning to submit another set of patches addressing the (bounded)
>> memmap leaking in anything calling efi_memmap_unmap() though. In the
>> course of doing so, the memmap allocation sites will get touched anyway:
>> I'll have to store some information about how the memmap's memory has
>> been obtained.
>
> Will that patch be intrusive?
>

Given that memblock_alloc() calls memblock_reserve() on its
allocations, we could simply consult the memblock_reserved table to
infer whether the allocation being freed was created with
memblock_alloc() or with alloc_pages(). So I don't think such a patch
should be that intrusive. But the normal case is that the EFI memory
map remains mapped during the lifetime of the system, and unmapping
the EFI memory map does not necessarily imply that it should be freed.
This is especially true on ARM systems, where the memory map is
allocated and populated by the stub, and never modified by the kernel
proper, and so any freeing logic in generic code should take this into
account as well (i.e., the memory map allocation is not
memblock_reserve()'d, nor is it allocated using alloc_pages())

> If yes then we'll need to keep this a separate urgent patch to fix the v4.9
> regression that Dan Williams reported. I can apply the fix to efi/urgent and get
> it to Linus straight away if you guys agree.
>

Considering the severity of the issue it solves, and the obvious
correctness of the fix, my preference would be to spin a v3 of this
patch taking Matt's feedback into account, and merging that as a fix
for v4.10 with a cc stable. The 2/2 can wait a bit longer imo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/2] x86/efi: don't allocate memmap through memblock after mm_init()
  2017-01-05  9:39       ` Ard Biesheuvel
@ 2017-01-05 10:15         ` Nicolai Stange
  2017-01-05 11:34           ` Ard Biesheuvel
  0 siblings, 1 reply; 19+ messages in thread
From: Nicolai Stange @ 2017-01-05 10:15 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Ingo Molnar, Nicolai Stange, Matt Fleming, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, x86, linux-efi, linux-kernel,
	Mika Penttilä,
	Dan Williams

Ard Biesheuvel <ard.biesheuvel@linaro.org> writes:

> On 5 January 2017 at 07:42, Ingo Molnar <mingo@kernel.org> wrote:
>>
>> * Nicolai Stange <nicstange@gmail.com> wrote:
>>
>>> Matt Fleming <matt@codeblueprint.co.uk> writes:
>>>
>>> > On Thu, 22 Dec, at 11:23:39AM, Nicolai Stange wrote:
>>> >> So, after memblock is gone, allocations should be done through
>>> >> the "normal"
>>> >> page allocator. Introduce a helper, efi_memmap_alloc() for this. Use
>>> >> it from efi_arch_mem_reserve() and from efi_free_boot_services() as well.
>>> >>
>>> >> Fixes: 4bc9f92e64c8 ("x86/efi-bgrt: Use efi_mem_reserve() to
>>> >> avoid copying image data")
>>> >> Signed-off-by: Nicolai Stange <nicstange@gmail.com>
>>>
>>> > Could you also modify efi_fake_memmap() to use your new
>>> > efi_memmap_alloc() function for consistency
>>>
>>> Sure.
>>>
>>> I'm planning to submit another set of patches addressing the (bounded)
>>> memmap leaking in anything calling efi_memmap_unmap() though. In the
>>> course of doing so, the memmap allocation sites will get touched anyway:
>>> I'll have to store some information about how the memmap's memory has
>>> been obtained.
>>
>> Will that patch be intrusive?

Yes, definitely something for 4.11+.


> Given that memblock_alloc() calls memblock_reserve() on its
> allocations, we could simply consult the memblock_reserved table to
> infer whether the allocation being freed was created with
> memblock_alloc() or with alloc_pages().

Not sure whether this would work with CONFIG_ARCH_DISCARD_MEMBLOCK=y.
This is also the reason why 2/2 is needed.

> So I don't think such a patch
> should be that intrusive. But the normal case is that the EFI memory
> map remains mapped during the lifetime of the system, and unmapping
> the EFI memory map does not necessarily imply that it should be freed.
> This is especially true on ARM systems, where the memory map is
> allocated and populated by the stub, and never modified by the kernel
> proper, and so any freeing logic in generic code should take this into
> account as well (i.e., the memory map allocation is not
> memblock_reserve()'d, nor is it allocated using alloc_pages())



>> If yes then we'll need to keep this a separate urgent patch to fix the v4.9
>> regression that Dan Williams reported. I can apply the fix to
>> efi/urgent and get
>> it to Linus straight away if you guys agree.
>>
>
> Considering the severity of the issue it solves, and the obvious
> correctness of the fix, my preference would be to spin a v3 of this
> patch taking Matt's feedback into account, and merging that as a fix
> for v4.10 with a cc stable. The 2/2 can wait a bit longer imo

Matt's Feedback included that 

  "all memblock_alloc()s should probably be PAGE_SIZE aligned like the
   fakemem code"

Unfortunately, I can't see why this would be needed. Furthermore, this
isn't currently done outside of fakemem and thus, aliging the memmap
allocations on PAGE_SIZE would be another, quite unrelated change?

So, are you Ok with only taking the other review comment, namely

  "modify efi_fake_memmap() to use your new efi_memmap_alloc() function
   for consistency"

into account for a v3?

Thanks,

Nicolai

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/2] x86/efi: don't allocate memmap through memblock after mm_init()
  2017-01-05 10:15         ` Nicolai Stange
@ 2017-01-05 11:34           ` Ard Biesheuvel
  2017-01-05 12:53             ` Nicolai Stange
  0 siblings, 1 reply; 19+ messages in thread
From: Ard Biesheuvel @ 2017-01-05 11:34 UTC (permalink / raw)
  To: Nicolai Stange
  Cc: Ingo Molnar, Matt Fleming, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86, linux-efi, linux-kernel, Mika Penttilä,
	Dan Williams

On 5 January 2017 at 10:15, Nicolai Stange <nicstange@gmail.com> wrote:
> Ard Biesheuvel <ard.biesheuvel@linaro.org> writes:
>
>> On 5 January 2017 at 07:42, Ingo Molnar <mingo@kernel.org> wrote:
>>>
>>> * Nicolai Stange <nicstange@gmail.com> wrote:
>>>
>>>> Matt Fleming <matt@codeblueprint.co.uk> writes:
>>>>
>>>> > On Thu, 22 Dec, at 11:23:39AM, Nicolai Stange wrote:
>>>> >> So, after memblock is gone, allocations should be done through
>>>> >> the "normal"
>>>> >> page allocator. Introduce a helper, efi_memmap_alloc() for this. Use
>>>> >> it from efi_arch_mem_reserve() and from efi_free_boot_services() as well.
>>>> >>
>>>> >> Fixes: 4bc9f92e64c8 ("x86/efi-bgrt: Use efi_mem_reserve() to
>>>> >> avoid copying image data")
>>>> >> Signed-off-by: Nicolai Stange <nicstange@gmail.com>
>>>>
>>>> > Could you also modify efi_fake_memmap() to use your new
>>>> > efi_memmap_alloc() function for consistency
>>>>
>>>> Sure.
>>>>
>>>> I'm planning to submit another set of patches addressing the (bounded)
>>>> memmap leaking in anything calling efi_memmap_unmap() though. In the
>>>> course of doing so, the memmap allocation sites will get touched anyway:
>>>> I'll have to store some information about how the memmap's memory has
>>>> been obtained.
>>>
>>> Will that patch be intrusive?
>
> Yes, definitely something for 4.11+.
>
>
>> Given that memblock_alloc() calls memblock_reserve() on its
>> allocations, we could simply consult the memblock_reserved table to
>> infer whether the allocation being freed was created with
>> memblock_alloc() or with alloc_pages().
>
> Not sure whether this would work with CONFIG_ARCH_DISCARD_MEMBLOCK=y.
> This is also the reason why 2/2 is needed.
>

OK, I hadn't considered that. It is rather unfortunate that on x86,
each call to efi_arch_mem_reserve() could potentially result in a new
allocation to be created for the entire memory map (consisting of
hundreds of 40-byte entries), leaking the old one. So I agree that
this code needs some work, and I also agree that it is v4.11 material
at the earliest.

>> So I don't think such a patch
>> should be that intrusive. But the normal case is that the EFI memory
>> map remains mapped during the lifetime of the system, and unmapping
>> the EFI memory map does not necessarily imply that it should be freed.
>> This is especially true on ARM systems, where the memory map is
>> allocated and populated by the stub, and never modified by the kernel
>> proper, and so any freeing logic in generic code should take this into
>> account as well (i.e., the memory map allocation is not
>> memblock_reserve()'d, nor is it allocated using alloc_pages())
>
>
>
>>> If yes then we'll need to keep this a separate urgent patch to fix the v4.9
>>> regression that Dan Williams reported. I can apply the fix to
>>> efi/urgent and get
>>> it to Linus straight away if you guys agree.
>>>
>>
>> Considering the severity of the issue it solves, and the obvious
>> correctness of the fix, my preference would be to spin a v3 of this
>> patch taking Matt's feedback into account, and merging that as a fix
>> for v4.10 with a cc stable. The 2/2 can wait a bit longer imo
>
> Matt's Feedback included that
>
>   "all memblock_alloc()s should probably be PAGE_SIZE aligned like the
>    fakemem code"
>
> Unfortunately, I can't see why this would be needed. Furthermore, this
> isn't currently done outside of fakemem and thus, aliging the memmap
> allocations on PAGE_SIZE would be another, quite unrelated change?
>

We use a pool allocation for the memory map on ARM, which is never
page aligned (and page aligned in EFI doesn't mean as much when the OS
page size is 16k or 64k) so I think you are correct here.
> So, are you Ok with only taking the other review comment, namely
>
>   "modify efi_fake_memmap() to use your new efi_memmap_alloc() function
>    for consistency"
>
> into account for a v3?
>

Yes, that should be sufficient for a fix that can be backported.
Anything else can wait for now

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 1/2] x86/efi: don't allocate memmap through memblock after mm_init()
  2017-01-05 11:34           ` Ard Biesheuvel
@ 2017-01-05 12:53             ` Nicolai Stange
  0 siblings, 0 replies; 19+ messages in thread
From: Nicolai Stange @ 2017-01-05 12:53 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Nicolai Stange, Ingo Molnar, Matt Fleming, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, x86, linux-efi, linux-kernel,
	Mika Penttilä,
	Dan Williams

Ard Biesheuvel <ard.biesheuvel@linaro.org> writes:
> On 5 January 2017 at 10:15, Nicolai Stange <nicstange@gmail.com> wrote:
>> So, are you Ok with only taking the other review comment, namely
>>
>>   "modify efi_fake_memmap() to use your new efi_memmap_alloc() function
>>    for consistency"
>>
>> into account for a v3?
>>
>
> Yes, that should be sufficient for a fix that can be backported.
> Anything else can wait for now


Ard, thanks for your help!

v3 is here:

  http://lkml.kernel.org/r/20170105125130.2815-1-nicstange@gmail.com


Nicolai

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/2] efi: efi_mem_reserve(): don't reserve through memblock after mm_init()
  2017-01-05  9:12   ` Dave Young
@ 2017-01-09 11:44     ` Matt Fleming
  2017-01-09 13:31       ` Mel Gorman
  2017-01-10  0:37       ` Dave Young
  0 siblings, 2 replies; 19+ messages in thread
From: Matt Fleming @ 2017-01-09 11:44 UTC (permalink / raw)
  To: Dave Young
  Cc: Nicolai Stange, Ard Biesheuvel, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86, linux-efi, linux-kernel, Mika Penttilä,
	Andrew Morton, linux-mm, Vlastimil Babka, Michal Hocko,
	Mel Gorman

On Thu, 05 Jan, at 05:12:42PM, Dave Young wrote:
> On 12/22/16 at 11:23am, Nicolai Stange wrote:
> > Before invoking the arch specific handler, efi_mem_reserve() reserves
> > the given memory region through memblock.
> > 
> > efi_mem_reserve() can get called after mm_init() though -- through
> > efi_bgrt_init(), for example. After mm_init(), memblock is dead and should
> > not be used anymore.
> 
> It did not fail during previous test so we did not catch this bug, if memblock
> can not be used after mm_init(), IMHO it should fail instead of silently succeed.
 
This must literally be the fifth time or so that I've been caught out
by this over the years because there's no hard error if you call the
memblock code after slab and co. are up.

MM folks, is there some way to catch these errors without requiring
the sprinkling of slab_is_available() everywhere?

> Matt, can we move the efi_mem_reserve to earlier code for example in
> efi_memblock_x86_reserve_range just after reserving the memmap?
 
No, it *needs* to be callable from efi_bgrt_init(), because you only
want to reserve those regions if you have the BGRT driver available.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/2] efi: efi_mem_reserve(): don't reserve through memblock after mm_init()
  2017-01-09 11:44     ` Matt Fleming
@ 2017-01-09 13:31       ` Mel Gorman
  2017-01-09 13:45         ` Matt Fleming
  2017-02-27 21:57         ` Matt Fleming
  2017-01-10  0:37       ` Dave Young
  1 sibling, 2 replies; 19+ messages in thread
From: Mel Gorman @ 2017-01-09 13:31 UTC (permalink / raw)
  To: Matt Fleming
  Cc: Dave Young, Nicolai Stange, Ard Biesheuvel, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, x86, linux-efi, linux-kernel,
	Mika Penttilä,
	Andrew Morton, linux-mm, Vlastimil Babka, Michal Hocko

On Mon, Jan 09, 2017 at 11:44:00AM +0000, Matt Fleming wrote:
> On Thu, 05 Jan, at 05:12:42PM, Dave Young wrote:
> > On 12/22/16 at 11:23am, Nicolai Stange wrote:
> > > Before invoking the arch specific handler, efi_mem_reserve() reserves
> > > the given memory region through memblock.
> > > 
> > > efi_mem_reserve() can get called after mm_init() though -- through
> > > efi_bgrt_init(), for example. After mm_init(), memblock is dead and should
> > > not be used anymore.
> > 
> > It did not fail during previous test so we did not catch this bug, if memblock
> > can not be used after mm_init(), IMHO it should fail instead of silently succeed.
>  
> This must literally be the fifth time or so that I've been caught out
> by this over the years because there's no hard error if you call the
> memblock code after slab and co. are up.
> 
> MM folks, is there some way to catch these errors without requiring
> the sprinkling of slab_is_available() everywhere?
> 

Well, you could put in a __init global variable about availability into
mm/memblock.c and then check it in memblock APIs like memblock_reserve()
to BUG_ON? I know BUG_ON is frowned upon but this is not likely to be a
situation that can be sensibly recovered.

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/2] efi: efi_mem_reserve(): don't reserve through memblock after mm_init()
  2017-01-09 13:31       ` Mel Gorman
@ 2017-01-09 13:45         ` Matt Fleming
  2017-02-27 21:57         ` Matt Fleming
  1 sibling, 0 replies; 19+ messages in thread
From: Matt Fleming @ 2017-01-09 13:45 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Dave Young, Nicolai Stange, Ard Biesheuvel, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, x86, linux-efi, linux-kernel,
	Mika Penttilä,
	Andrew Morton, linux-mm, Vlastimil Babka, Michal Hocko

On Mon, 09 Jan, at 01:31:52PM, Mel Gorman wrote:
> 
> Well, you could put in a __init global variable about availability into
> mm/memblock.c and then check it in memblock APIs like memblock_reserve()
> to BUG_ON? I know BUG_ON is frowned upon but this is not likely to be a
> situation that can be sensibly recovered.

Indeed. I've only ever seen this situation lead to silent memory
corruption and bitter tears.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/2] efi: efi_mem_reserve(): don't reserve through memblock after mm_init()
  2017-01-09 11:44     ` Matt Fleming
  2017-01-09 13:31       ` Mel Gorman
@ 2017-01-10  0:37       ` Dave Young
  2017-01-10 12:51         ` Matt Fleming
  1 sibling, 1 reply; 19+ messages in thread
From: Dave Young @ 2017-01-10  0:37 UTC (permalink / raw)
  To: Matt Fleming
  Cc: Nicolai Stange, Ard Biesheuvel, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86, linux-efi, linux-kernel, Mika Penttilä,
	Andrew Morton, linux-mm, Vlastimil Babka, Michal Hocko,
	Mel Gorman

On 01/09/17 at 11:44am, Matt Fleming wrote:
> On Thu, 05 Jan, at 05:12:42PM, Dave Young wrote:
> > On 12/22/16 at 11:23am, Nicolai Stange wrote:
> > > Before invoking the arch specific handler, efi_mem_reserve() reserves
> > > the given memory region through memblock.
> > > 
> > > efi_mem_reserve() can get called after mm_init() though -- through
> > > efi_bgrt_init(), for example. After mm_init(), memblock is dead and should
> > > not be used anymore.
> > 
> > It did not fail during previous test so we did not catch this bug, if memblock
> > can not be used after mm_init(), IMHO it should fail instead of silently succeed.
>  
> This must literally be the fifth time or so that I've been caught out
> by this over the years because there's no hard error if you call the
> memblock code after slab and co. are up.
> 
> MM folks, is there some way to catch these errors without requiring
> the sprinkling of slab_is_available() everywhere?
> 
> > Matt, can we move the efi_mem_reserve to earlier code for example in
> > efi_memblock_x86_reserve_range just after reserving the memmap?
>  
> No, it *needs* to be callable from efi_bgrt_init(), because you only
> want to reserve those regions if you have the BGRT driver available.

It is true that it depends on acpi init, I was wondering if bgrt parsing can
be moved to early acpi code. But anyway I'm not sure it is doable and
worth.

Thanks
Dave 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/2] efi: efi_mem_reserve(): don't reserve through memblock after mm_init()
  2017-01-10  0:37       ` Dave Young
@ 2017-01-10 12:51         ` Matt Fleming
  2017-01-11  8:04           ` Dave Young
  0 siblings, 1 reply; 19+ messages in thread
From: Matt Fleming @ 2017-01-10 12:51 UTC (permalink / raw)
  To: Dave Young
  Cc: Nicolai Stange, Ard Biesheuvel, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86, linux-efi, linux-kernel, Mika Penttilä,
	Andrew Morton, linux-mm, Vlastimil Babka, Michal Hocko,
	Mel Gorman

On Tue, 10 Jan, at 08:37:35AM, Dave Young wrote:
> 
> It is true that it depends on acpi init, I was wondering if bgrt parsing can
> be moved to early acpi code. But anyway I'm not sure it is doable and
> worth.

That's a good question. I think I gave up last time I tried to move
the BGRT code to early boot because of the dependencies involved with
having the ACPI table parsing code initialised.

But if you want to take a crack at it, I'd be happy to review the
patches.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/2] efi: efi_mem_reserve(): don't reserve through memblock after mm_init()
  2017-01-10 12:51         ` Matt Fleming
@ 2017-01-11  8:04           ` Dave Young
  0 siblings, 0 replies; 19+ messages in thread
From: Dave Young @ 2017-01-11  8:04 UTC (permalink / raw)
  To: Matt Fleming
  Cc: Nicolai Stange, Ard Biesheuvel, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, x86, linux-efi, linux-kernel, Mika Penttilä,
	Andrew Morton, linux-mm, Vlastimil Babka, Michal Hocko,
	Mel Gorman

On 01/10/17 at 12:51pm, Matt Fleming wrote:
> On Tue, 10 Jan, at 08:37:35AM, Dave Young wrote:
> > 
> > It is true that it depends on acpi init, I was wondering if bgrt parsing can
> > be moved to early acpi code. But anyway I'm not sure it is doable and
> > worth.
> 
> That's a good question. I think I gave up last time I tried to move
> the BGRT code to early boot because of the dependencies involved with
> having the ACPI table parsing code initialised.
> 
> But if you want to take a crack at it, I'd be happy to review the
> patches.

Ok, I will have a try. 

Thanks
Dave

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v2 2/2] efi: efi_mem_reserve(): don't reserve through memblock after mm_init()
  2017-01-09 13:31       ` Mel Gorman
  2017-01-09 13:45         ` Matt Fleming
@ 2017-02-27 21:57         ` Matt Fleming
  1 sibling, 0 replies; 19+ messages in thread
From: Matt Fleming @ 2017-02-27 21:57 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Dave Young, Nicolai Stange, Ard Biesheuvel, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, x86, linux-efi, linux-kernel,
	Mika Penttilä,
	Andrew Morton, linux-mm, Vlastimil Babka, Michal Hocko

On Mon, 09 Jan, at 01:31:52PM, Mel Gorman wrote:
> 
> Well, you could put in a __init global variable about availability into
> mm/memblock.c and then check it in memblock APIs like memblock_reserve()
> to BUG_ON? I know BUG_ON is frowned upon but this is not likely to be a
> situation that can be sensibly recovered.

What about something like this?

BUG_ON() shouldn't actually be necessary because I couldn't think of a
situation where A) memblock would be unavailable and B) returning an
error would prevent us from making progress.

---->8----

>From 1c1c06664d23c5d256016918c54e01802af4e891 Mon Sep 17 00:00:00 2001
From: Matt Fleming <matt@codeblueprint.co.uk>
Date: Mon, 27 Feb 2017 21:14:29 +0000
Subject: [PATCH] mm/memblock: Warn if used after slab is up and prevent memory
 corruption

Historically there have been many memory corruption bugs caused by
using the memblock API after its internal data structures have been
destroyed. The latest bug was fixed in commit,

  20b1e22d01a4 ("x86/efi: Don't allocate memmap through memblock after mm_init()")

Instead of modifying the memblock data structures that no longer exist
and silently corrupting memory, WARN and return with an error.

Cc: Nicolai Stange <nicstange@gmail.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
---
 mm/memblock.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/mm/memblock.c b/mm/memblock.c
index 7608bc305936..4dbd86f2fddb 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -530,6 +530,9 @@ int __init_memblock memblock_add_range(struct memblock_type *type,
 	if (!size)
 		return 0;
 
+	if (WARN_ONCE(slab_is_available(), "memblock no longer available\n"))
+		return -EINVAL;
+
 	/* special case for empty array */
 	if (type->regions[0].size == 0) {
 		WARN_ON(type->cnt != 1 || type->total_size);
@@ -648,6 +651,9 @@ static int __init_memblock memblock_isolate_range(struct memblock_type *type,
 	if (!size)
 		return 0;
 
+	if (WARN_ONCE(slab_is_available(), "memblock no longer available\n"))
+		return -EINVAL;
+
 	/* we'll create at most two more regions */
 	while (type->cnt + 2 > type->max)
 		if (memblock_double_array(type, base, size) < 0)
-- 
2.10.0

^ permalink raw reply related	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2017-02-28  1:14 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-22 10:23 [PATCH v2 1/2] x86/efi: don't allocate memmap through memblock after mm_init() Nicolai Stange
2016-12-22 10:23 ` [PATCH v2 2/2] efi: efi_mem_reserve(): don't reserve " Nicolai Stange
2017-01-05  9:12   ` Dave Young
2017-01-09 11:44     ` Matt Fleming
2017-01-09 13:31       ` Mel Gorman
2017-01-09 13:45         ` Matt Fleming
2017-02-27 21:57         ` Matt Fleming
2017-01-10  0:37       ` Dave Young
2017-01-10 12:51         ` Matt Fleming
2017-01-11  8:04           ` Dave Young
2016-12-23 14:52 ` [PATCH v2 1/2] x86/efi: don't allocate memmap " Matt Fleming
2016-12-23 21:12   ` Nicolai Stange
2017-01-05  7:42     ` Ingo Molnar
2017-01-05  9:15       ` Dave Young
2017-01-05  9:39       ` Ard Biesheuvel
2017-01-05 10:15         ` Nicolai Stange
2017-01-05 11:34           ` Ard Biesheuvel
2017-01-05 12:53             ` Nicolai Stange
2017-01-04 18:40 ` Dan Williams

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).