All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] remove UEFI reserved regions from the linear mapping
@ 2015-10-29 13:40 Ard Biesheuvel
  2015-10-29 13:40 ` [PATCH 1/3] arm64/efi: set EFI_MEMMAP bit only after mapping the memory map Ard Biesheuvel
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Ard Biesheuvel @ 2015-10-29 13:40 UTC (permalink / raw)
  To: linux-arm-kernel

This is yet another approach to solving the issues around removing RAM
regions known to UEFI from the linear mapping while preserving the record
of the fact that these regions are backed by memory.

The previous approach added a memblock flag called MEMBLOCK_NOMAP to keep
track of RAM regions that should be removed from the linear mapping.

The primary motivation for the new approach is the observation that there
is only a single use case that requires this, which is acpi_os_ioremap().
Since ACPI implies UEFI on arm64 platforms, and since acpi_os_ioremap()
uses page_is_ram() internally (which is a __weak generic function), we
can simply reimplement page_is_ram() to take the UEFI memory map into
account if we are booted via UEFI.

Once we have a page_is_ram() implementation in place that will return true
even for RAM that is known to UEFI but not covered by the linear mapping,
we can remove all UEFI reserved and runtime regions from the linear mapping
as well.

As is obvious from the diffstat, this is the approach with the least impact,
both in terms of number of changes and in terms of the locality of the changes.
If we end up needing this information for other reasons (e.g., /dev/mem access
to /reserved-memory subnodes with the nomap property on !EFI systems), we can
always revisit this, but for now, I think this approach is the most suitable.

Patch #1 slightly reorders the UEFI runtime services initialization routines
so that the EFI_MEMMAP flag is only set if the permanent mapping of the UEFI
memory map is in place.

Patch #2 reimplements page_is_ram() for arm64.

Patch #3 updates the UEFI init code to remove the UEFI runtime regions and the
UEFI memory map from the linear mapping.

Ard Biesheuvel (3):
  arm64/efi: set EFI_MEMMAP bit only after mapping the memory map
  arm64: reimplement page_is_ram() using memblock and UEFI memory map
  arm64/efi: memblock_remove() rather than _reserve UEFI reserved memory

 arch/arm64/kernel/efi.c | 39 ++++++++++----------
 arch/arm64/mm/mmu.c     | 34 +++++++++++++++++
 2 files changed, 54 insertions(+), 19 deletions(-)

-- 
2.1.4

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/3] arm64/efi: set EFI_MEMMAP bit only after mapping the memory map
  2015-10-29 13:40 [PATCH 0/3] remove UEFI reserved regions from the linear mapping Ard Biesheuvel
@ 2015-10-29 13:40 ` Ard Biesheuvel
  2015-11-12 15:14   ` Matt Fleming
  2015-10-29 13:40 ` [PATCH 2/3] arm64: reimplement page_is_ram() using memblock and UEFI " Ard Biesheuvel
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Ard Biesheuvel @ 2015-10-29 13:40 UTC (permalink / raw)
  To: linux-arm-kernel

In an upcoming patch, we will wire up page_is_ram() to the UEFI memory
map, so make sure the EFI_MEMMAP bit accurately reflects whether the
UEFI memory map is available via its permanent mapping.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/kernel/efi.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
index 852eb927ea85..11b59e9a5954 100644
--- a/arch/arm64/kernel/efi.c
+++ b/arch/arm64/kernel/efi.c
@@ -191,8 +191,6 @@ static __init void reserve_regions(void)
 		if (efi_enabled(EFI_DBG))
 			pr_cont("\n");
 	}
-
-	set_bit(EFI_MEMMAP, &efi.flags);
 }
 
 void __init efi_init(void)
@@ -291,13 +289,6 @@ static int __init arm64_enable_runtime_services(void)
 		return -1;
 	}
 
-	if (efi_runtime_disabled()) {
-		pr_info("EFI runtime services will be disabled.\n");
-		return -1;
-	}
-
-	pr_info("Remapping and enabling EFI services.\n");
-
 	mapsize = memmap.map_end - memmap.map;
 	memmap.map = (__force void *)ioremap_cache((phys_addr_t)memmap.phys_map,
 						   mapsize);
@@ -307,6 +298,14 @@ static int __init arm64_enable_runtime_services(void)
 	}
 	memmap.map_end = memmap.map + mapsize;
 	efi.memmap = &memmap;
+	set_bit(EFI_MEMMAP, &efi.flags);
+
+	if (efi_runtime_disabled()) {
+		pr_info("EFI runtime services will be disabled.\n");
+		return -1;
+	}
+
+	pr_info("Remapping and enabling EFI services.\n");
 
 	efi.systab = (__force void *)ioremap_cache(efi_system_table,
 						   sizeof(efi_system_table_t));
-- 
2.1.4

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 2/3] arm64: reimplement page_is_ram() using memblock and UEFI memory map
  2015-10-29 13:40 [PATCH 0/3] remove UEFI reserved regions from the linear mapping Ard Biesheuvel
  2015-10-29 13:40 ` [PATCH 1/3] arm64/efi: set EFI_MEMMAP bit only after mapping the memory map Ard Biesheuvel
@ 2015-10-29 13:40 ` Ard Biesheuvel
  2015-11-12 15:31   ` Matt Fleming
  2015-10-29 13:40 ` [PATCH 3/3] arm64/efi: memblock_remove() rather than _reserve UEFI reserved memory Ard Biesheuvel
  2015-11-12 15:55 ` [PATCH 0/3] remove UEFI reserved regions from the linear mapping Mark Rutland
  3 siblings, 1 reply; 13+ messages in thread
From: Ard Biesheuvel @ 2015-10-29 13:40 UTC (permalink / raw)
  To: linux-arm-kernel

This patch overrides the __weak default implementation of page_is_ram(),
which uses string comparisons to find entries called 'System RAM' in
/proc/iomem. Since we used the contents of memblock to create those entries
in the first place, let's use memblock directly.

Also, since the UEFI memory map may describe regions backed by RAM that are
not in memblock (i.e., reserved regions that were removed from the linear
mapping), check the pfn against the UEFI memory map as well.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/mm/mmu.c | 34 ++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index c2fa6b56613c..737bfaecb489 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -19,6 +19,7 @@
 
 #include <linux/export.h>
 #include <linux/kernel.h>
+#include <linux/efi.h>
 #include <linux/errno.h>
 #include <linux/init.h>
 #include <linux/libfdt.h>
@@ -31,6 +32,7 @@
 #include <linux/stop_machine.h>
 
 #include <asm/cputype.h>
+#include <asm/efi.h>
 #include <asm/fixmap.h>
 #include <asm/kernel-pgtable.h>
 #include <asm/sections.h>
@@ -743,3 +745,35 @@ void *__init fixmap_remap_fdt(phys_addr_t dt_phys)
 
 	return dt_virt;
 }
+
+/*
+ * On a UEFI system, the memory map may describe regions that are backed by
+ * memory, but are not covered by the linear mapping and so are not listed as
+ * 'System RAM' in /proc/iomem, which is what the default __weak implementation
+ * of page_is_ram looks for. So check the UEFI memory map as well if the pfn is
+ * not covered by memblock.
+ */
+int page_is_ram(unsigned long pfn)
+{
+	u64 addr = PFN_PHYS(pfn);
+	efi_memory_desc_t *md;
+
+	if (memblock_is_memory(addr))
+		return 1;
+
+	if (!efi_enabled(EFI_MEMMAP))
+		return 0;
+
+	/*
+	 * A pfn could intersect multiple regions in the UEFI memory map if the
+	 * OS page size exceeds 4 KB. However, the UEFI spec explicitly forbids
+	 * mixed attribute mappings within the same 64 KB page frame so just use
+	 * the region that intersects the page address.
+	 */
+	for_each_efi_memory_desc(&memmap, md)
+		if (md->phys_addr <= addr &&
+		    (addr - md->phys_addr) < (md->num_pages << EFI_PAGE_SHIFT))
+			return !!(md->attribute & EFI_MEMORY_WB);
+
+	return 0;
+}
-- 
2.1.4

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 3/3] arm64/efi: memblock_remove() rather than _reserve UEFI reserved memory
  2015-10-29 13:40 [PATCH 0/3] remove UEFI reserved regions from the linear mapping Ard Biesheuvel
  2015-10-29 13:40 ` [PATCH 1/3] arm64/efi: set EFI_MEMMAP bit only after mapping the memory map Ard Biesheuvel
  2015-10-29 13:40 ` [PATCH 2/3] arm64: reimplement page_is_ram() using memblock and UEFI " Ard Biesheuvel
@ 2015-10-29 13:40 ` Ard Biesheuvel
  2015-11-12 15:55 ` [PATCH 0/3] remove UEFI reserved regions from the linear mapping Mark Rutland
  3 siblings, 0 replies; 13+ messages in thread
From: Ard Biesheuvel @ 2015-10-29 13:40 UTC (permalink / raw)
  To: linux-arm-kernel

Memory regions that are not advertised as being available for general use
to the kernel should either never be mapped at all (EfiReservedMemory),
since it should not even be accessed speculatively, or they are mapped
and unmapped explicitly when used (UEFI runtime services regions), so
the linear mapping is never used anyway.

This means we are better off removing these regions from the linear
mapping entirely, so memblock_remove() rather than memblock_reserve()
them.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/kernel/efi.c | 22 +++++++++++---------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
index 11b59e9a5954..daafa40d1c19 100644
--- a/arch/arm64/kernel/efi.c
+++ b/arch/arm64/kernel/efi.c
@@ -171,7 +171,7 @@ static __init void reserve_regions(void)
 		if (efi_enabled(EFI_DBG)) {
 			char buf[64];
 
-			pr_info("  0x%012llx-0x%012llx %s",
+			pr_info("  0x%012llx-0x%012llx %s\n",
 				paddr, paddr + (npages << EFI_PAGE_SHIFT) - 1,
 				efi_md_typeattr_format(buf, sizeof(buf), md));
 		}
@@ -181,15 +181,17 @@ static __init void reserve_regions(void)
 
 		if (is_normal_ram(md))
 			early_init_dt_add_memory_arch(paddr, size);
+	}
 
-		if (is_reserve_region(md)) {
-			memblock_reserve(paddr, size);
-			if (efi_enabled(EFI_DBG))
-				pr_cont("*");
-		}
+	for_each_efi_memory_desc(&memmap, md) {
+		paddr = md->phys_addr;
+		npages = md->num_pages;
+
+		memrange_efi_to_native(&paddr, &npages);
+		size = npages << PAGE_SHIFT;
 
-		if (efi_enabled(EFI_DBG))
-			pr_cont("\n");
+		if (is_reserve_region(md))
+			memblock_remove(paddr, size);
 	}
 }
 
@@ -203,8 +205,6 @@ void __init efi_init(void)
 
 	efi_system_table = params.system_table;
 
-	memblock_reserve(params.mmap & PAGE_MASK,
-			 PAGE_ALIGN(params.mmap_size + (params.mmap & ~PAGE_MASK)));
 	memmap.phys_map = (void *)params.mmap;
 	memmap.map = early_memremap(params.mmap, params.mmap_size);
 	memmap.map_end = memmap.map + params.mmap_size;
@@ -216,6 +216,8 @@ void __init efi_init(void)
 
 	reserve_regions();
 	early_memunmap(memmap.map, params.mmap_size);
+	memblock_remove(params.mmap & PAGE_MASK,
+			PAGE_ALIGN(params.mmap_size + (params.mmap & ~PAGE_MASK)));
 }
 
 static void __init efi_reserve_iomem_resource(efi_memory_desc_t *md)
-- 
2.1.4

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/3] arm64/efi: set EFI_MEMMAP bit only after mapping the memory map
  2015-10-29 13:40 ` [PATCH 1/3] arm64/efi: set EFI_MEMMAP bit only after mapping the memory map Ard Biesheuvel
@ 2015-11-12 15:14   ` Matt Fleming
  0 siblings, 0 replies; 13+ messages in thread
From: Matt Fleming @ 2015-11-12 15:14 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 29 Oct, at 02:40:57PM, Ard Biesheuvel wrote:
> In an upcoming patch, we will wire up page_is_ram() to the UEFI memory
> map, so make sure the EFI_MEMMAP bit accurately reflects whether the
> UEFI memory map is available via its permanent mapping.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  arch/arm64/kernel/efi.c | 17 ++++++++---------
>  1 file changed, 8 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
> index 852eb927ea85..11b59e9a5954 100644
> --- a/arch/arm64/kernel/efi.c
> +++ b/arch/arm64/kernel/efi.c
> @@ -191,8 +191,6 @@ static __init void reserve_regions(void)
>  		if (efi_enabled(EFI_DBG))
>  			pr_cont("\n");
>  	}
> -
> -	set_bit(EFI_MEMMAP, &efi.flags);
>  }
>  
>  void __init efi_init(void)
> @@ -291,13 +289,6 @@ static int __init arm64_enable_runtime_services(void)
>  		return -1;
>  	}
>  
> -	if (efi_runtime_disabled()) {
> -		pr_info("EFI runtime services will be disabled.\n");
> -		return -1;
> -	}
> -
> -	pr_info("Remapping and enabling EFI services.\n");
> -
>  	mapsize = memmap.map_end - memmap.map;
>  	memmap.map = (__force void *)ioremap_cache((phys_addr_t)memmap.phys_map,
>  						   mapsize);
> @@ -307,6 +298,14 @@ static int __init arm64_enable_runtime_services(void)
>  	}
>  	memmap.map_end = memmap.map + mapsize;
>  	efi.memmap = &memmap;
> +	set_bit(EFI_MEMMAP, &efi.flags);
> +
> +	if (efi_runtime_disabled()) {
> +		pr_info("EFI runtime services will be disabled.\n");
> +		return -1;
> +	}
> +
> +	pr_info("Remapping and enabling EFI services.\n");
>  
>  	efi.systab = (__force void *)ioremap_cache(efi_system_table,
>  						   sizeof(efi_system_table_t));

Looks pretty straight forward to me (and a good change).

Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 2/3] arm64: reimplement page_is_ram() using memblock and UEFI memory map
  2015-10-29 13:40 ` [PATCH 2/3] arm64: reimplement page_is_ram() using memblock and UEFI " Ard Biesheuvel
@ 2015-11-12 15:31   ` Matt Fleming
  2015-11-12 15:40     ` Ard Biesheuvel
  0 siblings, 1 reply; 13+ messages in thread
From: Matt Fleming @ 2015-11-12 15:31 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 29 Oct, at 02:40:58PM, Ard Biesheuvel wrote:
> This patch overrides the __weak default implementation of page_is_ram(),
> which uses string comparisons to find entries called 'System RAM' in
> /proc/iomem. Since we used the contents of memblock to create those entries
> in the first place, let's use memblock directly.
> 
> Also, since the UEFI memory map may describe regions backed by RAM that are
> not in memblock (i.e., reserved regions that were removed from the linear
> mapping), check the pfn against the UEFI memory map as well.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  arch/arm64/mm/mmu.c | 34 ++++++++++++++++++++
>  1 file changed, 34 insertions(+)
 
Am I correct in thinking that the purpose of this series is just to
placate acpi_os_ioremap() on arm64, and its use of page_is_ram()?

While there aren't many users of page_is_ram() right now, I can see
how in the future if new users are added they'd be extremely confused
to find that page_is_ram(pfn) returns true but 'pfn' isn't accessible
by the kernel proper.

Wouldn't it make more sense to teach acpi_os_ioremap() about these
special reserved regions outside of page_is_ram()?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 2/3] arm64: reimplement page_is_ram() using memblock and UEFI memory map
  2015-11-12 15:31   ` Matt Fleming
@ 2015-11-12 15:40     ` Ard Biesheuvel
  2015-11-12 16:03       ` Mark Rutland
  0 siblings, 1 reply; 13+ messages in thread
From: Ard Biesheuvel @ 2015-11-12 15:40 UTC (permalink / raw)
  To: linux-arm-kernel

On 12 November 2015 at 16:31, Matt Fleming <matt@codeblueprint.co.uk> wrote:
> On Thu, 29 Oct, at 02:40:58PM, Ard Biesheuvel wrote:
>> This patch overrides the __weak default implementation of page_is_ram(),
>> which uses string comparisons to find entries called 'System RAM' in
>> /proc/iomem. Since we used the contents of memblock to create those entries
>> in the first place, let's use memblock directly.
>>
>> Also, since the UEFI memory map may describe regions backed by RAM that are
>> not in memblock (i.e., reserved regions that were removed from the linear
>> mapping), check the pfn against the UEFI memory map as well.
>>
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> ---
>>  arch/arm64/mm/mmu.c | 34 ++++++++++++++++++++
>>  1 file changed, 34 insertions(+)
>
> Am I correct in thinking that the purpose of this series is just to
> placate acpi_os_ioremap() on arm64, and its use of page_is_ram()?
>

That is currently the primary user, but we need this information for
other purposes as well. One example is /dev/mem, which is used for
both devices and memory (for instance, tools like dmidecode rely
heavily on it). When using it to access a memory region that we
punched out of the linear mapping, we should typically not map it as a
device, since unaligned accesses cause faults in that case.

In summary, it would be nice if we could preserve the 'is ram"
annotation for regions that are not covered by the linear mapping.

> While there aren't many users of page_is_ram() right now, I can see
> how in the future if new users are added they'd be extremely confused
> to find that page_is_ram(pfn) returns true but 'pfn' isn't accessible
> by the kernel proper.
>

Well, who knows. page_is_ram() is poorly documented, and so is the
'System RAM' iomem annotation that its default implementation relies
on.

> Wouldn't it make more sense to teach acpi_os_ioremap() about these
> special reserved regions outside of page_is_ram()?

Perhaps. But it would introduce EFI dependencies into that code.

The bottom line is that I would like to be able to remove UEFI
occupied or reserved regions from the linear mapping without breaking
ACPI, whose use of page_is_ram() results in alignment faults when
accessing such regions.

-- 
Ard.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 0/3] remove UEFI reserved regions from the linear mapping
  2015-10-29 13:40 [PATCH 0/3] remove UEFI reserved regions from the linear mapping Ard Biesheuvel
                   ` (2 preceding siblings ...)
  2015-10-29 13:40 ` [PATCH 3/3] arm64/efi: memblock_remove() rather than _reserve UEFI reserved memory Ard Biesheuvel
@ 2015-11-12 15:55 ` Mark Rutland
  2015-11-12 16:01   ` Ard Biesheuvel
  3 siblings, 1 reply; 13+ messages in thread
From: Mark Rutland @ 2015-11-12 15:55 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Oct 29, 2015 at 02:40:56PM +0100, Ard Biesheuvel wrote:
> This is yet another approach to solving the issues around removing RAM
> regions known to UEFI from the linear mapping while preserving the record
> of the fact that these regions are backed by memory.
> 
> The previous approach added a memblock flag called MEMBLOCK_NOMAP to keep
> track of RAM regions that should be removed from the linear mapping.
> 
> The primary motivation for the new approach is the observation that there
> is only a single use case that requires this, which is acpi_os_ioremap().
> Since ACPI implies UEFI on arm64 platforms, and since acpi_os_ioremap()
> uses page_is_ram() internally (which is a __weak generic function), we
> can simply reimplement page_is_ram() to take the UEFI memory map into
> account if we are booted via UEFI.

Just to check, is the above the only reason for this new approach? Or
were there other issues with the MEMBLOCK_NOMAP approach other than the
diffstat?

I quite liked the MEMBLOCK_NOMAP approach as it looked reusable.

> Once we have a page_is_ram() implementation in place that will return true
> even for RAM that is known to UEFI but not covered by the linear mapping,
> we can remove all UEFI reserved and runtime regions from the linear mapping
> as well.

I take it there aren't any lurking instances of page_is_ram() used to
test if something exists in the linear mapping?

> As is obvious from the diffstat, this is the approach with the least impact,
> both in terms of number of changes and in terms of the locality of the changes.
> If we end up needing this information for other reasons (e.g., /dev/mem access
> to /reserved-memory subnodes with the nomap property on !EFI systems), we can
> always revisit this, but for now, I think this approach is the most suitable.
> 
> Patch #1 slightly reorders the UEFI runtime services initialization routines
> so that the EFI_MEMMAP flag is only set if the permanent mapping of the UEFI
> memory map is in place.

This also means that the memory map is mapped even with EFI runtime
support disabled, but I guess that's not a big problem.

As a side thought, it would be nice if we could memremap_ro the system
table and memory map in future to prevent potential corruption, given
they have fixed VAs and are always mapped.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 0/3] remove UEFI reserved regions from the linear mapping
  2015-11-12 15:55 ` [PATCH 0/3] remove UEFI reserved regions from the linear mapping Mark Rutland
@ 2015-11-12 16:01   ` Ard Biesheuvel
  2015-11-12 16:13     ` Mark Rutland
  0 siblings, 1 reply; 13+ messages in thread
From: Ard Biesheuvel @ 2015-11-12 16:01 UTC (permalink / raw)
  To: linux-arm-kernel

On 12 November 2015 at 16:55, Mark Rutland <mark.rutland@arm.com> wrote:
> On Thu, Oct 29, 2015 at 02:40:56PM +0100, Ard Biesheuvel wrote:
>> This is yet another approach to solving the issues around removing RAM
>> regions known to UEFI from the linear mapping while preserving the record
>> of the fact that these regions are backed by memory.
>>
>> The previous approach added a memblock flag called MEMBLOCK_NOMAP to keep
>> track of RAM regions that should be removed from the linear mapping.
>>
>> The primary motivation for the new approach is the observation that there
>> is only a single use case that requires this, which is acpi_os_ioremap().
>> Since ACPI implies UEFI on arm64 platforms, and since acpi_os_ioremap()
>> uses page_is_ram() internally (which is a __weak generic function), we
>> can simply reimplement page_is_ram() to take the UEFI memory map into
>> account if we are booted via UEFI.
>
> Just to check, is the above the only reason for this new approach? Or
> were there other issues with the MEMBLOCK_NOMAP approach other than the
> diffstat?
>
> I quite liked the MEMBLOCK_NOMAP approach as it looked reusable.
>

I think the MEMBLOCK_NOMAP approach is sound, but it is harder to
prove that there are no corner cases that behave incorrectly.

>> Once we have a page_is_ram() implementation in place that will return true
>> even for RAM that is known to UEFI but not covered by the linear mapping,
>> we can remove all UEFI reserved and runtime regions from the linear mapping
>> as well.
>
> I take it there aren't any lurking instances of page_is_ram() used to
> test if something exists in the linear mapping?
>

Well, first of all, the linear mapping only covers lowmem, so that in
itself would not be a portable use. In general, pfn_valid() would be
the correct test for that (possibly combined with PageHighmem())

page_is_ram() and the 'System RAM' iomem region are so poorly defined
or documented that we may be better off just removing it in the first
place and replace it with something meaningful.

>> As is obvious from the diffstat, this is the approach with the least impact,
>> both in terms of number of changes and in terms of the locality of the changes.
>> If we end up needing this information for other reasons (e.g., /dev/mem access
>> to /reserved-memory subnodes with the nomap property on !EFI systems), we can
>> always revisit this, but for now, I think this approach is the most suitable.
>>
>> Patch #1 slightly reorders the UEFI runtime services initialization routines
>> so that the EFI_MEMMAP flag is only set if the permanent mapping of the UEFI
>> memory map is in place.
>
> This also means that the memory map is mapped even with EFI runtime
> support disabled, but I guess that's not a big problem.
>

Yes. You need that anyway if you are going to rely on it even when the
runtime services are disabled.

> As a side thought, it would be nice if we could memremap_ro the system
> table and memory map in future to prevent potential corruption, given
> they have fixed VAs and are always mapped.
>

I agree, and I already have some local patches using
early_memremap_ro() for the EFI init code. memremap_ro() does not
actually exist yet, but I intend to propose MEMREMAP_RO and
MEMREMAP_NX flags to Dan Williams's memremap() work once I get around
to it.

-- 
Ard.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 2/3] arm64: reimplement page_is_ram() using memblock and UEFI memory map
  2015-11-12 15:40     ` Ard Biesheuvel
@ 2015-11-12 16:03       ` Mark Rutland
  2015-11-12 16:06         ` Ard Biesheuvel
  0 siblings, 1 reply; 13+ messages in thread
From: Mark Rutland @ 2015-11-12 16:03 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Nov 12, 2015 at 04:40:23PM +0100, Ard Biesheuvel wrote:
> On 12 November 2015 at 16:31, Matt Fleming <matt@codeblueprint.co.uk> wrote:
> > On Thu, 29 Oct, at 02:40:58PM, Ard Biesheuvel wrote:
> >> This patch overrides the __weak default implementation of page_is_ram(),
> >> which uses string comparisons to find entries called 'System RAM' in
> >> /proc/iomem. Since we used the contents of memblock to create those entries
> >> in the first place, let's use memblock directly.
> >>
> >> Also, since the UEFI memory map may describe regions backed by RAM that are
> >> not in memblock (i.e., reserved regions that were removed from the linear
> >> mapping), check the pfn against the UEFI memory map as well.
> >>
> >> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> >> ---
> >>  arch/arm64/mm/mmu.c | 34 ++++++++++++++++++++
> >>  1 file changed, 34 insertions(+)
> >
> > Am I correct in thinking that the purpose of this series is just to
> > placate acpi_os_ioremap() on arm64, and its use of page_is_ram()?
> >
> 
> That is currently the primary user, but we need this information for
> other purposes as well. One example is /dev/mem, which is used for
> both devices and memory (for instance, tools like dmidecode rely
> heavily on it). When using it to access a memory region that we
> punched out of the linear mapping, we should typically not map it as a
> device, since unaligned accesses cause faults in that case.
> 
> In summary, it would be nice if we could preserve the 'is ram"
> annotation for regions that are not covered by the linear mapping.
> 
> > While there aren't many users of page_is_ram() right now, I can see
> > how in the future if new users are added they'd be extremely confused
> > to find that page_is_ram(pfn) returns true but 'pfn' isn't accessible
> > by the kernel proper.
> >
> 
> Well, who knows. page_is_ram() is poorly documented, and so is the
> 'System RAM' iomem annotation that its default implementation relies
> on.

Sorry if this is a bit of a derailment, but perhaps now is a good
opportunity to introduce something like:

#ifndef page_is_linear_mapped
#define page_is_linear_mapped page_is_ram
#endif

With documentation as to the semantic difference, and a conversion of
existing users.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 2/3] arm64: reimplement page_is_ram() using memblock and UEFI memory map
  2015-11-12 16:03       ` Mark Rutland
@ 2015-11-12 16:06         ` Ard Biesheuvel
  0 siblings, 0 replies; 13+ messages in thread
From: Ard Biesheuvel @ 2015-11-12 16:06 UTC (permalink / raw)
  To: linux-arm-kernel

On 12 November 2015 at 17:03, Mark Rutland <mark.rutland@arm.com> wrote:
> On Thu, Nov 12, 2015 at 04:40:23PM +0100, Ard Biesheuvel wrote:
>> On 12 November 2015 at 16:31, Matt Fleming <matt@codeblueprint.co.uk> wrote:
>> > On Thu, 29 Oct, at 02:40:58PM, Ard Biesheuvel wrote:
>> >> This patch overrides the __weak default implementation of page_is_ram(),
>> >> which uses string comparisons to find entries called 'System RAM' in
>> >> /proc/iomem. Since we used the contents of memblock to create those entries
>> >> in the first place, let's use memblock directly.
>> >>
>> >> Also, since the UEFI memory map may describe regions backed by RAM that are
>> >> not in memblock (i.e., reserved regions that were removed from the linear
>> >> mapping), check the pfn against the UEFI memory map as well.
>> >>
>> >> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> >> ---
>> >>  arch/arm64/mm/mmu.c | 34 ++++++++++++++++++++
>> >>  1 file changed, 34 insertions(+)
>> >
>> > Am I correct in thinking that the purpose of this series is just to
>> > placate acpi_os_ioremap() on arm64, and its use of page_is_ram()?
>> >
>>
>> That is currently the primary user, but we need this information for
>> other purposes as well. One example is /dev/mem, which is used for
>> both devices and memory (for instance, tools like dmidecode rely
>> heavily on it). When using it to access a memory region that we
>> punched out of the linear mapping, we should typically not map it as a
>> device, since unaligned accesses cause faults in that case.
>>
>> In summary, it would be nice if we could preserve the 'is ram"
>> annotation for regions that are not covered by the linear mapping.
>>
>> > While there aren't many users of page_is_ram() right now, I can see
>> > how in the future if new users are added they'd be extremely confused
>> > to find that page_is_ram(pfn) returns true but 'pfn' isn't accessible
>> > by the kernel proper.
>> >
>>
>> Well, who knows. page_is_ram() is poorly documented, and so is the
>> 'System RAM' iomem annotation that its default implementation relies
>> on.
>
> Sorry if this is a bit of a derailment, but perhaps now is a good
> opportunity to introduce something like:
>
> #ifndef page_is_linear_mapped
> #define page_is_linear_mapped page_is_ram
> #endif
>
> With documentation as to the semantic difference, and a conversion of
> existing users.
>

As I replied in the other thread, this does not cover all cases on
highmem platforms.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 0/3] remove UEFI reserved regions from the linear mapping
  2015-11-12 16:01   ` Ard Biesheuvel
@ 2015-11-12 16:13     ` Mark Rutland
  2015-11-12 16:30       ` Ard Biesheuvel
  0 siblings, 1 reply; 13+ messages in thread
From: Mark Rutland @ 2015-11-12 16:13 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Nov 12, 2015 at 05:01:19PM +0100, Ard Biesheuvel wrote:
> On 12 November 2015 at 16:55, Mark Rutland <mark.rutland@arm.com> wrote:
> > On Thu, Oct 29, 2015 at 02:40:56PM +0100, Ard Biesheuvel wrote:
> >> This is yet another approach to solving the issues around removing RAM
> >> regions known to UEFI from the linear mapping while preserving the record
> >> of the fact that these regions are backed by memory.
> >>
> >> The previous approach added a memblock flag called MEMBLOCK_NOMAP to keep
> >> track of RAM regions that should be removed from the linear mapping.
> >>
> >> The primary motivation for the new approach is the observation that there
> >> is only a single use case that requires this, which is acpi_os_ioremap().
> >> Since ACPI implies UEFI on arm64 platforms, and since acpi_os_ioremap()
> >> uses page_is_ram() internally (which is a __weak generic function), we
> >> can simply reimplement page_is_ram() to take the UEFI memory map into
> >> account if we are booted via UEFI.
> >
> > Just to check, is the above the only reason for this new approach? Or
> > were there other issues with the MEMBLOCK_NOMAP approach other than the
> > diffstat?
> >
> > I quite liked the MEMBLOCK_NOMAP approach as it looked reusable.
> >
> 
> I think the MEMBLOCK_NOMAP approach is sound, but it is harder to
> prove that there are no corner cases that behave incorrectly.

Ok.

> > I take it there aren't any lurking instances of page_is_ram() used to
> > test if something exists in the linear mapping?
> >
> 
> Well, first of all, the linear mapping only covers lowmem, so that in
> itself would not be a portable use. In general, pfn_valid() would be
> the correct test for that (possibly combined with PageHighmem())

Good point, I hadn't considered that.

> page_is_ram() and the 'System RAM' iomem region are so poorly defined
> or documented that we may be better off just removing it in the first
> place and replace it with something meaningful.

I'd be very much in favour of tightening up and/or replacing
page_is_ram with something well-defined.

I believe that some userspace depends on the 'System RAM' info (e.g. I
think kexec tools parse that to decide a good location for the next
kernel), but I would expect that users want to know about _usable_ RAM
rather than anything that happens to be physical RAM.

> >> Patch #1 slightly reorders the UEFI runtime services initialization routines
> >> so that the EFI_MEMMAP flag is only set if the permanent mapping of the UEFI
> >> memory map is in place.
> >
> > This also means that the memory map is mapped even with EFI runtime
> > support disabled, but I guess that's not a big problem.
> >
> 
> Yes. You need that anyway if you are going to rely on it even when the
> runtime services are disabled.

Not with the MEMBLOCK_NOMAP approach.

I don't have a strong case for caring about that; I only imagine that
being a problem if you're trying to track down extremely nasty memory
corruption / bad pointer bugs and want the bare minimum VA space mapped.
Even then the impact is minimal.

> > As a side thought, it would be nice if we could memremap_ro the system
> > table and memory map in future to prevent potential corruption, given
> > they have fixed VAs and are always mapped.
> >
> 
> I agree, and I already have some local patches using
> early_memremap_ro() for the EFI init code.

Ah, nice!

> memremap_ro() does not actually exist yet, but I intend to propose
> MEMREMAP_RO and MEMREMAP_NX flags to Dan Williams's memremap() work
> once I get around to it.

That sounds good; I would certainly be in favour of that.

For some reason I thought the memremap arch changes had gone in for
v4.3, but I see that's not the case. I'll take a look around.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 0/3] remove UEFI reserved regions from the linear mapping
  2015-11-12 16:13     ` Mark Rutland
@ 2015-11-12 16:30       ` Ard Biesheuvel
  0 siblings, 0 replies; 13+ messages in thread
From: Ard Biesheuvel @ 2015-11-12 16:30 UTC (permalink / raw)
  To: linux-arm-kernel

On 12 November 2015 at 17:13, Mark Rutland <mark.rutland@arm.com> wrote:
> On Thu, Nov 12, 2015 at 05:01:19PM +0100, Ard Biesheuvel wrote:
>> On 12 November 2015 at 16:55, Mark Rutland <mark.rutland@arm.com> wrote:
>> > On Thu, Oct 29, 2015 at 02:40:56PM +0100, Ard Biesheuvel wrote:
[...]
>> >> Patch #1 slightly reorders the UEFI runtime services initialization routines
>> >> so that the EFI_MEMMAP flag is only set if the permanent mapping of the UEFI
>> >> memory map is in place.
>> >
>> > This also means that the memory map is mapped even with EFI runtime
>> > support disabled, but I guess that's not a big problem.
>> >
>>
>> Yes. You need that anyway if you are going to rely on it even when the
>> runtime services are disabled.
>
> Not with the MEMBLOCK_NOMAP approach.
>

No, you're right. And actually, it also avoids the whole page_is_ram()
problem, since all regions (including the NOMAP ones) are registered
as 'System RAM' ranges, so in that respect, nothing changes from the
ACPI/page_is_ram() pov

> I don't have a strong case for caring about that; I only imagine that
> being a problem if you're trying to track down extremely nasty memory
> corruption / bad pointer bugs and want the bare minimum VA space mapped.
> Even then the impact is minimal.
>

Indeed. And in general, I am in favour of not relying on the UEFI
memory map at all if we can avoid it, but use memblock as an
intermediate memory map that is equally available on !UEFI boots.

>> > As a side thought, it would be nice if we could memremap_ro the system
>> > table and memory map in future to prevent potential corruption, given
>> > they have fixed VAs and are always mapped.
>> >
>>
>> I agree, and I already have some local patches using
>> early_memremap_ro() for the EFI init code.
>
> Ah, nice!
>
>> memremap_ro() does not actually exist yet, but I intend to propose
>> MEMREMAP_RO and MEMREMAP_NX flags to Dan Williams's memremap() work
>> once I get around to it.
>
> That sounds good; I would certainly be in favour of that.
>
> For some reason I thought the memremap arch changes had gone in for
> v4.3, but I see that's not the case. I'll take a look around.
>

There are some complications, I think. On ARM, ioremap_cache()
disallows remapping of pfn_valid() pages, whereas arm64 short circuits
those to return the linear mapping directly (which, again for highmem
reasons, is not feasible on ARM)

Anyway, I am trying not to make changes on the arm64 side that are
difficult to port to ARM. MEMBLOCK_NOMAP works fine on both sides, so
perhaps I should just revert to using that instead of opening the
page_is_ram() can of worms.

-- 
Ard.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2015-11-12 16:30 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-29 13:40 [PATCH 0/3] remove UEFI reserved regions from the linear mapping Ard Biesheuvel
2015-10-29 13:40 ` [PATCH 1/3] arm64/efi: set EFI_MEMMAP bit only after mapping the memory map Ard Biesheuvel
2015-11-12 15:14   ` Matt Fleming
2015-10-29 13:40 ` [PATCH 2/3] arm64: reimplement page_is_ram() using memblock and UEFI " Ard Biesheuvel
2015-11-12 15:31   ` Matt Fleming
2015-11-12 15:40     ` Ard Biesheuvel
2015-11-12 16:03       ` Mark Rutland
2015-11-12 16:06         ` Ard Biesheuvel
2015-10-29 13:40 ` [PATCH 3/3] arm64/efi: memblock_remove() rather than _reserve UEFI reserved memory Ard Biesheuvel
2015-11-12 15:55 ` [PATCH 0/3] remove UEFI reserved regions from the linear mapping Mark Rutland
2015-11-12 16:01   ` Ard Biesheuvel
2015-11-12 16:13     ` Mark Rutland
2015-11-12 16:30       ` Ard Biesheuvel

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.