All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] arm64/efi: adapt to UEFI 2.5 properties table changes
@ 2015-06-30 10:17 ` Ard Biesheuvel
  0 siblings, 0 replies; 34+ messages in thread
From: Ard Biesheuvel @ 2015-06-30 10:17 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, mark.rutland-5wv7dgnIgG8
  Cc: leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A, msalter-H+wXaHxf7aLQT0dZR+AlfA,
	lersek-H+wXaHxf7aLQT0dZR+AlfA, Ard Biesheuvel

First of all, I am aware that it is not customary to send non-trivial series
during the merge window. However, since a parallel discussion is currently
taking place on the edk2-devel mailing list, I think it makes sense to make
an exception for this series.

Version 2.5 of the UEFI spec introduces a new Properties Table feature
that splits the memory regions covered by PE/COFF executable images
into regions with the appropriate permissions for the underlying segment
(i.e., RuntimeServicesCode/R-X for .text and RuntimeServiceData/rw- for
.data)

Unfortunately, this feature is built on the backwards incompatible assumption
that the OS always maps all RuntimeServicesCode and RuntimeServiceData regions
in a way that keeps adjacent code and data regions adjacent. Since this is
not what we are currently doing for arm64, some changes are required.

The first patch makes the mapping permission logic compliant with the spec,
by mapping all RuntimeServicesCode *and* RuntimeServicesData regions RWX,
(formerly, we were using RW- for data regions), unless any of the
EFI_MEMORY_RO and EFI_MEMORY_XP attributes are set, and the region is fully
aligned to the page size (which may not always be the case on 64k pages)

Then, in patch #2, we change the virtual remapping logic to keep adjacent
EFI_MEMORY_RUNTIME regions together. This requires us to sort the incoming
memory map, since the UEFI spec does not guarantee that it is sorted (although
it usually is).

This series applies on top of the patch that introduces the EFI_MEMORY_RO
region attribute, which can be found here:
http://article.gmane.org/gmane.linux.kernel.efi/5819

Ard Biesheuvel (2):
  arm64/efi: base UEFI mapping permissions on region attributes
  arm64/efi: don't pad between EFI_MEMORY_RUNTIME regions

 arch/arm64/kernel/efi.c                 | 32 +++++++----
 drivers/firmware/efi/libstub/arm-stub.c | 58 +++++++++++++++-----
 2 files changed, 64 insertions(+), 26 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 0/2] arm64/efi: adapt to UEFI 2.5 properties table changes
@ 2015-06-30 10:17 ` Ard Biesheuvel
  0 siblings, 0 replies; 34+ messages in thread
From: Ard Biesheuvel @ 2015-06-30 10:17 UTC (permalink / raw)
  To: linux-arm-kernel

First of all, I am aware that it is not customary to send non-trivial series
during the merge window. However, since a parallel discussion is currently
taking place on the edk2-devel mailing list, I think it makes sense to make
an exception for this series.

Version 2.5 of the UEFI spec introduces a new Properties Table feature
that splits the memory regions covered by PE/COFF executable images
into regions with the appropriate permissions for the underlying segment
(i.e., RuntimeServicesCode/R-X for .text and RuntimeServiceData/rw- for
.data)

Unfortunately, this feature is built on the backwards incompatible assumption
that the OS always maps all RuntimeServicesCode and RuntimeServiceData regions
in a way that keeps adjacent code and data regions adjacent. Since this is
not what we are currently doing for arm64, some changes are required.

The first patch makes the mapping permission logic compliant with the spec,
by mapping all RuntimeServicesCode *and* RuntimeServicesData regions RWX,
(formerly, we were using RW- for data regions), unless any of the
EFI_MEMORY_RO and EFI_MEMORY_XP attributes are set, and the region is fully
aligned to the page size (which may not always be the case on 64k pages)

Then, in patch #2, we change the virtual remapping logic to keep adjacent
EFI_MEMORY_RUNTIME regions together. This requires us to sort the incoming
memory map, since the UEFI spec does not guarantee that it is sorted (although
it usually is).

This series applies on top of the patch that introduces the EFI_MEMORY_RO
region attribute, which can be found here:
http://article.gmane.org/gmane.linux.kernel.efi/5819

Ard Biesheuvel (2):
  arm64/efi: base UEFI mapping permissions on region attributes
  arm64/efi: don't pad between EFI_MEMORY_RUNTIME regions

 arch/arm64/kernel/efi.c                 | 32 +++++++----
 drivers/firmware/efi/libstub/arm-stub.c | 58 +++++++++++++++-----
 2 files changed, 64 insertions(+), 26 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 1/2] arm64/efi: base UEFI mapping permissions on region attributes
  2015-06-30 10:17 ` Ard Biesheuvel
@ 2015-06-30 10:17     ` Ard Biesheuvel
  -1 siblings, 0 replies; 34+ messages in thread
From: Ard Biesheuvel @ 2015-06-30 10:17 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, mark.rutland-5wv7dgnIgG8
  Cc: leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A, msalter-H+wXaHxf7aLQT0dZR+AlfA,
	lersek-H+wXaHxf7aLQT0dZR+AlfA, Ard Biesheuvel

Currently, we infer the UEFI memory region mapping permissions
from the memory region type (i.e., runtime services code are
mapped RWX and runtime services data mapped RW-). This appears to
work fine but is not entirely UEFI spec compliant. So instead, use
the designated permission attributes to decide how these regions
should be mapped.

Since UEFIv2.5 introduces a new EFI_MEMORY_RO permission attribute,
and redefines EFI_MEMORY_WP as a cacheability attribute, use only
the former as a read-only attribute. For setting the PXN bit, the
corresponding EFI_MEMORY_XP attribute is used.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
---
 arch/arm64/kernel/efi.c | 32 +++++++++++++-------
 1 file changed, 21 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
index ab21e0d58278..5dcab58d5d30 100644
--- a/arch/arm64/kernel/efi.c
+++ b/arch/arm64/kernel/efi.c
@@ -247,20 +247,30 @@ static bool __init efi_virtmap_init(void)
 		memrange_efi_to_native(&paddr, &npages);
 		size = npages << PAGE_SHIFT;
 
-		pr_info("  EFI remap 0x%016llx => %p\n",
-			md->phys_addr, (void *)md->virt_addr);
-
-		/*
-		 * Only regions of type EFI_RUNTIME_SERVICES_CODE need to be
-		 * executable, everything else can be mapped with the XN bits
-		 * set.
-		 */
 		if (!is_normal_ram(md))
 			prot = __pgprot(PROT_DEVICE_nGnRE);
-		else if (md->type == EFI_RUNTIME_SERVICES_CODE)
-			prot = PAGE_KERNEL_EXEC;
 		else
-			prot = PAGE_KERNEL;
+			prot = PAGE_KERNEL_EXEC;
+
+		/*
+		 * On 64 KB granule kernels, only use strict permissions when
+		 * the region does not share a 64 KB page frame with another
+		 * region at either end.
+		 */
+		if (!IS_ENABLED(CONFIG_ARM64_64K_PAGES) ||
+		    !(md->virt_addr % PAGE_SIZE ||
+		      (md->phys_addr + md->num_pages * EFI_PAGE_SIZE) % PAGE_SIZE)) {
+
+			if (md->attribute & EFI_MEMORY_RO)
+				prot |= __pgprot(PTE_RDONLY);
+			if (md->attribute & EFI_MEMORY_XP)
+				prot |= __pgprot(PTE_PXN);
+		}
+
+		pr_info("  EFI remap 0x%016llx => %p (R%c%c)\n",
+			md->phys_addr, (void *)md->virt_addr,
+			prot & __pgprot(PTE_RDONLY) ? '-' : 'W',
+			prot & __pgprot(PTE_PXN) ? '-' : 'X');
 
 		create_pgd_mapping(&efi_mm, paddr, md->virt_addr, size, prot);
 	}
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 1/2] arm64/efi: base UEFI mapping permissions on region attributes
@ 2015-06-30 10:17     ` Ard Biesheuvel
  0 siblings, 0 replies; 34+ messages in thread
From: Ard Biesheuvel @ 2015-06-30 10:17 UTC (permalink / raw)
  To: linux-arm-kernel

Currently, we infer the UEFI memory region mapping permissions
from the memory region type (i.e., runtime services code are
mapped RWX and runtime services data mapped RW-). This appears to
work fine but is not entirely UEFI spec compliant. So instead, use
the designated permission attributes to decide how these regions
should be mapped.

Since UEFIv2.5 introduces a new EFI_MEMORY_RO permission attribute,
and redefines EFI_MEMORY_WP as a cacheability attribute, use only
the former as a read-only attribute. For setting the PXN bit, the
corresponding EFI_MEMORY_XP attribute is used.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 arch/arm64/kernel/efi.c | 32 +++++++++++++-------
 1 file changed, 21 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
index ab21e0d58278..5dcab58d5d30 100644
--- a/arch/arm64/kernel/efi.c
+++ b/arch/arm64/kernel/efi.c
@@ -247,20 +247,30 @@ static bool __init efi_virtmap_init(void)
 		memrange_efi_to_native(&paddr, &npages);
 		size = npages << PAGE_SHIFT;
 
-		pr_info("  EFI remap 0x%016llx => %p\n",
-			md->phys_addr, (void *)md->virt_addr);
-
-		/*
-		 * Only regions of type EFI_RUNTIME_SERVICES_CODE need to be
-		 * executable, everything else can be mapped with the XN bits
-		 * set.
-		 */
 		if (!is_normal_ram(md))
 			prot = __pgprot(PROT_DEVICE_nGnRE);
-		else if (md->type == EFI_RUNTIME_SERVICES_CODE)
-			prot = PAGE_KERNEL_EXEC;
 		else
-			prot = PAGE_KERNEL;
+			prot = PAGE_KERNEL_EXEC;
+
+		/*
+		 * On 64 KB granule kernels, only use strict permissions when
+		 * the region does not share a 64 KB page frame with another
+		 * region at either end.
+		 */
+		if (!IS_ENABLED(CONFIG_ARM64_64K_PAGES) ||
+		    !(md->virt_addr % PAGE_SIZE ||
+		      (md->phys_addr + md->num_pages * EFI_PAGE_SIZE) % PAGE_SIZE)) {
+
+			if (md->attribute & EFI_MEMORY_RO)
+				prot |= __pgprot(PTE_RDONLY);
+			if (md->attribute & EFI_MEMORY_XP)
+				prot |= __pgprot(PTE_PXN);
+		}
+
+		pr_info("  EFI remap 0x%016llx => %p (R%c%c)\n",
+			md->phys_addr, (void *)md->virt_addr,
+			prot & __pgprot(PTE_RDONLY) ? '-' : 'W',
+			prot & __pgprot(PTE_PXN) ? '-' : 'X');
 
 		create_pgd_mapping(&efi_mm, paddr, md->virt_addr, size, prot);
 	}
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 2/2] arm64/efi: don't pad between EFI_MEMORY_RUNTIME regions
  2015-06-30 10:17 ` Ard Biesheuvel
@ 2015-06-30 10:17     ` Ard Biesheuvel
  -1 siblings, 0 replies; 34+ messages in thread
From: Ard Biesheuvel @ 2015-06-30 10:17 UTC (permalink / raw)
  To: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, mark.rutland-5wv7dgnIgG8
  Cc: leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A, msalter-H+wXaHxf7aLQT0dZR+AlfA,
	lersek-H+wXaHxf7aLQT0dZR+AlfA, Ard Biesheuvel

The new Properties Table feature introduced in UEFIv2.5 may split
memory regions that cover PE/COFF memory images into separate code
and data regions.

Since the relative offset of PE/COFF .text and .data segments cannot
be changed on the fly, this means that we can no longer pad out those
regions to be mappable using 64 KB pages.
Unfortunately, there is no annotation in the UEFI memory map that
identifies data regions that were split off from a code region, so we
must apply this logic to all runtime code and data regions.

So instead of rounding each memory region to 64 KB alignment at both
ends, only round down regions that are not directly preceded by another
runtime region. Since the UEFI spec does not mandate that the memory map
be sorted, this means we also need to sort it first.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
---
 drivers/firmware/efi/libstub/arm-stub.c | 58 +++++++++++++++-----
 1 file changed, 43 insertions(+), 15 deletions(-)

diff --git a/drivers/firmware/efi/libstub/arm-stub.c b/drivers/firmware/efi/libstub/arm-stub.c
index e29560e6b40b..dd9e2addc9fb 100644
--- a/drivers/firmware/efi/libstub/arm-stub.c
+++ b/drivers/firmware/efi/libstub/arm-stub.c
@@ -13,6 +13,7 @@
  */
 
 #include <linux/efi.h>
+#include <linux/sort.h>
 #include <asm/efi.h>
 
 #include "efistub.h"
@@ -305,6 +306,13 @@ fail:
  */
 #define EFI_RT_VIRTUAL_BASE	0x40000000
 
+static int cmp_mem_desc(const void *a, const void *b)
+{
+	efi_memory_desc_t const *left = a, *right = b;
+
+	return (left->phys_addr > right->phys_addr) ? 1 : -1;
+}
+
 /*
  * efi_get_virtmap() - create a virtual mapping for the EFI memory map
  *
@@ -317,33 +325,53 @@ void efi_get_virtmap(efi_memory_desc_t *memory_map, unsigned long map_size,
 		     int *count)
 {
 	u64 efi_virt_base = EFI_RT_VIRTUAL_BASE;
-	efi_memory_desc_t *out = runtime_map;
+	efi_memory_desc_t *in, *prev = NULL, *out = runtime_map;
 	int l;
 
-	for (l = 0; l < map_size; l += desc_size) {
-		efi_memory_desc_t *in = (void *)memory_map + l;
+	/*
+	 * To work around potential issues with the Properties Table feature
+	 * introduced in UEFI 2.5 (which may split PE/COFF memory images
+	 * into several RuntimeServicesCode and RuntimeServicesData regions
+	 * whose relative offset in memory needs to be retained), we need to
+	 * sort the memory map before traversing it, and avoid padding out those
+	 * regions to 64 KB granularity.
+	 */
+	sort(memory_map, map_size / desc_size, desc_size, cmp_mem_desc, NULL);
+
+	for (l = 0; l < map_size; l += desc_size, prev = in) {
 		u64 paddr, size;
 
+		in = (void *)memory_map + l;
 		if (!(in->attribute & EFI_MEMORY_RUNTIME))
 			continue;
 
+		paddr = in->phys_addr;
+		size = in->num_pages * EFI_PAGE_SIZE;
+
 		/*
 		 * Make the mapping compatible with 64k pages: this allows
 		 * a 4k page size kernel to kexec a 64k page size kernel and
 		 * vice versa.
 		 */
-		paddr = round_down(in->phys_addr, SZ_64K);
-		size = round_up(in->num_pages * EFI_PAGE_SIZE +
-				in->phys_addr - paddr, SZ_64K);
-
-		/*
-		 * Avoid wasting memory on PTEs by choosing a virtual base that
-		 * is compatible with section mappings if this region has the
-		 * appropriate size and physical alignment. (Sections are 2 MB
-		 * on 4k granule kernels)
-		 */
-		if (IS_ALIGNED(in->phys_addr, SZ_2M) && size >= SZ_2M)
-			efi_virt_base = round_up(efi_virt_base, SZ_2M);
+		if (!prev ||
+		    !(paddr == (prev->phys_addr +
+				prev->num_pages * EFI_PAGE_SIZE) &&
+		      (prev->attribute & EFI_MEMORY_RUNTIME))) {
+
+			paddr = round_down(in->phys_addr, SZ_64K);
+			size += in->phys_addr - paddr;
+
+			/*
+			 * Avoid wasting memory on PTEs by choosing a virtual
+			 * base that is compatible with section mappings if this
+			 * region has the appropriate size and physical
+			 * alignment. (Sections are 2 MB on 4k granule kernels)
+			 */
+			if (IS_ALIGNED(in->phys_addr, SZ_2M) && size >= SZ_2M)
+				efi_virt_base = round_up(efi_virt_base, SZ_2M);
+			else
+				efi_virt_base = round_up(efi_virt_base, SZ_64K);
+		}
 
 		in->virt_addr = efi_virt_base + in->phys_addr - paddr;
 		efi_virt_base += size;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 2/2] arm64/efi: don't pad between EFI_MEMORY_RUNTIME regions
@ 2015-06-30 10:17     ` Ard Biesheuvel
  0 siblings, 0 replies; 34+ messages in thread
From: Ard Biesheuvel @ 2015-06-30 10:17 UTC (permalink / raw)
  To: linux-arm-kernel

The new Properties Table feature introduced in UEFIv2.5 may split
memory regions that cover PE/COFF memory images into separate code
and data regions.

Since the relative offset of PE/COFF .text and .data segments cannot
be changed on the fly, this means that we can no longer pad out those
regions to be mappable using 64 KB pages.
Unfortunately, there is no annotation in the UEFI memory map that
identifies data regions that were split off from a code region, so we
must apply this logic to all runtime code and data regions.

So instead of rounding each memory region to 64 KB alignment at both
ends, only round down regions that are not directly preceded by another
runtime region. Since the UEFI spec does not mandate that the memory map
be sorted, this means we also need to sort it first.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
---
 drivers/firmware/efi/libstub/arm-stub.c | 58 +++++++++++++++-----
 1 file changed, 43 insertions(+), 15 deletions(-)

diff --git a/drivers/firmware/efi/libstub/arm-stub.c b/drivers/firmware/efi/libstub/arm-stub.c
index e29560e6b40b..dd9e2addc9fb 100644
--- a/drivers/firmware/efi/libstub/arm-stub.c
+++ b/drivers/firmware/efi/libstub/arm-stub.c
@@ -13,6 +13,7 @@
  */
 
 #include <linux/efi.h>
+#include <linux/sort.h>
 #include <asm/efi.h>
 
 #include "efistub.h"
@@ -305,6 +306,13 @@ fail:
  */
 #define EFI_RT_VIRTUAL_BASE	0x40000000
 
+static int cmp_mem_desc(const void *a, const void *b)
+{
+	efi_memory_desc_t const *left = a, *right = b;
+
+	return (left->phys_addr > right->phys_addr) ? 1 : -1;
+}
+
 /*
  * efi_get_virtmap() - create a virtual mapping for the EFI memory map
  *
@@ -317,33 +325,53 @@ void efi_get_virtmap(efi_memory_desc_t *memory_map, unsigned long map_size,
 		     int *count)
 {
 	u64 efi_virt_base = EFI_RT_VIRTUAL_BASE;
-	efi_memory_desc_t *out = runtime_map;
+	efi_memory_desc_t *in, *prev = NULL, *out = runtime_map;
 	int l;
 
-	for (l = 0; l < map_size; l += desc_size) {
-		efi_memory_desc_t *in = (void *)memory_map + l;
+	/*
+	 * To work around potential issues with the Properties Table feature
+	 * introduced in UEFI 2.5 (which may split PE/COFF memory images
+	 * into several RuntimeServicesCode and RuntimeServicesData regions
+	 * whose relative offset in memory needs to be retained), we need to
+	 * sort the memory map before traversing it, and avoid padding out those
+	 * regions to 64 KB granularity.
+	 */
+	sort(memory_map, map_size / desc_size, desc_size, cmp_mem_desc, NULL);
+
+	for (l = 0; l < map_size; l += desc_size, prev = in) {
 		u64 paddr, size;
 
+		in = (void *)memory_map + l;
 		if (!(in->attribute & EFI_MEMORY_RUNTIME))
 			continue;
 
+		paddr = in->phys_addr;
+		size = in->num_pages * EFI_PAGE_SIZE;
+
 		/*
 		 * Make the mapping compatible with 64k pages: this allows
 		 * a 4k page size kernel to kexec a 64k page size kernel and
 		 * vice versa.
 		 */
-		paddr = round_down(in->phys_addr, SZ_64K);
-		size = round_up(in->num_pages * EFI_PAGE_SIZE +
-				in->phys_addr - paddr, SZ_64K);
-
-		/*
-		 * Avoid wasting memory on PTEs by choosing a virtual base that
-		 * is compatible with section mappings if this region has the
-		 * appropriate size and physical alignment. (Sections are 2 MB
-		 * on 4k granule kernels)
-		 */
-		if (IS_ALIGNED(in->phys_addr, SZ_2M) && size >= SZ_2M)
-			efi_virt_base = round_up(efi_virt_base, SZ_2M);
+		if (!prev ||
+		    !(paddr == (prev->phys_addr +
+				prev->num_pages * EFI_PAGE_SIZE) &&
+		      (prev->attribute & EFI_MEMORY_RUNTIME))) {
+
+			paddr = round_down(in->phys_addr, SZ_64K);
+			size += in->phys_addr - paddr;
+
+			/*
+			 * Avoid wasting memory on PTEs by choosing a virtual
+			 * base that is compatible with section mappings if this
+			 * region has the appropriate size and physical
+			 * alignment. (Sections are 2 MB on 4k granule kernels)
+			 */
+			if (IS_ALIGNED(in->phys_addr, SZ_2M) && size >= SZ_2M)
+				efi_virt_base = round_up(efi_virt_base, SZ_2M);
+			else
+				efi_virt_base = round_up(efi_virt_base, SZ_64K);
+		}
 
 		in->virt_addr = efi_virt_base + in->phys_addr - paddr;
 		efi_virt_base += size;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/2] arm64/efi: base UEFI mapping permissions on region attributes
  2015-06-30 10:17     ` Ard Biesheuvel
@ 2015-06-30 14:50         ` Mark Salter
  -1 siblings, 0 replies; 34+ messages in thread
From: Mark Salter @ 2015-06-30 14:50 UTC (permalink / raw)
  To: Ard Biesheuvel,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA,
	matt.fleming-ral2JQCrhuEAvxtiuMwx3w, mark.rutland-5wv7dgnIgG8
  Cc: leif.lindholm-QSEj5FYQhm4dnm+yROfE0A,
	roy.franz-QSEj5FYQhm4dnm+yROfE0A, lersek-H+wXaHxf7aLQT0dZR+AlfA

On Tue, 2015-06-30 at 12:17 +0200, Ard Biesheuvel wrote:

> Currently, we infer the UEFI memory region mapping permissions
> from the memory region type (i.e., runtime services code are
> mapped RWX and runtime services data mapped RW-). This appears to
> work fine but is not entirely UEFI spec compliant. So instead, use
> the designated permission attributes to decide how these regions
> should be mapped.
> 
> Since UEFIv2.5 introduces a new EFI_MEMORY_RO permission attribute,
> and redefines EFI_MEMORY_WP as a cacheability attribute, use only
> the former as a read-only attribute. For setting the PXN bit, the
> corresponding EFI_MEMORY_XP attribute is used.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
> ---
>  arch/arm64/kernel/efi.c | 32 +++++++++++++-------
>  1 file changed, 21 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
> index ab21e0d58278..5dcab58d5d30 100644
> --- a/arch/arm64/kernel/efi.c
> +++ b/arch/arm64/kernel/efi.c
> @@ -247,20 +247,30 @@ static bool __init efi_virtmap_init(void)
>  		memrange_efi_to_native(&paddr, &npages);
>  		size = npages << PAGE_SHIFT;
>  
> -		pr_info("  EFI remap 0x%016llx => %p\n",
> -			md->phys_addr, (void *)md->virt_addr);
> -
> -		/*
> -		 * Only regions of type EFI_RUNTIME_SERVICES_CODE need to be
> -		 * executable, everything else can be mapped with the XN bits
> -		 * set.
> -		 */
>  		if (!is_normal_ram(md))
>  			prot = __pgprot(PROT_DEVICE_nGnRE);
> -		else if (md->type == EFI_RUNTIME_SERVICES_CODE)
> -			prot = PAGE_KERNEL_EXEC;
>  		else
> -			prot = PAGE_KERNEL;
> +			prot = PAGE_KERNEL_EXEC;
> +
> +		/*
> +		 * On 64 KB granule kernels, only use strict permissions when
> +		 * the region does not share a 64 KB page frame with another
> +		 * region at either end.
> +		 */
> +		if (!IS_ENABLED(CONFIG_ARM64_64K_PAGES) ||
> +		    !(md->virt_addr % PAGE_SIZE ||
> +		      (md->phys_addr + md->num_pages * EFI_PAGE_SIZE) % PAGE_SIZE)) {

I think this would read easier with:

		    (PAGE_ALIGNED(md->virt_addr) &&
		      PAGE_ALIGNED(md->phys_addr + md->num_pages * EFI_PAGE_SIZE))) {

> +
> +			if (md->attribute & EFI_MEMORY_RO)
> +				prot |= __pgprot(PTE_RDONLY);
> +			if (md->attribute & EFI_MEMORY_XP)
> +				prot |= __pgprot(PTE_PXN);
> +		}
> +
> +		pr_info("  EFI remap 0x%016llx => %p (R%c%c)\n",
> +			md->phys_addr, (void *)md->virt_addr,
> +			prot & __pgprot(PTE_RDONLY) ? '-' : 'W',
> +			prot & __pgprot(PTE_PXN) ? '-' : 'X');

You can't maninulate pgprot_t directly like that. It will
break STRICT_MM_TYPECHECKS. You need to use __pgprot_modify()
and/or pgprot_val().

arch/arm64/kernel/efi.c: In function ‘efi_virtmap_init’:
arch/arm64/kernel/efi.c:266:10: error: invalid operands to binary | (have ‘pgprot_t’ and ‘pgprot_t’)
     prot |= __pgprot(PTE_RDONLY);
          ^
   ...
   
(In trying that, I see there are a number of other places which
need some STRICT_MM_TYPECHECKS fixing)

>  
>  		create_pgd_mapping(&efi_mm, paddr, md->virt_addr, size, prot);
>  	}

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 1/2] arm64/efi: base UEFI mapping permissions on region attributes
@ 2015-06-30 14:50         ` Mark Salter
  0 siblings, 0 replies; 34+ messages in thread
From: Mark Salter @ 2015-06-30 14:50 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 2015-06-30 at 12:17 +0200, Ard Biesheuvel wrote:

> Currently, we infer the UEFI memory region mapping permissions
> from the memory region type (i.e., runtime services code are
> mapped RWX and runtime services data mapped RW-). This appears to
> work fine but is not entirely UEFI spec compliant. So instead, use
> the designated permission attributes to decide how these regions
> should be mapped.
> 
> Since UEFIv2.5 introduces a new EFI_MEMORY_RO permission attribute,
> and redefines EFI_MEMORY_WP as a cacheability attribute, use only
> the former as a read-only attribute. For setting the PXN bit, the
> corresponding EFI_MEMORY_XP attribute is used.
> 
> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> ---
>  arch/arm64/kernel/efi.c | 32 +++++++++++++-------
>  1 file changed, 21 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
> index ab21e0d58278..5dcab58d5d30 100644
> --- a/arch/arm64/kernel/efi.c
> +++ b/arch/arm64/kernel/efi.c
> @@ -247,20 +247,30 @@ static bool __init efi_virtmap_init(void)
>  		memrange_efi_to_native(&paddr, &npages);
>  		size = npages << PAGE_SHIFT;
>  
> -		pr_info("  EFI remap 0x%016llx => %p\n",
> -			md->phys_addr, (void *)md->virt_addr);
> -
> -		/*
> -		 * Only regions of type EFI_RUNTIME_SERVICES_CODE need to be
> -		 * executable, everything else can be mapped with the XN bits
> -		 * set.
> -		 */
>  		if (!is_normal_ram(md))
>  			prot = __pgprot(PROT_DEVICE_nGnRE);
> -		else if (md->type == EFI_RUNTIME_SERVICES_CODE)
> -			prot = PAGE_KERNEL_EXEC;
>  		else
> -			prot = PAGE_KERNEL;
> +			prot = PAGE_KERNEL_EXEC;
> +
> +		/*
> +		 * On 64 KB granule kernels, only use strict permissions when
> +		 * the region does not share a 64 KB page frame with another
> +		 * region at either end.
> +		 */
> +		if (!IS_ENABLED(CONFIG_ARM64_64K_PAGES) ||
> +		    !(md->virt_addr % PAGE_SIZE ||
> +		      (md->phys_addr + md->num_pages * EFI_PAGE_SIZE) % PAGE_SIZE)) {

I think this would read easier with:

		    (PAGE_ALIGNED(md->virt_addr) &&
		      PAGE_ALIGNED(md->phys_addr + md->num_pages * EFI_PAGE_SIZE))) {

> +
> +			if (md->attribute & EFI_MEMORY_RO)
> +				prot |= __pgprot(PTE_RDONLY);
> +			if (md->attribute & EFI_MEMORY_XP)
> +				prot |= __pgprot(PTE_PXN);
> +		}
> +
> +		pr_info("  EFI remap 0x%016llx => %p (R%c%c)\n",
> +			md->phys_addr, (void *)md->virt_addr,
> +			prot & __pgprot(PTE_RDONLY) ? '-' : 'W',
> +			prot & __pgprot(PTE_PXN) ? '-' : 'X');

You can't maninulate pgprot_t directly like that. It will
break STRICT_MM_TYPECHECKS. You need to use __pgprot_modify()
and/or pgprot_val().

arch/arm64/kernel/efi.c: In function ?efi_virtmap_init?:
arch/arm64/kernel/efi.c:266:10: error: invalid operands to binary | (have ?pgprot_t? and ?pgprot_t?)
     prot |= __pgprot(PTE_RDONLY);
          ^
   ...
   
(In trying that, I see there are a number of other places which
need some STRICT_MM_TYPECHECKS fixing)

>  
>  		create_pgd_mapping(&efi_mm, paddr, md->virt_addr, size, prot);
>  	}

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/2] arm64/efi: base UEFI mapping permissions on region attributes
  2015-06-30 14:50         ` Mark Salter
@ 2015-06-30 14:53             ` Ard Biesheuvel
  -1 siblings, 0 replies; 34+ messages in thread
From: Ard Biesheuvel @ 2015-06-30 14:53 UTC (permalink / raw)
  To: Mark Salter
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-efi-u79uwXL29TY76Z2rM5mHXA, Matt Fleming, Mark Rutland,
	Leif Lindholm, Roy Franz, Laszlo Ersek

On 30 June 2015 at 16:50, Mark Salter <msalter-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
> On Tue, 2015-06-30 at 12:17 +0200, Ard Biesheuvel wrote:
>
>> Currently, we infer the UEFI memory region mapping permissions
>> from the memory region type (i.e., runtime services code are
>> mapped RWX and runtime services data mapped RW-). This appears to
>> work fine but is not entirely UEFI spec compliant. So instead, use
>> the designated permission attributes to decide how these regions
>> should be mapped.
>>
>> Since UEFIv2.5 introduces a new EFI_MEMORY_RO permission attribute,
>> and redefines EFI_MEMORY_WP as a cacheability attribute, use only
>> the former as a read-only attribute. For setting the PXN bit, the
>> corresponding EFI_MEMORY_XP attribute is used.
>>
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
>> ---
>>  arch/arm64/kernel/efi.c | 32 +++++++++++++-------
>>  1 file changed, 21 insertions(+), 11 deletions(-)
>>
>> diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
>> index ab21e0d58278..5dcab58d5d30 100644
>> --- a/arch/arm64/kernel/efi.c
>> +++ b/arch/arm64/kernel/efi.c
>> @@ -247,20 +247,30 @@ static bool __init efi_virtmap_init(void)
>>               memrange_efi_to_native(&paddr, &npages);
>>               size = npages << PAGE_SHIFT;
>>
>> -             pr_info("  EFI remap 0x%016llx => %p\n",
>> -                     md->phys_addr, (void *)md->virt_addr);
>> -
>> -             /*
>> -              * Only regions of type EFI_RUNTIME_SERVICES_CODE need to be
>> -              * executable, everything else can be mapped with the XN bits
>> -              * set.
>> -              */
>>               if (!is_normal_ram(md))
>>                       prot = __pgprot(PROT_DEVICE_nGnRE);
>> -             else if (md->type == EFI_RUNTIME_SERVICES_CODE)
>> -                     prot = PAGE_KERNEL_EXEC;
>>               else
>> -                     prot = PAGE_KERNEL;
>> +                     prot = PAGE_KERNEL_EXEC;
>> +
>> +             /*
>> +              * On 64 KB granule kernels, only use strict permissions when
>> +              * the region does not share a 64 KB page frame with another
>> +              * region at either end.
>> +              */
>> +             if (!IS_ENABLED(CONFIG_ARM64_64K_PAGES) ||
>> +                 !(md->virt_addr % PAGE_SIZE ||
>> +                   (md->phys_addr + md->num_pages * EFI_PAGE_SIZE) % PAGE_SIZE)) {
>
> I think this would read easier with:
>
>                     (PAGE_ALIGNED(md->virt_addr) &&
>                       PAGE_ALIGNED(md->phys_addr + md->num_pages * EFI_PAGE_SIZE))) {
>

Yes, good point, I will change that.

>> +
>> +                     if (md->attribute & EFI_MEMORY_RO)
>> +                             prot |= __pgprot(PTE_RDONLY);
>> +                     if (md->attribute & EFI_MEMORY_XP)
>> +                             prot |= __pgprot(PTE_PXN);
>> +             }
>> +
>> +             pr_info("  EFI remap 0x%016llx => %p (R%c%c)\n",
>> +                     md->phys_addr, (void *)md->virt_addr,
>> +                     prot & __pgprot(PTE_RDONLY) ? '-' : 'W',
>> +                     prot & __pgprot(PTE_PXN) ? '-' : 'X');
>
> You can't maninulate pgprot_t directly like that. It will
> break STRICT_MM_TYPECHECKS. You need to use __pgprot_modify()
> and/or pgprot_val().
>
> arch/arm64/kernel/efi.c: In function ‘efi_virtmap_init’:
> arch/arm64/kernel/efi.c:266:10: error: invalid operands to binary | (have ‘pgprot_t’ and ‘pgprot_t’)
>      prot |= __pgprot(PTE_RDONLY);
>           ^
>    ...
>
> (In trying that, I see there are a number of other places which
> need some STRICT_MM_TYPECHECKS fixing)
>

Actually, I had 'prot |= PTE_RDONLY' but then changed it to the above
thinking that it would pass the strict type checks, but apparently not
:-)

I will fix that up as well.

Thanks,
Ard.


>>
>>               create_pgd_mapping(&efi_mm, paddr, md->virt_addr, size, prot);
>>       }
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 1/2] arm64/efi: base UEFI mapping permissions on region attributes
@ 2015-06-30 14:53             ` Ard Biesheuvel
  0 siblings, 0 replies; 34+ messages in thread
From: Ard Biesheuvel @ 2015-06-30 14:53 UTC (permalink / raw)
  To: linux-arm-kernel

On 30 June 2015 at 16:50, Mark Salter <msalter@redhat.com> wrote:
> On Tue, 2015-06-30 at 12:17 +0200, Ard Biesheuvel wrote:
>
>> Currently, we infer the UEFI memory region mapping permissions
>> from the memory region type (i.e., runtime services code are
>> mapped RWX and runtime services data mapped RW-). This appears to
>> work fine but is not entirely UEFI spec compliant. So instead, use
>> the designated permission attributes to decide how these regions
>> should be mapped.
>>
>> Since UEFIv2.5 introduces a new EFI_MEMORY_RO permission attribute,
>> and redefines EFI_MEMORY_WP as a cacheability attribute, use only
>> the former as a read-only attribute. For setting the PXN bit, the
>> corresponding EFI_MEMORY_XP attribute is used.
>>
>> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>> ---
>>  arch/arm64/kernel/efi.c | 32 +++++++++++++-------
>>  1 file changed, 21 insertions(+), 11 deletions(-)
>>
>> diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
>> index ab21e0d58278..5dcab58d5d30 100644
>> --- a/arch/arm64/kernel/efi.c
>> +++ b/arch/arm64/kernel/efi.c
>> @@ -247,20 +247,30 @@ static bool __init efi_virtmap_init(void)
>>               memrange_efi_to_native(&paddr, &npages);
>>               size = npages << PAGE_SHIFT;
>>
>> -             pr_info("  EFI remap 0x%016llx => %p\n",
>> -                     md->phys_addr, (void *)md->virt_addr);
>> -
>> -             /*
>> -              * Only regions of type EFI_RUNTIME_SERVICES_CODE need to be
>> -              * executable, everything else can be mapped with the XN bits
>> -              * set.
>> -              */
>>               if (!is_normal_ram(md))
>>                       prot = __pgprot(PROT_DEVICE_nGnRE);
>> -             else if (md->type == EFI_RUNTIME_SERVICES_CODE)
>> -                     prot = PAGE_KERNEL_EXEC;
>>               else
>> -                     prot = PAGE_KERNEL;
>> +                     prot = PAGE_KERNEL_EXEC;
>> +
>> +             /*
>> +              * On 64 KB granule kernels, only use strict permissions when
>> +              * the region does not share a 64 KB page frame with another
>> +              * region at either end.
>> +              */
>> +             if (!IS_ENABLED(CONFIG_ARM64_64K_PAGES) ||
>> +                 !(md->virt_addr % PAGE_SIZE ||
>> +                   (md->phys_addr + md->num_pages * EFI_PAGE_SIZE) % PAGE_SIZE)) {
>
> I think this would read easier with:
>
>                     (PAGE_ALIGNED(md->virt_addr) &&
>                       PAGE_ALIGNED(md->phys_addr + md->num_pages * EFI_PAGE_SIZE))) {
>

Yes, good point, I will change that.

>> +
>> +                     if (md->attribute & EFI_MEMORY_RO)
>> +                             prot |= __pgprot(PTE_RDONLY);
>> +                     if (md->attribute & EFI_MEMORY_XP)
>> +                             prot |= __pgprot(PTE_PXN);
>> +             }
>> +
>> +             pr_info("  EFI remap 0x%016llx => %p (R%c%c)\n",
>> +                     md->phys_addr, (void *)md->virt_addr,
>> +                     prot & __pgprot(PTE_RDONLY) ? '-' : 'W',
>> +                     prot & __pgprot(PTE_PXN) ? '-' : 'X');
>
> You can't maninulate pgprot_t directly like that. It will
> break STRICT_MM_TYPECHECKS. You need to use __pgprot_modify()
> and/or pgprot_val().
>
> arch/arm64/kernel/efi.c: In function ?efi_virtmap_init?:
> arch/arm64/kernel/efi.c:266:10: error: invalid operands to binary | (have ?pgprot_t? and ?pgprot_t?)
>      prot |= __pgprot(PTE_RDONLY);
>           ^
>    ...
>
> (In trying that, I see there are a number of other places which
> need some STRICT_MM_TYPECHECKS fixing)
>

Actually, I had 'prot |= PTE_RDONLY' but then changed it to the above
thinking that it would pass the strict type checks, but apparently not
:-)

I will fix that up as well.

Thanks,
Ard.


>>
>>               create_pgd_mapping(&efi_mm, paddr, md->virt_addr, size, prot);
>>       }
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/2] arm64/efi: Don't pad between EFI_MEMORY_RUNTIME regions
  2015-09-30  1:03         ` H. Peter Anvin
  2015-09-30  1:16           ` Andy Lutomirski
@ 2015-10-01 10:44           ` Ingo Molnar
  1 sibling, 0 replies; 34+ messages in thread
From: Ingo Molnar @ 2015-10-01 10:44 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Ard Biesheuvel, Matt Fleming, Thomas Gleixner, linux-kernel,
	linux-efi, Leif Lindholm, Catalin Marinas, Will Deacon, stable,
	Matt Fleming, Mark Rutland, Mark Salter, Linus Torvalds,
	Andrew Morton, Andy Lutomirski, Borislav Petkov, Denys Vlasenko,
	Brian Gerst


* H. Peter Anvin <hpa@zytor.com> wrote:

> On 09/27/2015 12:06 AM, Ingo Molnar wrote:
> > 
> > * Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> > 
> >>> If we allocate the EFI runtime as a single virtual memory block then issues 
> >>> like rounding between sections does not even come up as a problem: we map the 
> >>> original offsets and sizes byte by byte.
> >>
> >> Well, by that reasoning, we should not call SetVirtualAddressMap() in the first 
> >> place, and just use the 1:1 mapping UEFI uses natively. This is more than 
> >> feasible on arm64, and I actually fought hard against using 
> >> SetVirtualAddressMap() at all, but I was overruled by others. I think this is 
> >> also trivially possible on X64, since the 1:1 mapping is already active 
> >> alongside the VA mapping.
> > 
> > Could we please re-list all the arguments pro and contra of 1:1 physical mappings, 
> > in a post that also explains the background so that more people can chime in, not 
> > just people versed in EFI internals? It's very much possible that a bad decision 
> > was made.
> > 
> 
> Pro: by far the sanest way to map the UEFI tables.
> Con: doesn't actually work (breaks on several known platforms.)

You knew this next question was coming: in what way does it break on known 
platforms?

I.e. do those platforms require a SetVirtualAddressMap() call and break if one 
does not come?

Note that there's 3 models possible:

 - pure 1:1
 - 1:1 plus offset, with SetVirtualAddressMap(offset)
 - bottom up allocator

I don't think we want 'pure' 1:1 physical/virtual (for security reasons, etc.).

So the question is, in what way does our current proposed bottom-up allocator 
differ from 1:1 plus offset? My impression is that they are mostly identical.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/2] arm64/efi: Don't pad between EFI_MEMORY_RUNTIME regions
@ 2015-09-30  8:33                 ` Borislav Petkov
  0 siblings, 0 replies; 34+ messages in thread
From: Borislav Petkov @ 2015-09-30  8:33 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Matt Fleming, Ingo Molnar, Ard Biesheuvel, Thomas Gleixner,
	linux-kernel, linux-efi, Leif Lindholm, Catalin Marinas,
	Will Deacon, stable, Matt Fleming, Mark Rutland, Mark Salter,
	Linus Torvalds, Andrew Morton, Andy Lutomirski, Denys Vlasenko,
	Brian Gerst

On Tue, Sep 29, 2015 at 05:56:07PM -0700, H. Peter Anvin wrote:
> On 09/29/2015 07:36 AM, Matt Fleming wrote:
> >  
> > That's a pretty good summary for x86. I think specifically the reason
> > we map the EFI memmap entries "backwards" (entry N has higher VA than
> > entry N+1) is because the code was easier to write that way, but
> > you'll know better than me ;-)
> > 
> 
> There were two reasons:
> 
> 1. The code was easier to write.
> 2. Windows did it that way.
> 
> Windows apparently broke and was changed due to this feature, too.

So can we do the 1:1 thing again?

I mean, we do create a special pagetable for EFI anyway, we can put in
there whatever we want.

I know, some apple boxes reportedly fail when 1:1 mapping is in use but
we can do the VA mapping as a workaround for them. I.e., have the 1:1
mapping be the default.

Apparently, there's not a single OS or tool which is used by fw writers
to test their brain dumplings. Windoze breakage case-in-point.

Because if there were, we'd simply do what that OS/tool does and be done
with it.

What really makes me climb the walls is when half-cooked, untested fw
hits the wild and we have to support it indefinitely.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/2] arm64/efi: Don't pad between EFI_MEMORY_RUNTIME regions
@ 2015-09-30  8:33                 ` Borislav Petkov
  0 siblings, 0 replies; 34+ messages in thread
From: Borislav Petkov @ 2015-09-30  8:33 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Matt Fleming, Ingo Molnar, Ard Biesheuvel, Thomas Gleixner,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-efi-u79uwXL29TY76Z2rM5mHXA, Leif Lindholm, Catalin Marinas,
	Will Deacon, stable-u79uwXL29TY76Z2rM5mHXA, Matt Fleming,
	Mark Rutland, Mark Salter, Linus Torvalds, Andrew Morton,
	Andy Lutomirski, Denys Vlasenko, Brian Gerst

On Tue, Sep 29, 2015 at 05:56:07PM -0700, H. Peter Anvin wrote:
> On 09/29/2015 07:36 AM, Matt Fleming wrote:
> >  
> > That's a pretty good summary for x86. I think specifically the reason
> > we map the EFI memmap entries "backwards" (entry N has higher VA than
> > entry N+1) is because the code was easier to write that way, but
> > you'll know better than me ;-)
> > 
> 
> There were two reasons:
> 
> 1. The code was easier to write.
> 2. Windows did it that way.
> 
> Windows apparently broke and was changed due to this feature, too.

So can we do the 1:1 thing again?

I mean, we do create a special pagetable for EFI anyway, we can put in
there whatever we want.

I know, some apple boxes reportedly fail when 1:1 mapping is in use but
we can do the VA mapping as a workaround for them. I.e., have the 1:1
mapping be the default.

Apparently, there's not a single OS or tool which is used by fw writers
to test their brain dumplings. Windoze breakage case-in-point.

Because if there were, we'd simply do what that OS/tool does and be done
with it.

What really makes me climb the walls is when half-cooked, untested fw
hits the wild and we have to support it indefinitely.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/2] arm64/efi: Don't pad between EFI_MEMORY_RUNTIME regions
@ 2015-09-30  4:24               ` Ard Biesheuvel
  0 siblings, 0 replies; 34+ messages in thread
From: Ard Biesheuvel @ 2015-09-30  4:24 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: H. Peter Anvin, Ingo Molnar, Matt Fleming, Thomas Gleixner,
	linux-kernel, linux-efi, Leif Lindholm, Catalin Marinas,
	Will Deacon, stable, Matt Fleming, Mark Rutland, Mark Salter,
	Linus Torvalds, Andrew Morton, Andy Lutomirski, Borislav Petkov,
	Denys Vlasenko, Brian Gerst

On 30 September 2015 at 03:16, Andy Lutomirski <luto@amacapital.net> wrote:
> On Tue, Sep 29, 2015 at 6:03 PM, H. Peter Anvin <hpa@zytor.com> wrote:
>> On 09/27/2015 12:06 AM, Ingo Molnar wrote:
>>>
>>> * Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
>>>
>>>>> If we allocate the EFI runtime as a single virtual memory block then issues
>>>>> like rounding between sections does not even come up as a problem: we map the
>>>>> original offsets and sizes byte by byte.
>>>>
>>>> Well, by that reasoning, we should not call SetVirtualAddressMap() in the first
>>>> place, and just use the 1:1 mapping UEFI uses natively. This is more than
>>>> feasible on arm64, and I actually fought hard against using
>>>> SetVirtualAddressMap() at all, but I was overruled by others. I think this is
>>>> also trivially possible on X64, since the 1:1 mapping is already active
>>>> alongside the VA mapping.
>>>
>>> Could we please re-list all the arguments pro and contra of 1:1 physical mappings,
>>> in a post that also explains the background so that more people can chime in, not
>>> just people versed in EFI internals? It's very much possible that a bad decision
>>> was made.
>>>
>>
>> Pro: by far the sanest way to map the UEFI tables.
>> Con: doesn't actually work (breaks on several known platforms.)
>
> Can we at least do 1:1 without an offset on arm64?  Given that Linux
> seems to be more of a reference on arm64 than Windows is, maybe that
> would give everyone something vaguely sane to work with.
>

Yes, as I mentioned before in this thread, on arm64 this is very much
feasible, and it was my strong preference all along. However, the
arguments made by others that outweighed my preference were
(a) x86 uses it
(b) if we don't use it now, we will never be able to start using it
later since it will undoubtedly be broken in /some/ implementation in
the field.

As I also mentioned, the only minor complication is that arm64's VA
space may be configured to be smaller than the physical base of DRAM,
but I already had to address that for the boot ID map and KVM as well.

I will cook up a patch later today.

-- 
Ard.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/2] arm64/efi: Don't pad between EFI_MEMORY_RUNTIME regions
@ 2015-09-30  4:24               ` Ard Biesheuvel
  0 siblings, 0 replies; 34+ messages in thread
From: Ard Biesheuvel @ 2015-09-30  4:24 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: H. Peter Anvin, Ingo Molnar, Matt Fleming, Thomas Gleixner,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-efi-u79uwXL29TY76Z2rM5mHXA, Leif Lindholm, Catalin Marinas,
	Will Deacon, stable-u79uwXL29TY76Z2rM5mHXA, Matt Fleming,
	Mark Rutland, Mark Salter, Linus Torvalds, Andrew Morton,
	Andy Lutomirski, Borislav Petkov, Denys Vlasenko, Brian Gerst

On 30 September 2015 at 03:16, Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org> wrote:
> On Tue, Sep 29, 2015 at 6:03 PM, H. Peter Anvin <hpa-YMNOUZJC4hwAvxtiuMwx3w@public.gmane.org> wrote:
>> On 09/27/2015 12:06 AM, Ingo Molnar wrote:
>>>
>>> * Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org> wrote:
>>>
>>>>> If we allocate the EFI runtime as a single virtual memory block then issues
>>>>> like rounding between sections does not even come up as a problem: we map the
>>>>> original offsets and sizes byte by byte.
>>>>
>>>> Well, by that reasoning, we should not call SetVirtualAddressMap() in the first
>>>> place, and just use the 1:1 mapping UEFI uses natively. This is more than
>>>> feasible on arm64, and I actually fought hard against using
>>>> SetVirtualAddressMap() at all, but I was overruled by others. I think this is
>>>> also trivially possible on X64, since the 1:1 mapping is already active
>>>> alongside the VA mapping.
>>>
>>> Could we please re-list all the arguments pro and contra of 1:1 physical mappings,
>>> in a post that also explains the background so that more people can chime in, not
>>> just people versed in EFI internals? It's very much possible that a bad decision
>>> was made.
>>>
>>
>> Pro: by far the sanest way to map the UEFI tables.
>> Con: doesn't actually work (breaks on several known platforms.)
>
> Can we at least do 1:1 without an offset on arm64?  Given that Linux
> seems to be more of a reference on arm64 than Windows is, maybe that
> would give everyone something vaguely sane to work with.
>

Yes, as I mentioned before in this thread, on arm64 this is very much
feasible, and it was my strong preference all along. However, the
arguments made by others that outweighed my preference were
(a) x86 uses it
(b) if we don't use it now, we will never be able to start using it
later since it will undoubtedly be broken in /some/ implementation in
the field.

As I also mentioned, the only minor complication is that arm64's VA
space may be configured to be smaller than the physical base of DRAM,
but I already had to address that for the boot ID map and KVM as well.

I will cook up a patch later today.

-- 
Ard.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/2] arm64/efi: Don't pad between EFI_MEMORY_RUNTIME regions
  2015-09-30  1:16           ` Andy Lutomirski
@ 2015-09-30  1:19             ` H. Peter Anvin
  2015-09-30  4:24               ` Ard Biesheuvel
  1 sibling, 0 replies; 34+ messages in thread
From: H. Peter Anvin @ 2015-09-30  1:19 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Ingo Molnar, Ard Biesheuvel, Matt Fleming, Thomas Gleixner,
	linux-kernel, linux-efi, Leif Lindholm, Catalin Marinas,
	Will Deacon, stable, Matt Fleming, Mark Rutland, Mark Salter,
	Linus Torvalds, Andrew Morton, Andy Lutomirski, Borislav Petkov,
	Denys Vlasenko, Brian Gerst

On 09/29/2015 06:16 PM, Andy Lutomirski wrote:
> 
> Can we at least do 1:1 without an offset on arm64?  Given that Linux
> seems to be more of a reference on arm64 than Windows is, maybe that
> would give everyone something vaguely sane to work with.
> 

I have no idea.  That's a question for the ARM folks.

	-hpa



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/2] arm64/efi: Don't pad between EFI_MEMORY_RUNTIME regions
  2015-09-30  1:03         ` H. Peter Anvin
@ 2015-09-30  1:16           ` Andy Lutomirski
  2015-09-30  1:19             ` H. Peter Anvin
  2015-09-30  4:24               ` Ard Biesheuvel
  2015-10-01 10:44           ` Ingo Molnar
  1 sibling, 2 replies; 34+ messages in thread
From: Andy Lutomirski @ 2015-09-30  1:16 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Ingo Molnar, Ard Biesheuvel, Matt Fleming, Thomas Gleixner,
	linux-kernel, linux-efi, Leif Lindholm, Catalin Marinas,
	Will Deacon, stable, Matt Fleming, Mark Rutland, Mark Salter,
	Linus Torvalds, Andrew Morton, Andy Lutomirski, Borislav Petkov,
	Denys Vlasenko, Brian Gerst

On Tue, Sep 29, 2015 at 6:03 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 09/27/2015 12:06 AM, Ingo Molnar wrote:
>>
>> * Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
>>
>>>> If we allocate the EFI runtime as a single virtual memory block then issues
>>>> like rounding between sections does not even come up as a problem: we map the
>>>> original offsets and sizes byte by byte.
>>>
>>> Well, by that reasoning, we should not call SetVirtualAddressMap() in the first
>>> place, and just use the 1:1 mapping UEFI uses natively. This is more than
>>> feasible on arm64, and I actually fought hard against using
>>> SetVirtualAddressMap() at all, but I was overruled by others. I think this is
>>> also trivially possible on X64, since the 1:1 mapping is already active
>>> alongside the VA mapping.
>>
>> Could we please re-list all the arguments pro and contra of 1:1 physical mappings,
>> in a post that also explains the background so that more people can chime in, not
>> just people versed in EFI internals? It's very much possible that a bad decision
>> was made.
>>
>
> Pro: by far the sanest way to map the UEFI tables.
> Con: doesn't actually work (breaks on several known platforms.)

Can we at least do 1:1 without an offset on arm64?  Given that Linux
seems to be more of a reference on arm64 than Windows is, maybe that
would give everyone something vaguely sane to work with.

--Andy

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/2] arm64/efi: Don't pad between EFI_MEMORY_RUNTIME regions
  2015-09-27  7:06         ` Ingo Molnar
  (?)
  (?)
@ 2015-09-30  1:03         ` H. Peter Anvin
  2015-09-30  1:16           ` Andy Lutomirski
  2015-10-01 10:44           ` Ingo Molnar
  -1 siblings, 2 replies; 34+ messages in thread
From: H. Peter Anvin @ 2015-09-30  1:03 UTC (permalink / raw)
  To: Ingo Molnar, Ard Biesheuvel
  Cc: Matt Fleming, Thomas Gleixner, linux-kernel, linux-efi,
	Leif Lindholm, Catalin Marinas, Will Deacon, stable,
	Matt Fleming, Mark Rutland, Mark Salter, Linus Torvalds,
	Andrew Morton, Andy Lutomirski, Borislav Petkov, Denys Vlasenko,
	Brian Gerst

On 09/27/2015 12:06 AM, Ingo Molnar wrote:
> 
> * Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> 
>>> If we allocate the EFI runtime as a single virtual memory block then issues 
>>> like rounding between sections does not even come up as a problem: we map the 
>>> original offsets and sizes byte by byte.
>>
>> Well, by that reasoning, we should not call SetVirtualAddressMap() in the first 
>> place, and just use the 1:1 mapping UEFI uses natively. This is more than 
>> feasible on arm64, and I actually fought hard against using 
>> SetVirtualAddressMap() at all, but I was overruled by others. I think this is 
>> also trivially possible on X64, since the 1:1 mapping is already active 
>> alongside the VA mapping.
> 
> Could we please re-list all the arguments pro and contra of 1:1 physical mappings, 
> in a post that also explains the background so that more people can chime in, not 
> just people versed in EFI internals? It's very much possible that a bad decision 
> was made.
> 

Pro: by far the sanest way to map the UEFI tables.
Con: doesn't actually work (breaks on several known platforms.)

	-hpa



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/2] arm64/efi: Don't pad between EFI_MEMORY_RUNTIME regions
@ 2015-09-30  0:56               ` H. Peter Anvin
  0 siblings, 0 replies; 34+ messages in thread
From: H. Peter Anvin @ 2015-09-30  0:56 UTC (permalink / raw)
  To: Matt Fleming, Borislav Petkov
  Cc: Ingo Molnar, Ard Biesheuvel, Thomas Gleixner, linux-kernel,
	linux-efi, Leif Lindholm, Catalin Marinas, Will Deacon, stable,
	Matt Fleming, Mark Rutland, Mark Salter, Linus Torvalds,
	Andrew Morton, Andy Lutomirski, Denys Vlasenko, Brian Gerst

On 09/29/2015 07:36 AM, Matt Fleming wrote:
>  
> That's a pretty good summary for x86. I think specifically the reason
> we map the EFI memmap entries "backwards" (entry N has higher VA than
> entry N+1) is because the code was easier to write that way, but
> you'll know better than me ;-)
> 

There were two reasons:

1. The code was easier to write.
2. Windows did it that way.

Windows apparently broke and was changed due to this feature, too.

	-hpa



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/2] arm64/efi: Don't pad between EFI_MEMORY_RUNTIME regions
@ 2015-09-30  0:56               ` H. Peter Anvin
  0 siblings, 0 replies; 34+ messages in thread
From: H. Peter Anvin @ 2015-09-30  0:56 UTC (permalink / raw)
  To: Matt Fleming, Borislav Petkov
  Cc: Ingo Molnar, Ard Biesheuvel, Thomas Gleixner,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-efi-u79uwXL29TY76Z2rM5mHXA, Leif Lindholm, Catalin Marinas,
	Will Deacon, stable-u79uwXL29TY76Z2rM5mHXA, Matt Fleming,
	Mark Rutland, Mark Salter, Linus Torvalds, Andrew Morton,
	Andy Lutomirski, Denys Vlasenko, Brian Gerst

On 09/29/2015 07:36 AM, Matt Fleming wrote:
>  
> That's a pretty good summary for x86. I think specifically the reason
> we map the EFI memmap entries "backwards" (entry N has higher VA than
> entry N+1) is because the code was easier to write that way, but
> you'll know better than me ;-)
> 

There were two reasons:

1. The code was easier to write.
2. Windows did it that way.

Windows apparently broke and was changed due to this feature, too.

	-hpa

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/2] arm64/efi: Don't pad between EFI_MEMORY_RUNTIME regions
@ 2015-09-29 14:36             ` Matt Fleming
  0 siblings, 0 replies; 34+ messages in thread
From: Matt Fleming @ 2015-09-29 14:36 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ingo Molnar, Ard Biesheuvel, Thomas Gleixner, H. Peter Anvin,
	linux-kernel, linux-efi, Leif Lindholm, Catalin Marinas,
	Will Deacon, stable, Matt Fleming, Mark Rutland, Mark Salter,
	Linus Torvalds, Andrew Morton, Andy Lutomirski, Denys Vlasenko,
	Brian Gerst

On Sun, 27 Sep, at 12:40:14PM, Borislav Petkov wrote:
> On Sun, Sep 27, 2015 at 09:06:44AM +0200, Ingo Molnar wrote:
> > Could we please re-list all the arguments pro and contra of 1:1 physical mappings, 
> > in a post that also explains the background so that more people can chime in, not 
> > just people versed in EFI internals? It's very much possible that a bad decision 
> > was made.
> 
> The main reason why we did the additional, top-down mapping was kexec
> kernel wanting to use UEFI runtime facilities too and the braindead
> design of SetVirtualAddressMap() being callable only once per system
> boot. So we had to have stable mappings which are valid in the kexec-ed
> kernel too.
> 
> But this was long time ago and I most certainly have forgotten all the
> details.
 
That's a pretty good summary for x86. I think specifically the reason
we map the EFI memmap entries "backwards" (entry N has higher VA than
entry N+1) is because the code was easier to write that way, but
you'll know better than me ;-)

> And now I'm wondering why didn't we do the 1:1 thing and rebuild the
> exact same EFI pagetable in the kexec-ed kernel? Because when we do
> an EFI call, we switch to the special pagetable so why didn't we make
> the kexec-ed kernel rebuild the 1:1 pagetable which it can use for EFI
> calls...
> 
> Hmm, again, I've forgotten a lot of details so I'm sure Matt will come
> in and say "No, you can't do that because..."

I *think* the only reason was the Apple firmware problem where it
explodes if you pass the 1:1 mappings to SetVirtualAddressMap(). And
obviously people do want to use kexec with Apple machines.

It's probably worth revisiting this whole thing from the x86 side.

-- 
Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/2] arm64/efi: Don't pad between EFI_MEMORY_RUNTIME regions
@ 2015-09-29 14:36             ` Matt Fleming
  0 siblings, 0 replies; 34+ messages in thread
From: Matt Fleming @ 2015-09-29 14:36 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ingo Molnar, Ard Biesheuvel, Thomas Gleixner, H. Peter Anvin,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-efi-u79uwXL29TY76Z2rM5mHXA, Leif Lindholm, Catalin Marinas,
	Will Deacon, stable-u79uwXL29TY76Z2rM5mHXA, Matt Fleming,
	Mark Rutland, Mark Salter, Linus Torvalds, Andrew Morton,
	Andy Lutomirski, Denys Vlasenko, Brian Gerst

On Sun, 27 Sep, at 12:40:14PM, Borislav Petkov wrote:
> On Sun, Sep 27, 2015 at 09:06:44AM +0200, Ingo Molnar wrote:
> > Could we please re-list all the arguments pro and contra of 1:1 physical mappings, 
> > in a post that also explains the background so that more people can chime in, not 
> > just people versed in EFI internals? It's very much possible that a bad decision 
> > was made.
> 
> The main reason why we did the additional, top-down mapping was kexec
> kernel wanting to use UEFI runtime facilities too and the braindead
> design of SetVirtualAddressMap() being callable only once per system
> boot. So we had to have stable mappings which are valid in the kexec-ed
> kernel too.
> 
> But this was long time ago and I most certainly have forgotten all the
> details.
 
That's a pretty good summary for x86. I think specifically the reason
we map the EFI memmap entries "backwards" (entry N has higher VA than
entry N+1) is because the code was easier to write that way, but
you'll know better than me ;-)

> And now I'm wondering why didn't we do the 1:1 thing and rebuild the
> exact same EFI pagetable in the kexec-ed kernel? Because when we do
> an EFI call, we switch to the special pagetable so why didn't we make
> the kexec-ed kernel rebuild the 1:1 pagetable which it can use for EFI
> calls...
> 
> Hmm, again, I've forgotten a lot of details so I'm sure Matt will come
> in and say "No, you can't do that because..."

I *think* the only reason was the Apple firmware problem where it
explodes if you pass the 1:1 mappings to SetVirtualAddressMap(). And
obviously people do want to use kexec with Apple machines.

It's probably worth revisiting this whole thing from the x86 side.

-- 
Matt Fleming, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/2] arm64/efi: Don't pad between EFI_MEMORY_RUNTIME regions
  2015-09-29  9:31           ` Dave Young
@ 2015-09-29 10:24             ` Borislav Petkov
  0 siblings, 0 replies; 34+ messages in thread
From: Borislav Petkov @ 2015-09-29 10:24 UTC (permalink / raw)
  To: Dave Young
  Cc: Ingo Molnar, Ard Biesheuvel, Matt Fleming, Thomas Gleixner,
	H. Peter Anvin, linux-kernel, linux-efi, Leif Lindholm,
	Catalin Marinas, Will Deacon, stable, Matt Fleming, Mark Rutland,
	Mark Salter, Linus Torvalds, Andrew Morton, Andy Lutomirski,
	Denys Vlasenko, Brian Gerst

On Tue, Sep 29, 2015 at 05:31:00PM +0800, Dave Young wrote:
> http://permalink.gmane.org/gmane.linux.kernel.efi/822
> 
> And more replies here:
> http://comments.gmane.org/gmane.linux.kernel.efi/820

Whatever we did think back then, it all is moot as long as some UEFI
vendors do funky crap or the spec has documentation holes.

So we need a reality check and just do exactly what the OS does with
which UEFI writers are testing with. Be it windoze, be it some open
firmware testing tool or whatever. Provided there's even a tool with
which UEFI validation is done and it is used by everybody...

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/2] arm64/efi: Don't pad between EFI_MEMORY_RUNTIME regions
  2015-09-27 10:40         ` Borislav Petkov
  2015-09-28  6:20           ` Ingo Molnar
@ 2015-09-29  9:31           ` Dave Young
  2015-09-29 10:24             ` Borislav Petkov
  2015-09-29 14:36             ` Matt Fleming
  2 siblings, 1 reply; 34+ messages in thread
From: Dave Young @ 2015-09-29  9:31 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ingo Molnar, Ard Biesheuvel, Matt Fleming, Thomas Gleixner,
	H. Peter Anvin, linux-kernel, linux-efi, Leif Lindholm,
	Catalin Marinas, Will Deacon, stable, Matt Fleming, Mark Rutland,
	Mark Salter, Linus Torvalds, Andrew Morton, Andy Lutomirski,
	Denys Vlasenko, Brian Gerst

On 09/27/15 at 12:40pm, Borislav Petkov wrote:
> On Sun, Sep 27, 2015 at 09:06:44AM +0200, Ingo Molnar wrote:
> > Could we please re-list all the arguments pro and contra of 1:1 physical mappings, 
> > in a post that also explains the background so that more people can chime in, not 
> > just people versed in EFI internals? It's very much possible that a bad decision 
> > was made.
> 
> The main reason why we did the additional, top-down mapping was kexec
> kernel wanting to use UEFI runtime facilities too and the braindead
> design of SetVirtualAddressMap() being callable only once per system
> boot. So we had to have stable mappings which are valid in the kexec-ed
> kernel too.
> 
> But this was long time ago and I most certainly have forgotten all the
> details.
> 
> And now I'm wondering why didn't we do the 1:1 thing and rebuild the
> exact same EFI pagetable in the kexec-ed kernel? Because when we do
> an EFI call, we switch to the special pagetable so why didn't we make
> the kexec-ed kernel rebuild the 1:1 pagetable which it can use for EFI
> calls...
> 
> Hmm, again, I've forgotten a lot of details so I'm sure Matt will come
> in and say "No, you can't do that because..."
> 

http://permalink.gmane.org/gmane.linux.kernel.efi/822

And more replies here:
http://comments.gmane.org/gmane.linux.kernel.efi/820

Thanks
Dave

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/2] arm64/efi: Don't pad between EFI_MEMORY_RUNTIME regions
  2015-09-27 10:40         ` Borislav Petkov
@ 2015-09-28  6:20           ` Ingo Molnar
  2015-09-29  9:31           ` Dave Young
  2015-09-29 14:36             ` Matt Fleming
  2 siblings, 0 replies; 34+ messages in thread
From: Ingo Molnar @ 2015-09-28  6:20 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Ard Biesheuvel, Matt Fleming, Thomas Gleixner, H. Peter Anvin,
	linux-kernel, linux-efi, Leif Lindholm, Catalin Marinas,
	Will Deacon, stable, Matt Fleming, Mark Rutland, Mark Salter,
	Linus Torvalds, Andrew Morton, Andy Lutomirski, Denys Vlasenko,
	Brian Gerst


* Borislav Petkov <bp@alien8.de> wrote:

> On Sun, Sep 27, 2015 at 09:06:44AM +0200, Ingo Molnar wrote:
>
> > Could we please re-list all the arguments pro and contra of 1:1 physical 
> > mappings, in a post that also explains the background so that more people can 
> > chime in, not just people versed in EFI internals? It's very much possible 
> > that a bad decision was made.
> 
> The main reason why we did the additional, top-down mapping was kexec kernel 
> wanting to use UEFI runtime facilities too and the braindead design of 
> SetVirtualAddressMap() being callable only once per system boot. So we had to 
> have stable mappings which are valid in the kexec-ed kernel too.
> 
> But this was long time ago and I most certainly have forgotten all the details.
> 
> And now I'm wondering why didn't we do the 1:1 thing and rebuild the exact same 
> EFI pagetable in the kexec-ed kernel? Because when we do an EFI call, we switch 
> to the special pagetable so why didn't we make the kexec-ed kernel rebuild the 
> 1:1 pagetable which it can use for EFI calls...

Yeah.

> Hmm, again, I've forgotten a lot of details so I'm sure Matt will come in and 
> say "No, you can't do that because..."

Would be nice to re-examine all this.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/2] arm64/efi: Don't pad between EFI_MEMORY_RUNTIME regions
  2015-09-27  7:06         ` Ingo Molnar
  (?)
@ 2015-09-27 10:40         ` Borislav Petkov
  2015-09-28  6:20           ` Ingo Molnar
                             ` (2 more replies)
  -1 siblings, 3 replies; 34+ messages in thread
From: Borislav Petkov @ 2015-09-27 10:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Ard Biesheuvel, Matt Fleming, Thomas Gleixner, H. Peter Anvin,
	linux-kernel, linux-efi, Leif Lindholm, Catalin Marinas,
	Will Deacon, stable, Matt Fleming, Mark Rutland, Mark Salter,
	Linus Torvalds, Andrew Morton, Andy Lutomirski, Denys Vlasenko,
	Brian Gerst

On Sun, Sep 27, 2015 at 09:06:44AM +0200, Ingo Molnar wrote:
> Could we please re-list all the arguments pro and contra of 1:1 physical mappings, 
> in a post that also explains the background so that more people can chime in, not 
> just people versed in EFI internals? It's very much possible that a bad decision 
> was made.

The main reason why we did the additional, top-down mapping was kexec
kernel wanting to use UEFI runtime facilities too and the braindead
design of SetVirtualAddressMap() being callable only once per system
boot. So we had to have stable mappings which are valid in the kexec-ed
kernel too.

But this was long time ago and I most certainly have forgotten all the
details.

And now I'm wondering why didn't we do the 1:1 thing and rebuild the
exact same EFI pagetable in the kexec-ed kernel? Because when we do
an EFI call, we switch to the special pagetable so why didn't we make
the kexec-ed kernel rebuild the 1:1 pagetable which it can use for EFI
calls...

Hmm, again, I've forgotten a lot of details so I'm sure Matt will come
in and say "No, you can't do that because..."

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/2] arm64/efi: Don't pad between EFI_MEMORY_RUNTIME regions
@ 2015-09-27  7:06         ` Ingo Molnar
  0 siblings, 0 replies; 34+ messages in thread
From: Ingo Molnar @ 2015-09-27  7:06 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Matt Fleming, Thomas Gleixner, H. Peter Anvin, linux-kernel,
	linux-efi, Leif Lindholm, Catalin Marinas, Will Deacon, stable,
	Matt Fleming, Mark Rutland, Mark Salter, Linus Torvalds,
	Andrew Morton, Andy Lutomirski, Borislav Petkov, Denys Vlasenko,
	Brian Gerst


* Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:

> > If we allocate the EFI runtime as a single virtual memory block then issues 
> > like rounding between sections does not even come up as a problem: we map the 
> > original offsets and sizes byte by byte.
> 
> Well, by that reasoning, we should not call SetVirtualAddressMap() in the first 
> place, and just use the 1:1 mapping UEFI uses natively. This is more than 
> feasible on arm64, and I actually fought hard against using 
> SetVirtualAddressMap() at all, but I was overruled by others. I think this is 
> also trivially possible on X64, since the 1:1 mapping is already active 
> alongside the VA mapping.

Could we please re-list all the arguments pro and contra of 1:1 physical mappings, 
in a post that also explains the background so that more people can chime in, not 
just people versed in EFI internals? It's very much possible that a bad decision 
was made.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/2] arm64/efi: Don't pad between EFI_MEMORY_RUNTIME regions
@ 2015-09-27  7:06         ` Ingo Molnar
  0 siblings, 0 replies; 34+ messages in thread
From: Ingo Molnar @ 2015-09-27  7:06 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Matt Fleming, Thomas Gleixner, H. Peter Anvin,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-efi-u79uwXL29TY76Z2rM5mHXA, Leif Lindholm, Catalin Marinas,
	Will Deacon, stable-u79uwXL29TY76Z2rM5mHXA, Matt Fleming,
	Mark Rutland, Mark Salter, Linus Torvalds, Andrew Morton,
	Andy Lutomirski, Borislav Petkov, Denys Vlasenko, Brian Gerst


* Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org> wrote:

> > If we allocate the EFI runtime as a single virtual memory block then issues 
> > like rounding between sections does not even come up as a problem: we map the 
> > original offsets and sizes byte by byte.
> 
> Well, by that reasoning, we should not call SetVirtualAddressMap() in the first 
> place, and just use the 1:1 mapping UEFI uses natively. This is more than 
> feasible on arm64, and I actually fought hard against using 
> SetVirtualAddressMap() at all, but I was overruled by others. I think this is 
> also trivially possible on X64, since the 1:1 mapping is already active 
> alongside the VA mapping.

Could we please re-list all the arguments pro and contra of 1:1 physical mappings, 
in a post that also explains the background so that more people can chime in, not 
just people versed in EFI internals? It's very much possible that a bad decision 
was made.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/2] arm64/efi: Don't pad between EFI_MEMORY_RUNTIME regions
@ 2015-09-26  7:08       ` Ard Biesheuvel
  0 siblings, 0 replies; 34+ messages in thread
From: Ard Biesheuvel @ 2015-09-26  7:08 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Matt Fleming, Thomas Gleixner, H. Peter Anvin, linux-kernel,
	linux-efi, Leif Lindholm, Catalin Marinas, Will Deacon, stable,
	Matt Fleming, Mark Rutland, Mark Salter, Linus Torvalds,
	Andrew Morton, Andy Lutomirski, Borislav Petkov, Denys Vlasenko,
	Brian Gerst

On 25 September 2015 at 23:01, Ingo Molnar <mingo@kernel.org> wrote:
>
> * Matt Fleming <matt@codeblueprint.co.uk> wrote:
>
>> From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
>>
>> The new Properties Table feature introduced in UEFIv2.5 may split
>> memory regions that cover PE/COFF memory images into separate code
>> and data regions. Since these regions only differ in the type (runtime
>> code vs runtime data) and the permission bits, but not in the memory
>> type attributes (UC/WC/WT/WB), the spec does not require them to be
>> aligned to 64 KB.
>>
>> Since the relative offset of PE/COFF .text and .data segments cannot
>> be changed on the fly, this means that we can no longer pad out those
>> regions to be mappable using 64 KB pages.
>> Unfortunately, there is no annotation in the UEFI memory map that
>> identifies data regions that were split off from a code region, so we
>> must apply this logic to all adjacent runtime regions whose attributes
>> only differ in the permission bits.
>>
>> So instead of rounding each memory region to 64 KB alignment at both
>> ends, only round down regions that are not directly preceded by another
>> runtime region with the same type attributes. Since the UEFI spec does
>> not mandate that the memory map be sorted, this means we also need to
>> sort it first.
>
> So I think this is fundamentally wrong as well, similarly to the related x86 fix.
>
> I think for compatibility reasons the whole 'EFI runtime image' should be mapped
> in a single go, as closely matching the EFI layouts and offsets as possible. We
> are not talking about gigabytes here, right?
>

As I explained in the other thread, this is really not necessary, and
never has been until the firmware started splitting up PE/COFF images
into several sections each. As long as we keep those PE/COFF images
together, everything will work as before, and the only complication is
that the memory map does not contain any clues about which regions
belong to a single PE/COFF image, so we need to keep all adjacent
EFI_MEMORY_RUNTIME regions adjacent in the VA mapping.

> Even if technically they are 'separate sections', the x86 bug shows that they
> aren't. So we should not pretend so on the Linux side either and we should not
> tear them apart (and then work hard to preserve the interdependencies, some
> declared, some hidden!).
>

This is about relocations and interdependencies at the symbol level,
and such interdependencies only exist internally inside PE/COFF
images.

> If we allocate the EFI runtime as a single virtual memory block then issues like
> rounding between sections does not even come up as a problem: we map the original
> offsets and sizes byte by byte.
>

Well, by that reasoning, we should not call SetVirtualAddressMap() in
the first place, and just use the 1:1 mapping UEFI uses natively. This
is more than feasible on arm64, and I actually fought hard against
using SetVirtualAddressMap() at all, but I was overruled by others. I
think this is also trivially possible on X64, since the 1:1 mapping is
already active alongside the VA mapping.

-- 
Ard.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/2] arm64/efi: Don't pad between EFI_MEMORY_RUNTIME regions
@ 2015-09-26  7:08       ` Ard Biesheuvel
  0 siblings, 0 replies; 34+ messages in thread
From: Ard Biesheuvel @ 2015-09-26  7:08 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Matt Fleming, Thomas Gleixner, H. Peter Anvin,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-efi-u79uwXL29TY76Z2rM5mHXA, Leif Lindholm, Catalin Marinas,
	Will Deacon, stable-u79uwXL29TY76Z2rM5mHXA, Matt Fleming,
	Mark Rutland, Mark Salter, Linus Torvalds, Andrew Morton,
	Andy Lutomirski, Borislav Petkov, Denys Vlasenko, Brian Gerst

On 25 September 2015 at 23:01, Ingo Molnar <mingo-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org> wrote:
>
> * Matt Fleming <matt-mF/unelCI9GS6iBeEJttW/XRex20P6io@public.gmane.org> wrote:
>
>> From: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
>>
>> The new Properties Table feature introduced in UEFIv2.5 may split
>> memory regions that cover PE/COFF memory images into separate code
>> and data regions. Since these regions only differ in the type (runtime
>> code vs runtime data) and the permission bits, but not in the memory
>> type attributes (UC/WC/WT/WB), the spec does not require them to be
>> aligned to 64 KB.
>>
>> Since the relative offset of PE/COFF .text and .data segments cannot
>> be changed on the fly, this means that we can no longer pad out those
>> regions to be mappable using 64 KB pages.
>> Unfortunately, there is no annotation in the UEFI memory map that
>> identifies data regions that were split off from a code region, so we
>> must apply this logic to all adjacent runtime regions whose attributes
>> only differ in the permission bits.
>>
>> So instead of rounding each memory region to 64 KB alignment at both
>> ends, only round down regions that are not directly preceded by another
>> runtime region with the same type attributes. Since the UEFI spec does
>> not mandate that the memory map be sorted, this means we also need to
>> sort it first.
>
> So I think this is fundamentally wrong as well, similarly to the related x86 fix.
>
> I think for compatibility reasons the whole 'EFI runtime image' should be mapped
> in a single go, as closely matching the EFI layouts and offsets as possible. We
> are not talking about gigabytes here, right?
>

As I explained in the other thread, this is really not necessary, and
never has been until the firmware started splitting up PE/COFF images
into several sections each. As long as we keep those PE/COFF images
together, everything will work as before, and the only complication is
that the memory map does not contain any clues about which regions
belong to a single PE/COFF image, so we need to keep all adjacent
EFI_MEMORY_RUNTIME regions adjacent in the VA mapping.

> Even if technically they are 'separate sections', the x86 bug shows that they
> aren't. So we should not pretend so on the Linux side either and we should not
> tear them apart (and then work hard to preserve the interdependencies, some
> declared, some hidden!).
>

This is about relocations and interdependencies at the symbol level,
and such interdependencies only exist internally inside PE/COFF
images.

> If we allocate the EFI runtime as a single virtual memory block then issues like
> rounding between sections does not even come up as a problem: we map the original
> offsets and sizes byte by byte.
>

Well, by that reasoning, we should not call SetVirtualAddressMap() in
the first place, and just use the 1:1 mapping UEFI uses natively. This
is more than feasible on arm64, and I actually fought hard against
using SetVirtualAddressMap() at all, but I was overruled by others. I
think this is also trivially possible on X64, since the 1:1 mapping is
already active alongside the VA mapping.

-- 
Ard.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/2] arm64/efi: Don't pad between EFI_MEMORY_RUNTIME regions
@ 2015-09-26  6:01     ` Ingo Molnar
  0 siblings, 0 replies; 34+ messages in thread
From: Ingo Molnar @ 2015-09-26  6:01 UTC (permalink / raw)
  To: Matt Fleming
  Cc: Thomas Gleixner, H. Peter Anvin, Ard Biesheuvel, linux-kernel,
	linux-efi, Leif Lindholm, Catalin Marinas, Will Deacon, stable,
	Matt Fleming, Mark Rutland, Mark Salter, Linus Torvalds,
	Andrew Morton, Andy Lutomirski, Borislav Petkov, Denys Vlasenko,
	Brian Gerst


* Matt Fleming <matt@codeblueprint.co.uk> wrote:

> From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> 
> The new Properties Table feature introduced in UEFIv2.5 may split
> memory regions that cover PE/COFF memory images into separate code
> and data regions. Since these regions only differ in the type (runtime
> code vs runtime data) and the permission bits, but not in the memory
> type attributes (UC/WC/WT/WB), the spec does not require them to be
> aligned to 64 KB.
> 
> Since the relative offset of PE/COFF .text and .data segments cannot
> be changed on the fly, this means that we can no longer pad out those
> regions to be mappable using 64 KB pages.
> Unfortunately, there is no annotation in the UEFI memory map that
> identifies data regions that were split off from a code region, so we
> must apply this logic to all adjacent runtime regions whose attributes
> only differ in the permission bits.
> 
> So instead of rounding each memory region to 64 KB alignment at both
> ends, only round down regions that are not directly preceded by another
> runtime region with the same type attributes. Since the UEFI spec does
> not mandate that the memory map be sorted, this means we also need to
> sort it first.

So I think this is fundamentally wrong as well, similarly to the related x86 fix.

I think for compatibility reasons the whole 'EFI runtime image' should be mapped 
in a single go, as closely matching the EFI layouts and offsets as possible. We 
are not talking about gigabytes here, right?

Even if technically they are 'separate sections', the x86 bug shows that they 
aren't. So we should not pretend so on the Linux side either and we should not 
tear them apart (and then work hard to preserve the interdependencies, some 
declared, some hidden!).

If we allocate the EFI runtime as a single virtual memory block then issues like 
rounding between sections does not even come up as a problem: we map the original 
offsets and sizes byte by byte.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 2/2] arm64/efi: Don't pad between EFI_MEMORY_RUNTIME regions
@ 2015-09-26  6:01     ` Ingo Molnar
  0 siblings, 0 replies; 34+ messages in thread
From: Ingo Molnar @ 2015-09-26  6:01 UTC (permalink / raw)
  To: Matt Fleming
  Cc: Thomas Gleixner, H. Peter Anvin, Ard Biesheuvel,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-efi-u79uwXL29TY76Z2rM5mHXA, Leif Lindholm, Catalin Marinas,
	Will Deacon, stable-u79uwXL29TY76Z2rM5mHXA, Matt Fleming,
	Mark Rutland, Mark Salter, Linus Torvalds, Andrew Morton,
	Andy Lutomirski, Borislav Petkov, Denys Vlasenko, Brian Gerst


* Matt Fleming <matt-mF/unelCI9GS6iBeEJttW/XRex20P6io@public.gmane.org> wrote:

> From: Ard Biesheuvel <ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
> 
> The new Properties Table feature introduced in UEFIv2.5 may split
> memory regions that cover PE/COFF memory images into separate code
> and data regions. Since these regions only differ in the type (runtime
> code vs runtime data) and the permission bits, but not in the memory
> type attributes (UC/WC/WT/WB), the spec does not require them to be
> aligned to 64 KB.
> 
> Since the relative offset of PE/COFF .text and .data segments cannot
> be changed on the fly, this means that we can no longer pad out those
> regions to be mappable using 64 KB pages.
> Unfortunately, there is no annotation in the UEFI memory map that
> identifies data regions that were split off from a code region, so we
> must apply this logic to all adjacent runtime regions whose attributes
> only differ in the permission bits.
> 
> So instead of rounding each memory region to 64 KB alignment at both
> ends, only round down regions that are not directly preceded by another
> runtime region with the same type attributes. Since the UEFI spec does
> not mandate that the memory map be sorted, this means we also need to
> sort it first.

So I think this is fundamentally wrong as well, similarly to the related x86 fix.

I think for compatibility reasons the whole 'EFI runtime image' should be mapped 
in a single go, as closely matching the EFI layouts and offsets as possible. We 
are not talking about gigabytes here, right?

Even if technically they are 'separate sections', the x86 bug shows that they 
aren't. So we should not pretend so on the Linux side either and we should not 
tear them apart (and then work hard to preserve the interdependencies, some 
declared, some hidden!).

If we allocate the EFI runtime as a single virtual memory block then issues like 
rounding between sections does not even come up as a problem: we map the original 
offsets and sizes byte by byte.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 2/2] arm64/efi: Don't pad between EFI_MEMORY_RUNTIME regions
  2015-09-25 22:02 [GIT PULL 0/2] EFI urgent fixes Matt Fleming
@ 2015-09-25 22:02   ` Matt Fleming
  0 siblings, 0 replies; 34+ messages in thread
From: Matt Fleming @ 2015-09-25 22:02 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin
  Cc: Ard Biesheuvel, linux-kernel, linux-efi, Leif Lindholm,
	Catalin Marinas, Will Deacon, stable, Matt Fleming, Mark Rutland,
	Mark Salter

From: Ard Biesheuvel <ard.biesheuvel@linaro.org>

The new Properties Table feature introduced in UEFIv2.5 may split
memory regions that cover PE/COFF memory images into separate code
and data regions. Since these regions only differ in the type (runtime
code vs runtime data) and the permission bits, but not in the memory
type attributes (UC/WC/WT/WB), the spec does not require them to be
aligned to 64 KB.

Since the relative offset of PE/COFF .text and .data segments cannot
be changed on the fly, this means that we can no longer pad out those
regions to be mappable using 64 KB pages.
Unfortunately, there is no annotation in the UEFI memory map that
identifies data regions that were split off from a code region, so we
must apply this logic to all adjacent runtime regions whose attributes
only differ in the permission bits.

So instead of rounding each memory region to 64 KB alignment at both
ends, only round down regions that are not directly preceded by another
runtime region with the same type attributes. Since the UEFI spec does
not mandate that the memory map be sorted, this means we also need to
sort it first.

Note that this change will result in all EFI_MEMORY_RUNTIME regions whose
start addresses are not aligned to the OS page size to be mapped with
executable permissions (i.e., on kernels compiled with 64 KB pages).
However, since these mappings are only active during the time that UEFI
Runtime Services are being invoked, the window for abuse is rather small.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Mark Salter <msalter@redhat.com>
Reviewed-by: Mark Salter <msalter@redhat.com>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com> [UEFI 2.4 only]
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org> # v4.0+
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
---
 arch/arm64/kernel/efi.c                 |  3 +-
 drivers/firmware/efi/libstub/arm-stub.c | 88 +++++++++++++++++++++++++++------
 2 files changed, 75 insertions(+), 16 deletions(-)

diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
index e8ca6eaedd02..13671a9cf016 100644
--- a/arch/arm64/kernel/efi.c
+++ b/arch/arm64/kernel/efi.c
@@ -258,7 +258,8 @@ static bool __init efi_virtmap_init(void)
 		 */
 		if (!is_normal_ram(md))
 			prot = __pgprot(PROT_DEVICE_nGnRE);
-		else if (md->type == EFI_RUNTIME_SERVICES_CODE)
+		else if (md->type == EFI_RUNTIME_SERVICES_CODE ||
+			 !PAGE_ALIGNED(md->phys_addr))
 			prot = PAGE_KERNEL_EXEC;
 		else
 			prot = PAGE_KERNEL;
diff --git a/drivers/firmware/efi/libstub/arm-stub.c b/drivers/firmware/efi/libstub/arm-stub.c
index e29560e6b40b..950c87f5d279 100644
--- a/drivers/firmware/efi/libstub/arm-stub.c
+++ b/drivers/firmware/efi/libstub/arm-stub.c
@@ -13,6 +13,7 @@
  */
 
 #include <linux/efi.h>
+#include <linux/sort.h>
 #include <asm/efi.h>
 
 #include "efistub.h"
@@ -305,6 +306,44 @@ fail:
  */
 #define EFI_RT_VIRTUAL_BASE	0x40000000
 
+static int cmp_mem_desc(const void *l, const void *r)
+{
+	const efi_memory_desc_t *left = l, *right = r;
+
+	return (left->phys_addr > right->phys_addr) ? 1 : -1;
+}
+
+/*
+ * Returns whether region @left ends exactly where region @right starts,
+ * or false if either argument is NULL.
+ */
+static bool regions_are_adjacent(efi_memory_desc_t *left,
+				 efi_memory_desc_t *right)
+{
+	u64 left_end;
+
+	if (left == NULL || right == NULL)
+		return false;
+
+	left_end = left->phys_addr + left->num_pages * EFI_PAGE_SIZE;
+
+	return left_end == right->phys_addr;
+}
+
+/*
+ * Returns whether region @left and region @right have compatible memory type
+ * mapping attributes, and are both EFI_MEMORY_RUNTIME regions.
+ */
+static bool regions_have_compatible_memory_type_attrs(efi_memory_desc_t *left,
+						      efi_memory_desc_t *right)
+{
+	static const u64 mem_type_mask = EFI_MEMORY_WB | EFI_MEMORY_WT |
+					 EFI_MEMORY_WC | EFI_MEMORY_UC |
+					 EFI_MEMORY_RUNTIME;
+
+	return ((left->attribute ^ right->attribute) & mem_type_mask) == 0;
+}
+
 /*
  * efi_get_virtmap() - create a virtual mapping for the EFI memory map
  *
@@ -317,33 +356,52 @@ void efi_get_virtmap(efi_memory_desc_t *memory_map, unsigned long map_size,
 		     int *count)
 {
 	u64 efi_virt_base = EFI_RT_VIRTUAL_BASE;
-	efi_memory_desc_t *out = runtime_map;
+	efi_memory_desc_t *in, *prev = NULL, *out = runtime_map;
 	int l;
 
-	for (l = 0; l < map_size; l += desc_size) {
-		efi_memory_desc_t *in = (void *)memory_map + l;
+	/*
+	 * To work around potential issues with the Properties Table feature
+	 * introduced in UEFI 2.5, which may split PE/COFF executable images
+	 * in memory into several RuntimeServicesCode and RuntimeServicesData
+	 * regions, we need to preserve the relative offsets between adjacent
+	 * EFI_MEMORY_RUNTIME regions with the same memory type attributes.
+	 * The easiest way to find adjacent regions is to sort the memory map
+	 * before traversing it.
+	 */
+	sort(memory_map, map_size / desc_size, desc_size, cmp_mem_desc, NULL);
+
+	for (l = 0; l < map_size; l += desc_size, prev = in) {
 		u64 paddr, size;
 
+		in = (void *)memory_map + l;
 		if (!(in->attribute & EFI_MEMORY_RUNTIME))
 			continue;
 
+		paddr = in->phys_addr;
+		size = in->num_pages * EFI_PAGE_SIZE;
+
 		/*
 		 * Make the mapping compatible with 64k pages: this allows
 		 * a 4k page size kernel to kexec a 64k page size kernel and
 		 * vice versa.
 		 */
-		paddr = round_down(in->phys_addr, SZ_64K);
-		size = round_up(in->num_pages * EFI_PAGE_SIZE +
-				in->phys_addr - paddr, SZ_64K);
-
-		/*
-		 * Avoid wasting memory on PTEs by choosing a virtual base that
-		 * is compatible with section mappings if this region has the
-		 * appropriate size and physical alignment. (Sections are 2 MB
-		 * on 4k granule kernels)
-		 */
-		if (IS_ALIGNED(in->phys_addr, SZ_2M) && size >= SZ_2M)
-			efi_virt_base = round_up(efi_virt_base, SZ_2M);
+		if (!regions_are_adjacent(prev, in) ||
+		    !regions_have_compatible_memory_type_attrs(prev, in)) {
+
+			paddr = round_down(in->phys_addr, SZ_64K);
+			size += in->phys_addr - paddr;
+
+			/*
+			 * Avoid wasting memory on PTEs by choosing a virtual
+			 * base that is compatible with section mappings if this
+			 * region has the appropriate size and physical
+			 * alignment. (Sections are 2 MB on 4k granule kernels)
+			 */
+			if (IS_ALIGNED(in->phys_addr, SZ_2M) && size >= SZ_2M)
+				efi_virt_base = round_up(efi_virt_base, SZ_2M);
+			else
+				efi_virt_base = round_up(efi_virt_base, SZ_64K);
+		}
 
 		in->virt_addr = efi_virt_base + in->phys_addr - paddr;
 		efi_virt_base += size;
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 2/2] arm64/efi: Don't pad between EFI_MEMORY_RUNTIME regions
@ 2015-09-25 22:02   ` Matt Fleming
  0 siblings, 0 replies; 34+ messages in thread
From: Matt Fleming @ 2015-09-25 22:02 UTC (permalink / raw)
  To: Ingo Molnar, Thomas Gleixner, H. Peter Anvin
  Cc: Ard Biesheuvel, linux-kernel, linux-efi, Leif Lindholm,
	Catalin Marinas, Will Deacon, stable, Matt Fleming, Mark Rutland,
	Mark Salter

From: Ard Biesheuvel <ard.biesheuvel@linaro.org>

The new Properties Table feature introduced in UEFIv2.5 may split
memory regions that cover PE/COFF memory images into separate code
and data regions. Since these regions only differ in the type (runtime
code vs runtime data) and the permission bits, but not in the memory
type attributes (UC/WC/WT/WB), the spec does not require them to be
aligned to 64 KB.

Since the relative offset of PE/COFF .text and .data segments cannot
be changed on the fly, this means that we can no longer pad out those
regions to be mappable using 64 KB pages.
Unfortunately, there is no annotation in the UEFI memory map that
identifies data regions that were split off from a code region, so we
must apply this logic to all adjacent runtime regions whose attributes
only differ in the permission bits.

So instead of rounding each memory region to 64 KB alignment at both
ends, only round down regions that are not directly preceded by another
runtime region with the same type attributes. Since the UEFI spec does
not mandate that the memory map be sorted, this means we also need to
sort it first.

Note that this change will result in all EFI_MEMORY_RUNTIME regions whose
start addresses are not aligned to the OS page size to be mapped with
executable permissions (i.e., on kernels compiled with 64 KB pages).
However, since these mappings are only active during the time that UEFI
Runtime Services are being invoked, the window for abuse is rather small.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Mark Salter <msalter@redhat.com>
Reviewed-by: Mark Salter <msalter@redhat.com>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com> [UEFI 2.4 only]
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org> # v4.0+
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
---
 arch/arm64/kernel/efi.c                 |  3 +-
 drivers/firmware/efi/libstub/arm-stub.c | 88 +++++++++++++++++++++++++++------
 2 files changed, 75 insertions(+), 16 deletions(-)

diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
index e8ca6eaedd02..13671a9cf016 100644
--- a/arch/arm64/kernel/efi.c
+++ b/arch/arm64/kernel/efi.c
@@ -258,7 +258,8 @@ static bool __init efi_virtmap_init(void)
 		 */
 		if (!is_normal_ram(md))
 			prot = __pgprot(PROT_DEVICE_nGnRE);
-		else if (md->type == EFI_RUNTIME_SERVICES_CODE)
+		else if (md->type == EFI_RUNTIME_SERVICES_CODE ||
+			 !PAGE_ALIGNED(md->phys_addr))
 			prot = PAGE_KERNEL_EXEC;
 		else
 			prot = PAGE_KERNEL;
diff --git a/drivers/firmware/efi/libstub/arm-stub.c b/drivers/firmware/efi/libstub/arm-stub.c
index e29560e6b40b..950c87f5d279 100644
--- a/drivers/firmware/efi/libstub/arm-stub.c
+++ b/drivers/firmware/efi/libstub/arm-stub.c
@@ -13,6 +13,7 @@
  */
 
 #include <linux/efi.h>
+#include <linux/sort.h>
 #include <asm/efi.h>
 
 #include "efistub.h"
@@ -305,6 +306,44 @@ fail:
  */
 #define EFI_RT_VIRTUAL_BASE	0x40000000
 
+static int cmp_mem_desc(const void *l, const void *r)
+{
+	const efi_memory_desc_t *left = l, *right = r;
+
+	return (left->phys_addr > right->phys_addr) ? 1 : -1;
+}
+
+/*
+ * Returns whether region @left ends exactly where region @right starts,
+ * or false if either argument is NULL.
+ */
+static bool regions_are_adjacent(efi_memory_desc_t *left,
+				 efi_memory_desc_t *right)
+{
+	u64 left_end;
+
+	if (left == NULL || right == NULL)
+		return false;
+
+	left_end = left->phys_addr + left->num_pages * EFI_PAGE_SIZE;
+
+	return left_end == right->phys_addr;
+}
+
+/*
+ * Returns whether region @left and region @right have compatible memory type
+ * mapping attributes, and are both EFI_MEMORY_RUNTIME regions.
+ */
+static bool regions_have_compatible_memory_type_attrs(efi_memory_desc_t *left,
+						      efi_memory_desc_t *right)
+{
+	static const u64 mem_type_mask = EFI_MEMORY_WB | EFI_MEMORY_WT |
+					 EFI_MEMORY_WC | EFI_MEMORY_UC |
+					 EFI_MEMORY_RUNTIME;
+
+	return ((left->attribute ^ right->attribute) & mem_type_mask) == 0;
+}
+
 /*
  * efi_get_virtmap() - create a virtual mapping for the EFI memory map
  *
@@ -317,33 +356,52 @@ void efi_get_virtmap(efi_memory_desc_t *memory_map, unsigned long map_size,
 		     int *count)
 {
 	u64 efi_virt_base = EFI_RT_VIRTUAL_BASE;
-	efi_memory_desc_t *out = runtime_map;
+	efi_memory_desc_t *in, *prev = NULL, *out = runtime_map;
 	int l;
 
-	for (l = 0; l < map_size; l += desc_size) {
-		efi_memory_desc_t *in = (void *)memory_map + l;
+	/*
+	 * To work around potential issues with the Properties Table feature
+	 * introduced in UEFI 2.5, which may split PE/COFF executable images
+	 * in memory into several RuntimeServicesCode and RuntimeServicesData
+	 * regions, we need to preserve the relative offsets between adjacent
+	 * EFI_MEMORY_RUNTIME regions with the same memory type attributes.
+	 * The easiest way to find adjacent regions is to sort the memory map
+	 * before traversing it.
+	 */
+	sort(memory_map, map_size / desc_size, desc_size, cmp_mem_desc, NULL);
+
+	for (l = 0; l < map_size; l += desc_size, prev = in) {
 		u64 paddr, size;
 
+		in = (void *)memory_map + l;
 		if (!(in->attribute & EFI_MEMORY_RUNTIME))
 			continue;
 
+		paddr = in->phys_addr;
+		size = in->num_pages * EFI_PAGE_SIZE;
+
 		/*
 		 * Make the mapping compatible with 64k pages: this allows
 		 * a 4k page size kernel to kexec a 64k page size kernel and
 		 * vice versa.
 		 */
-		paddr = round_down(in->phys_addr, SZ_64K);
-		size = round_up(in->num_pages * EFI_PAGE_SIZE +
-				in->phys_addr - paddr, SZ_64K);
-
-		/*
-		 * Avoid wasting memory on PTEs by choosing a virtual base that
-		 * is compatible with section mappings if this region has the
-		 * appropriate size and physical alignment. (Sections are 2 MB
-		 * on 4k granule kernels)
-		 */
-		if (IS_ALIGNED(in->phys_addr, SZ_2M) && size >= SZ_2M)
-			efi_virt_base = round_up(efi_virt_base, SZ_2M);
+		if (!regions_are_adjacent(prev, in) ||
+		    !regions_have_compatible_memory_type_attrs(prev, in)) {
+
+			paddr = round_down(in->phys_addr, SZ_64K);
+			size += in->phys_addr - paddr;
+
+			/*
+			 * Avoid wasting memory on PTEs by choosing a virtual
+			 * base that is compatible with section mappings if this
+			 * region has the appropriate size and physical
+			 * alignment. (Sections are 2 MB on 4k granule kernels)
+			 */
+			if (IS_ALIGNED(in->phys_addr, SZ_2M) && size >= SZ_2M)
+				efi_virt_base = round_up(efi_virt_base, SZ_2M);
+			else
+				efi_virt_base = round_up(efi_virt_base, SZ_64K);
+		}
 
 		in->virt_addr = efi_virt_base + in->phys_addr - paddr;
 		efi_virt_base += size;
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2015-10-01 10:45 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-06-30 10:17 [PATCH 0/2] arm64/efi: adapt to UEFI 2.5 properties table changes Ard Biesheuvel
2015-06-30 10:17 ` Ard Biesheuvel
     [not found] ` <1435659443-17625-1-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
2015-06-30 10:17   ` [PATCH 1/2] arm64/efi: base UEFI mapping permissions on region attributes Ard Biesheuvel
2015-06-30 10:17     ` Ard Biesheuvel
     [not found]     ` <1435659443-17625-2-git-send-email-ard.biesheuvel-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
2015-06-30 14:50       ` Mark Salter
2015-06-30 14:50         ` Mark Salter
     [not found]         ` <1435675848.21009.10.camel-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-06-30 14:53           ` Ard Biesheuvel
2015-06-30 14:53             ` Ard Biesheuvel
2015-06-30 10:17   ` [PATCH 2/2] arm64/efi: don't pad between EFI_MEMORY_RUNTIME regions Ard Biesheuvel
2015-06-30 10:17     ` Ard Biesheuvel
2015-09-25 22:02 [GIT PULL 0/2] EFI urgent fixes Matt Fleming
2015-09-25 22:02 ` [PATCH 2/2] arm64/efi: Don't pad between EFI_MEMORY_RUNTIME regions Matt Fleming
2015-09-25 22:02   ` Matt Fleming
2015-09-26  6:01   ` Ingo Molnar
2015-09-26  6:01     ` Ingo Molnar
2015-09-26  7:08     ` Ard Biesheuvel
2015-09-26  7:08       ` Ard Biesheuvel
2015-09-27  7:06       ` Ingo Molnar
2015-09-27  7:06         ` Ingo Molnar
2015-09-27 10:40         ` Borislav Petkov
2015-09-28  6:20           ` Ingo Molnar
2015-09-29  9:31           ` Dave Young
2015-09-29 10:24             ` Borislav Petkov
2015-09-29 14:36           ` Matt Fleming
2015-09-29 14:36             ` Matt Fleming
2015-09-30  0:56             ` H. Peter Anvin
2015-09-30  0:56               ` H. Peter Anvin
2015-09-30  8:33               ` Borislav Petkov
2015-09-30  8:33                 ` Borislav Petkov
2015-09-30  1:03         ` H. Peter Anvin
2015-09-30  1:16           ` Andy Lutomirski
2015-09-30  1:19             ` H. Peter Anvin
2015-09-30  4:24             ` Ard Biesheuvel
2015-09-30  4:24               ` Ard Biesheuvel
2015-10-01 10:44           ` Ingo Molnar

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.