* [PATCH v2 0/5] arm64: Make kexec_file_load honor iomem reservations
@ 2021-05-31  9:57 ` Marc Zyngier
  0 siblings, 0 replies; 36+ messages in thread
From: Marc Zyngier @ 2021-05-31  9:57 UTC (permalink / raw)
  To: kexec, linux-arm-kernel, linux-kernel
  Cc: Catalin Marinas, Will Deacon, Ard Biesheuvel, Mark Rutland,
	James Morse, Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla,
	Eric Biederman, Bhupesh SHARMA, AKASHI Takahiro, Dave Young,
	Andrew Morton, Moritz Fischer, kernel-team

This series is a complete departure from the approach I initially sent
almost a month ago[0]. Instead of trying to teach EFI, ACPI and other
subsystems to use memblock, I've decided to stick with the iomem
resource tree and use that exclusively for arm64.

This means that my current approach is (despite what I initially
replied to both Dave and Catalin) to provide an arm64-specific
implementation of arch_kexec_locate_mem_hole() which walks the
resource tree and excludes ranges of RAM that have been registered for
any odd purpose. This is exactly what the userspace implementation
does, and I don't really see a good reason to diverge from it.

Again, this allows my Synquacer board to reliably use kexec_file_load
with as little as 256M, something that would always fail before as it
would overwrite most of the reserved tables.

Although this series still targets 5.14, the initial patch is a
-stable candidate, and disables non-kdump uses of kexec_file_load. I
have limited it to 5.10, as earlier kernels will require a different,
probably more invasive approach.

Catalin, Ard: although this series has changed a bit compared to v1,
I've kept your AB/RB tags. Should anything seem odd, please let me
know and I'll drop them.

Thanks,

	M.

* From v1 [1]:
  - Move the overlap exclusion into find_next_iomem_res()
  - Handle child resource not overlapping with parent
  - Provide walk_system_ram_excluding_child_res() as a top level
    walker
  - Simplify arch-specific code
  - Add initial patch disabling non-crash kernels

[0] https://lore.kernel.org/r/20210429133533.1750721-1-maz@kernel.org
[1] https://lore.kernel.org/r/20210526190531.62751-1-maz@kernel.org

Marc Zyngier (5):
  arm64: kexec_file: Forbid non-crash kernels
  kexec_file: Make locate_mem_hole_callback global
  kernel/resource: Allow find_next_iomem_res() to exclude overlapping
    child resources
  kernel/resource: Introduce walk_system_ram_excluding_child_res()
  arm64: kexec_image: Restore full kexec functionality

 arch/arm64/kernel/kexec_image.c | 39 ++++++++++++++++
 include/linux/ioport.h          |  3 ++
 include/linux/kexec.h           |  1 +
 kernel/kexec_file.c             |  6 +--
 kernel/resource.c               | 82 +++++++++++++++++++++++++++++----
 5 files changed, 119 insertions(+), 12 deletions(-)

-- 
2.30.2



* [PATCH v2 1/5] arm64: kexec_file: Forbid non-crash kernels
  2021-05-31  9:57 ` Marc Zyngier
  (?)
@ 2021-05-31  9:57   ` Marc Zyngier
  -1 siblings, 0 replies; 36+ messages in thread
From: Marc Zyngier @ 2021-05-31  9:57 UTC (permalink / raw)
  To: kexec, linux-arm-kernel, linux-kernel
  Cc: Catalin Marinas, Will Deacon, Ard Biesheuvel, Mark Rutland,
	James Morse, Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla,
	Eric Biederman, Bhupesh SHARMA, AKASHI Takahiro, Dave Young,
	Andrew Morton, Moritz Fischer, kernel-team, stable

It has been reported that kexec_file doesn't really work on arm64.
It completely ignores any of the existing reservations, which results
in the secondary kernel being loaded where the GICv3 LPI tables live,
or even corrupting the ACPI tables.

Since only crash kernels are immune to this as they use a reserved
memory region, disable the non-crash kernel use case. Further
patches will try and restore the functionality.

Reported-by: Moritz Fischer <mdf@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org # 5.10
---
 arch/arm64/kernel/kexec_image.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/arch/arm64/kernel/kexec_image.c b/arch/arm64/kernel/kexec_image.c
index 9ec34690e255..acf9cd251307 100644
--- a/arch/arm64/kernel/kexec_image.c
+++ b/arch/arm64/kernel/kexec_image.c
@@ -145,3 +145,23 @@ const struct kexec_file_ops kexec_image_ops = {
 	.verify_sig = image_verify_sig,
 #endif
 };
+
+/**
+ * arch_kexec_locate_mem_hole - Find free memory to place the segments.
+ * @kbuf:                       Parameters for the memory search.
+ *
+ * On success, kbuf->mem will have the start address of the memory region found.
+ *
+ * Return: 0 on success, negative errno on error.
+ */
+int arch_kexec_locate_mem_hole(struct kexec_buf *kbuf)
+{
+	/*
+	 * For the time being, kexec_file_load isn't reliable except
+	 * for crash kernel. Say sorry to the user.
+	 */
+	if (kbuf->image->type != KEXEC_TYPE_CRASH)
+		return -EADDRNOTAVAIL;
+
+	return kexec_locate_mem_hole(kbuf);
+}
-- 
2.30.2
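
For readers following along from userspace, here is a minimal sketch of
how a caller might observe the effect of this patch. It assumes the raw
syscall interface (SYS_kexec_file_load through syscall(2)) and that the
-EADDRNOTAVAIL returned by arch_kexec_locate_mem_hole() reaches errno
unchanged, which I have not checked on every error path.
try_kexec_file_load() and describe_failure() are hypothetical helper
names, not part of any existing tool.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Flag value as defined in the uapi <linux/kexec.h> header. */
#ifndef KEXEC_FILE_ON_CRASH
#define KEXEC_FILE_ON_CRASH 0x00000002
#endif

/* Hypothetical wrapper: a thin shim around the raw syscall. */
long try_kexec_file_load(int kernel_fd, int initrd_fd,
			 const char *cmdline, unsigned long flags)
{
	assert(cmdline != NULL);
#ifdef SYS_kexec_file_load
	/* cmdline_len must account for the terminating NUL byte. */
	return syscall(SYS_kexec_file_load, (long)kernel_fd, (long)initrd_fd,
		       (unsigned long)strlen(cmdline) + 1, cmdline, flags);
#else
	/* Headers too old to know the syscall number. */
	errno = ENOSYS;
	return -1;
#endif
}

/*
 * Hypothetical helper: turn an errno value into a hint for the user.
 * The EADDRNOTAVAIL interpretation is an assumption based on the patch
 * above, not a documented contract.
 */
const char *describe_failure(int err, unsigned long flags)
{
	if (err == EADDRNOTAVAIL && !(flags & KEXEC_FILE_ON_CRASH))
		return "non-crash load refused: no safe memory hole";
	if (err == EADDRNOTAVAIL)
		return "crash load failed: no hole in the crash kernel region";
	return strerror(err);
}
```

A caller would check for a negative return from try_kexec_file_load()
and pass errno plus the flags it used to describe_failure(). With this
patch applied, a load without KEXEC_FILE_ON_CRASH is expected to hit
the first branch.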



* [PATCH v2 2/5] kexec_file: Make locate_mem_hole_callback global
  2021-05-31  9:57 ` Marc Zyngier
  (?)
@ 2021-05-31  9:57   ` Marc Zyngier
  -1 siblings, 0 replies; 36+ messages in thread
From: Marc Zyngier @ 2021-05-31  9:57 UTC (permalink / raw)
  To: kexec, linux-arm-kernel, linux-kernel
  Cc: Catalin Marinas, Will Deacon, Ard Biesheuvel, Mark Rutland,
	James Morse, Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla,
	Eric Biederman, Bhupesh SHARMA, AKASHI Takahiro, Dave Young,
	Andrew Morton, Moritz Fischer, kernel-team

In order for architectures to make use of locate_mem_hole_callback()
and avoid reinventing a square wheel, make this function global
and rename it to kexec_locate_mem_hole_callback() to match the
other global kexec symbols.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 include/linux/kexec.h | 1 +
 kernel/kexec_file.c   | 6 +++---
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 0c994ae37729..4b507efdb623 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -204,6 +204,7 @@ int arch_kexec_locate_mem_hole(struct kexec_buf *kbuf);
 
 extern int kexec_add_buffer(struct kexec_buf *kbuf);
 int kexec_locate_mem_hole(struct kexec_buf *kbuf);
+int kexec_locate_mem_hole_callback(struct resource *res, void *arg);
 
 /* Alignment required for elf header segment */
 #define ELF_CORE_HEADER_ALIGN   4096
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 33400ff051a8..960aefc4501d 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -517,7 +517,7 @@ static int locate_mem_hole_bottom_up(unsigned long start, unsigned long end,
 	return 1;
 }
 
-static int locate_mem_hole_callback(struct resource *res, void *arg)
+int kexec_locate_mem_hole_callback(struct resource *res, void *arg)
 {
 	struct kexec_buf *kbuf = (struct kexec_buf *)arg;
 	u64 start = res->start, end = res->end;
@@ -634,9 +634,9 @@ int kexec_locate_mem_hole(struct kexec_buf *kbuf)
 		return 0;
 
 	if (!IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK))
-		ret = kexec_walk_resources(kbuf, locate_mem_hole_callback);
+		ret = kexec_walk_resources(kbuf, kexec_locate_mem_hole_callback);
 	else
-		ret = kexec_walk_memblock(kbuf, locate_mem_hole_callback);
+		ret = kexec_walk_memblock(kbuf, kexec_locate_mem_hole_callback);
 
 	return ret == 1 ? 0 : -EADDRNOTAVAIL;
 }
-- 
2.30.2
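
To illustrate why exporting the callback is useful, here is a small
userspace model of the walker/callback convention visible in the diff
above: the callback returns 1 once it has found a fit, the walker stops
on any non-zero return, and the top-level helper maps "1" to success
and anything else to -EADDRNOTAVAIL. The types and names (struct range,
struct buf_request, walk_ranges(), first_fit_callback(), locate_hole())
are made up for this sketch, and the fit test is a plain first-fit
without the alignment and top-down handling the kernel code performs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct range {
	uint64_t start;
	uint64_t end;		/* inclusive, like struct resource */
};

struct buf_request {
	uint64_t memsz;		/* size wanted */
	uint64_t mem;		/* filled in on success */
};

typedef int (*range_cb)(const struct range *r, void *arg);

/* Call func on each range; stop at the first non-zero return. */
int walk_ranges(const struct range *ranges, size_t n, void *arg,
		range_cb func)
{
	int ret = 0;

	for (size_t i = 0; i < n; i++) {
		ret = func(&ranges[i], arg);
		if (ret)
			break;
	}
	return ret;
}

/* Return 1 when the range can hold the request, 0 to keep walking. */
int first_fit_callback(const struct range *r, void *arg)
{
	struct buf_request *req = arg;
	/* Assumes the range does not span the whole 64-bit space. */
	uint64_t size = r->end - r->start + 1;

	if (size < req->memsz)
		return 0;

	req->mem = r->start;
	return 1;
}

/* Same return mapping as the tail of kexec_locate_mem_hole(). */
int locate_hole(const struct range *ranges, size_t n,
		struct buf_request *req)
{
	int ret = walk_ranges(ranges, n, req, first_fit_callback);

	return ret == 1 ? 0 : -EADDRNOTAVAIL;
}
```

An architecture that wants a different walker only has to swap
walk_ranges() for its own iteration while reusing the callback, which
is the arrangement this patch makes possible for the real code.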



* [PATCH v2 3/5] kernel/resource: Allow find_next_iomem_res() to exclude overlapping child resources
  2021-05-31  9:57 ` Marc Zyngier
  (?)
@ 2021-05-31  9:57   ` Marc Zyngier
  -1 siblings, 0 replies; 36+ messages in thread
From: Marc Zyngier @ 2021-05-31  9:57 UTC (permalink / raw)
  To: kexec, linux-arm-kernel, linux-kernel
  Cc: Catalin Marinas, Will Deacon, Ard Biesheuvel, Mark Rutland,
	James Morse, Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla,
	Eric Biederman, Bhupesh SHARMA, AKASHI Takahiro, Dave Young,
	Andrew Morton, Moritz Fischer, kernel-team

find_next_iomem_res() returns the first resource that matches the
input parameters (range, flags, desc). It however ignores any
sub-resource that may invalidate the usefulness of such resource.

Allow find_next_iomem_res() to filter out such sub-resources and
wire it into the callers. As nobody is interested in this type
of filtering yet, there shouldn't be any functional change.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 kernel/resource.c | 67 ++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 58 insertions(+), 9 deletions(-)

diff --git a/kernel/resource.c b/kernel/resource.c
index ca9f5198a01f..0e4d2ca763cd 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -318,6 +318,42 @@ int release_resource(struct resource *old)
 
 EXPORT_SYMBOL(release_resource);
 
+static int exclude_overlapping_child_res(struct resource *res,
+					 struct resource *child)
+{
+	struct resource cursor = *res;
+
+	for (; child; child = child->sibling) {
+		if (!resource_overlaps(&cursor, child))
+			continue;
+
+		if (cursor.start < child->start) {
+			*res = (struct resource) {
+				.start	= cursor.start,
+				.end	= child->start - 1,
+				.flags	= res->flags,
+				.desc	= res->desc,
+				.parent	= res->parent,
+			};
+
+			return 0;
+		}
+
+		/*
+		 * This may result in a resource with a negative size
+		 * at the very end of the loop.
+		 */
+		cursor.start = child->end + 1;
+	}
+
+	if (cursor.start <= cursor.end) {
+		*res = cursor;
+		return 0;
+	}
+
+	return -ENODEV;
+}
+
 /**
  * find_next_iomem_res - Finds the lowest iomem resource that covers part of
  *			 [@start..@end].
@@ -330,6 +366,7 @@ EXPORT_SYMBOL(release_resource);
  * @end:	end address of same resource
  * @flags:	flags which the resource must have
  * @desc:	descriptor the resource must have
+ * @exclude_child_res: exclude parts of resource that have overlapping children
  * @res:	return ptr, if resource found
  *
  * The caller must specify @start, @end, @flags, and @desc
@@ -337,7 +374,7 @@ EXPORT_SYMBOL(release_resource);
  */
 static int find_next_iomem_res(resource_size_t start, resource_size_t end,
 			       unsigned long flags, unsigned long desc,
-			       struct resource *res)
+			       bool exclude_child_res, struct resource *res)
 {
 	struct resource *p;
 
@@ -348,7 +385,7 @@ static int find_next_iomem_res(resource_size_t start, resource_size_t end,
 		return -EINVAL;
 
 	read_lock(&resource_lock);
-
+again:
 	for (p = iomem_resource.child; p; p = next_resource(p)) {
 		/* If we passed the resource we are looking for, stop */
 		if (p->start > end) {
@@ -378,6 +415,15 @@ static int find_next_iomem_res(resource_size_t start, resource_size_t end,
 			.desc = p->desc,
 			.parent = p->parent,
 		};
+
+		if (exclude_child_res &&
+		    exclude_overlapping_child_res(res, p->child)) {
+			start = res->end + 1;
+			if (start >= end)
+				p = NULL;
+			else
+				goto again;
+		}
 	}
 
 	read_unlock(&resource_lock);
@@ -386,6 +432,7 @@ static int find_next_iomem_res(resource_size_t start, resource_size_t end,
 
 static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
 				 unsigned long flags, unsigned long desc,
+				 bool exclude_child_res,
 				 void *arg,
 				 int (*func)(struct resource *, void *))
 {
@@ -393,7 +440,8 @@ static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
 	int ret = -EINVAL;
 
 	while (start < end &&
-	       !find_next_iomem_res(start, end, flags, desc, &res)) {
+	       !find_next_iomem_res(start, end, flags, desc,
+				    exclude_child_res, &res)) {
 		ret = (*func)(&res, arg);
 		if (ret)
 			break;
@@ -424,7 +472,7 @@ static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
 int walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start,
 		u64 end, void *arg, int (*func)(struct resource *, void *))
 {
-	return __walk_iomem_res_desc(start, end, flags, desc, arg, func);
+	return __walk_iomem_res_desc(start, end, flags, desc, false, arg, func);
 }
 EXPORT_SYMBOL_GPL(walk_iomem_res_desc);
 
@@ -440,8 +488,8 @@ int walk_system_ram_res(u64 start, u64 end, void *arg,
 {
 	unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
 
-	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, arg,
-				     func);
+	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE,
+				     false, arg, func);
 }
 
 /*
@@ -453,8 +501,8 @@ int walk_mem_res(u64 start, u64 end, void *arg,
 {
 	unsigned long flags = IORESOURCE_MEM | IORESOURCE_BUSY;
 
-	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, arg,
-				     func);
+	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE,
+				     false, arg, func);
 }
 
 /*
@@ -475,7 +523,8 @@ int walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
 	end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
 	flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
 	while (start < end &&
-	       !find_next_iomem_res(start, end, flags, IORES_DESC_NONE, &res)) {
+	       !find_next_iomem_res(start, end, flags, IORES_DESC_NONE,
+				    false, &res)) {
 		pfn = PFN_UP(res.start);
 		end_pfn = PFN_DOWN(res.end + 1);
 		if (end_pfn > pfn)
-- 
2.30.2
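
The trimming done by exclude_overlapping_child_res() can be tried out
in userspace with the following sketch. It mirrors the loop from the
diff above on a plain array instead of the sibling list, and assumes
the children are sorted by start address and that no child ends at the
very top of the 64-bit address space. struct range, overlaps() and
first_free_span() are names invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct range {
	uint64_t start;
	uint64_t end;		/* inclusive, like struct resource */
};

static bool overlaps(const struct range *a, const struct range *b)
{
	return a->start <= b->end && b->start <= a->end;
}

/*
 * Shrink *res to its lowest part that is not covered by any child.
 * Returns 0 on success, -1 if the children cover all of *res.
 */
int first_free_span(struct range *res, const struct range *child,
		    size_t n)
{
	struct range cursor = *res;

	for (size_t i = 0; i < n; i++) {
		if (!overlaps(&cursor, &child[i]))
			continue;

		/* Free space before this child: report it and stop. */
		if (cursor.start < child[i].start) {
			res->start = cursor.start;
			res->end = child[i].start - 1;
			return 0;
		}

		/* Child covers the cursor start: skip past the child. */
		cursor.start = child[i].end + 1;
	}

	/* Whatever is left after the last overlapping child, if any. */
	if (cursor.start <= cursor.end) {
		*res = cursor;
		return 0;
	}

	return -1;
}
```

For a parent covering 0x1000-0x8fff with children at 0x1000-0x1fff and
0x4000-0x4fff, the first call yields 0x2000-0x3fff. A walker would
then restart the search from the end of that span plus one, which is
what the "again" label in find_next_iomem_res() is for.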



* [PATCH v2 3/5] kernel/resource: Allow find_next_iomem_res() to exclude overlapping child resources
@ 2021-05-31  9:57   ` Marc Zyngier
  0 siblings, 0 replies; 36+ messages in thread
From: Marc Zyngier @ 2021-05-31  9:57 UTC (permalink / raw)
  To: kexec, linux-arm-kernel, linux-kernel
  Cc: Catalin Marinas, Will Deacon, Ard Biesheuvel, Mark Rutland,
	James Morse, Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla,
	Eric Biederman, Bhupesh SHARMA, AKASHI Takahiro, Dave Young,
	Andrew Morton, Moritz Fischer, kernel-team

find_next_iomem_res() returns the first resource that matches the
input parameters (range, flags, desc). It however ignores any
sub-resource that may invalidate the usefulness of such resource.

Allow find_next_iomem_res() to filter out such sub-resources and
wire it into the callers. As nobody is interested in this type
of filtering yet, there shouldn't be any functional change.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 kernel/resource.c | 67 ++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 58 insertions(+), 9 deletions(-)

diff --git a/kernel/resource.c b/kernel/resource.c
index ca9f5198a01f..0e4d2ca763cd 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -318,6 +318,42 @@ int release_resource(struct resource *old)
 
 EXPORT_SYMBOL(release_resource);
 
+static int exclude_overlapping_child_res(struct resource *res,
+					 struct resource *child)
+{
+	struct resource cursor = *res;
+
+	for (; child; child = child->sibling) {
+		if (!resource_overlaps(&cursor, child))
+			continue;
+
+		if (cursor.start < child->start) {
+			*res = (struct resource) {
+				.start	= cursor.start,
+				.end	= child->start - 1,
+				.flags	= res->flags,
+				.desc	= res->desc,
+				.parent	= res->parent,
+			};
+
+			return 0;
+		}
+
+		/*
+		 * This may result in a resource with a negative size
+		 * at the very end of the loop.
+		 */
+		cursor.start = child->end + 1;
+	}
+
+	if (cursor.start <= cursor.end) {
+		*res = cursor;
+		return 0;
+	}
+
+	return -ENODEV;
+}
+
 /**
  * find_next_iomem_res - Finds the lowest iomem resource that covers part of
  *			 [@start..@end].
@@ -330,6 +366,7 @@ EXPORT_SYMBOL(release_resource);
  * @end:	end address of same resource
  * @flags:	flags which the resource must have
  * @desc:	descriptor the resource must have
+ * @exclude_child_res: exclude parts of resource that have overlapping children
  * @res:	return ptr, if resource found
  *
  * The caller must specify @start, @end, @flags, and @desc
@@ -337,7 +374,7 @@ EXPORT_SYMBOL(release_resource);
  */
 static int find_next_iomem_res(resource_size_t start, resource_size_t end,
 			       unsigned long flags, unsigned long desc,
-			       struct resource *res)
+			       bool exclude_child_res, struct resource *res)
 {
 	struct resource *p;
 
@@ -348,7 +385,7 @@ static int find_next_iomem_res(resource_size_t start, resource_size_t end,
 		return -EINVAL;
 
 	read_lock(&resource_lock);
-
+again:
 	for (p = iomem_resource.child; p; p = next_resource(p)) {
 		/* If we passed the resource we are looking for, stop */
 		if (p->start > end) {
@@ -378,6 +415,15 @@ static int find_next_iomem_res(resource_size_t start, resource_size_t end,
 			.desc = p->desc,
 			.parent = p->parent,
 		};
+
+		if (exclude_child_res &&
+		    exclude_overlapping_child_res(res, p->child)) {
+			start = res->end + 1;
+			if (start >= end)
+				p = NULL;
+			else
+				goto again;
+		}
 	}
 
 	read_unlock(&resource_lock);
@@ -386,6 +432,7 @@ static int find_next_iomem_res(resource_size_t start, resource_size_t end,
 
 static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
 				 unsigned long flags, unsigned long desc,
+				 bool exclude_child_res,
 				 void *arg,
 				 int (*func)(struct resource *, void *))
 {
@@ -393,7 +440,8 @@ static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
 	int ret = -EINVAL;
 
 	while (start < end &&
-	       !find_next_iomem_res(start, end, flags, desc, &res)) {
+	       !find_next_iomem_res(start, end, flags, desc,
+				    exclude_child_res, &res)) {
 		ret = (*func)(&res, arg);
 		if (ret)
 			break;
@@ -424,7 +472,7 @@ static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
 int walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start,
 		u64 end, void *arg, int (*func)(struct resource *, void *))
 {
-	return __walk_iomem_res_desc(start, end, flags, desc, arg, func);
+	return __walk_iomem_res_desc(start, end, flags, desc, false, arg, func);
 }
 EXPORT_SYMBOL_GPL(walk_iomem_res_desc);
 
@@ -440,8 +488,8 @@ int walk_system_ram_res(u64 start, u64 end, void *arg,
 {
 	unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
 
-	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, arg,
-				     func);
+	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE,
+				     false, arg, func);
 }
 
 /*
@@ -453,8 +501,8 @@ int walk_mem_res(u64 start, u64 end, void *arg,
 {
 	unsigned long flags = IORESOURCE_MEM | IORESOURCE_BUSY;
 
-	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, arg,
-				     func);
+	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE,
+				     false, arg, func);
 }
 
 /*
@@ -475,7 +523,8 @@ int walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
 	end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
 	flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
 	while (start < end &&
-	       !find_next_iomem_res(start, end, flags, IORES_DESC_NONE, &res)) {
+	       !find_next_iomem_res(start, end, flags, IORES_DESC_NONE,
+				    false, &res)) {
 		pfn = PFN_UP(res.start);
 		end_pfn = PFN_DOWN(res.end + 1);
 		if (end_pfn > pfn)
-- 
2.30.2


* [PATCH v2 4/5] kernel/resource: Introduce walk_system_ram_excluding_child_res()
  2021-05-31  9:57 ` Marc Zyngier
@ 2021-05-31  9:57   ` Marc Zyngier
From: Marc Zyngier @ 2021-05-31  9:57 UTC (permalink / raw)
  To: kexec, linux-arm-kernel, linux-kernel
  Cc: Catalin Marinas, Will Deacon, Ard Biesheuvel, Mark Rutland,
	James Morse, Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla,
	Eric Biederman, Bhupesh SHARMA, AKASHI Takahiro, Dave Young,
	Andrew Morton, Moritz Fischer, kernel-team

Introduce a new helper called walk_system_ram_excluding_child_res(),
which does the same job as walk_system_ram_res() but excludes
overlapping child resources.

Again, no caller uses the new helper yet, so no functional
change is expected.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 include/linux/ioport.h |  3 +++
 kernel/resource.c      | 15 +++++++++++++++
 2 files changed, 18 insertions(+)
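
Illustration only, not part of the patch: a rough userspace model of
what a child-excluding walk hands to its callback, assuming a single
RAM range whose children are sorted by start address. struct range,
range_cb, walk_ram_excluding_children() and sum_bytes() are made-up
names for this sketch. The kernel walker instead re-runs
find_next_iomem_res() at each step and returns -EINVAL when it finds
no range at all, whereas the sketch returns 0 in that case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct resource: an inclusive [start, end] range. */
struct range {
	uint64_t start;
	uint64_t end;
};

typedef int (*range_cb)(const struct range *r, void *arg);

/*
 * Call func on every part of ram that no child covers, lowest address
 * first. A non-zero return from func stops the walk and is passed back.
 */
static int walk_ram_excluding_children(struct range ram,
				       const struct range *children, size_t nr,
				       void *arg, range_cb func)
{
	uint64_t pos = ram.start;
	int ret = 0;
	size_t i;

	assert(func != NULL);

	for (i = 0; i < nr; i++) {
		const struct range *c = &children[i];

		/* Child lies entirely outside what is left of ram */
		if (c->end < pos || c->start > ram.end)
			continue;

		if (c->start > pos) {
			struct range gap = { pos, c->start - 1 };

			ret = func(&gap, arg);
			if (ret)
				return ret;
		}

		/* Nothing is left once a child reaches the end of ram */
		if (c->end >= ram.end)
			return ret;
		pos = c->end + 1;
	}

	/* Whatever follows the last overlapping child is free */
	struct range tail = { pos, ram.end };

	return func(&tail, arg);
}

/* Example callback: accumulate the number of usable bytes into *arg. */
static int sum_bytes(const struct range *r, void *arg)
{
	*(uint64_t *)arg += r->end - r->start + 1;
	return 0;
}
```

With a 64KiB RAM range and two 4KiB children, sum_bytes() ends up
counting 56KiB, i.e. the RAM size minus the two reserved children.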

diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index 8359c50f9988..f9638d085349 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -320,6 +320,9 @@ extern int
 walk_system_ram_res(u64 start, u64 end, void *arg,
 		    int (*func)(struct resource *, void *));
 extern int
+walk_system_ram_excluding_child_res(u64 start, u64 end, void *arg,
+				    int (*func)(struct resource *, void *));
+extern int
 walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start, u64 end,
 		    void *arg, int (*func)(struct resource *, void *));
 
diff --git a/kernel/resource.c b/kernel/resource.c
index 0e4d2ca763cd..92b765eaba58 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -492,6 +492,21 @@ int walk_system_ram_res(u64 start, u64 end, void *arg,
 				     false, arg, func);
 }
 
+/*
+ * This function calls the @func callback against all memory ranges of type
+ * System RAM which are marked as IORESOURCE_SYSTEM_RAM and IORESOURCE_BUSY,
+ * excluding RAM ranges that have overlapping child resources.
+ * Same constraints as @walk_system_ram_res apply.
+ */
+int walk_system_ram_excluding_child_res(u64 start, u64 end, void *arg,
+					int (*func)(struct resource *, void *))
+{
+	unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
+
+	return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE,
+				     true, arg, func);
+}
+
 /*
  * This function calls the @func callback against all memory ranges, which
  * are ranges marked as IORESOURCE_MEM and IORESOUCE_BUSY.
-- 
2.30.2


* [PATCH v2 5/5] arm64: kexec_image: Restore full kexec functionality
  2021-05-31  9:57 ` Marc Zyngier
@ 2021-05-31  9:57   ` Marc Zyngier
From: Marc Zyngier @ 2021-05-31  9:57 UTC (permalink / raw)
  To: kexec, linux-arm-kernel, linux-kernel
  Cc: Catalin Marinas, Will Deacon, Ard Biesheuvel, Mark Rutland,
	James Morse, Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla,
	Eric Biederman, Bhupesh SHARMA, AKASHI Takahiro, Dave Young,
	Andrew Morton, Moritz Fischer, kernel-team

Provide an arm64-specific implementation of arch_kexec_locate_mem_hole()
that walks the iomem resource tree instead of memblock, and therefore
respects the reservations added by EFI.

This makes kexec_file_load reliable for non-crash kernels as well.

Reported-by: Moritz Fischer <mdf@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 arch/arm64/kernel/kexec_image.c | 31 +++++++++++++++++++++++++------
 1 file changed, 25 insertions(+), 6 deletions(-)
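
Illustration only, not part of the patch: the return convention the
new code appears to rely on (the callback returns 1 once a hole is
found, which stops the walk, and the arch hook maps anything else to
-EADDRNOTAVAIL) can be modelled as below. struct hole_request,
locate_hole_cb() and locate_mem_hole() are hypothetical and much
simpler than struct kexec_buf and kexec_locate_mem_hole_callback(); in
particular the sketch always picks the lowest suitable address.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct resource: an inclusive [start, end] range. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical, pared-down stand-in for struct kexec_buf. */
struct hole_request {
	uint64_t memsz;	/* bytes needed, at least 1 */
	uint64_t align;	/* power-of-two alignment */
	uint64_t mem;	/* result: chosen start address */
};

/* Return 1 if the range can hold the request, 0 to keep walking. */
static int locate_hole_cb(const struct range *r, void *arg)
{
	struct hole_request *req = arg;
	uint64_t start;

	assert(req->align && !(req->align & (req->align - 1)));

	/* Round the range start up to the requested alignment */
	start = (r->start + req->align - 1) & ~(req->align - 1);

	/* Rounding overflowed, or pushed us past the end of the range */
	if (start < r->start || start > r->end)
		return 0;
	if (req->memsz == 0 || r->end - start < req->memsz - 1)
		return 0;

	req->mem = start;
	return 1;
}

/* Walk the free ranges in order and stop at the first one that fits. */
static int locate_mem_hole(const struct range *free_ranges, size_t nr,
			   struct hole_request *req)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < nr && !ret; i++)
		ret = locate_hole_cb(&free_ranges[i], req);

	return ret == 1 ? 0 : -EADDRNOTAVAIL;
}
```

Feeding this the ranges reported by a child-excluding walk means a
reserved child can never be chosen, since it never reaches the
callback in the first place.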

diff --git a/arch/arm64/kernel/kexec_image.c b/arch/arm64/kernel/kexec_image.c
index acf9cd251307..2a51a2ebd2b7 100644
--- a/arch/arm64/kernel/kexec_image.c
+++ b/arch/arm64/kernel/kexec_image.c
@@ -156,12 +156,31 @@ const struct kexec_file_ops kexec_image_ops = {
  */
 int arch_kexec_locate_mem_hole(struct kexec_buf *kbuf)
 {
+	int ret;
+
+	/* Arch knows where to place */
+	if (kbuf->mem != KEXEC_BUF_MEM_UNKNOWN)
+		return 0;
+
 	/*
-	 * For the time being, kexec_file_load isn't reliable except
-	 * for crash kernel. Say sorry to the user.
+	 * Crash kernels land in a well known place that has been
+	 * reserved upfront.
+	 *
+	 * Normal kexec kernels can however land anywhere in memory.
+	 * We have to be extra careful not to step over critical
+	 * memory ranges that have been marked as reserved in the
+	 * iomem resource tree (LPI and ACPI tables, among others),
+	 * hence the use of the child-excluding iterator.  This
+	 * matches what the userspace version of kexec does.
 	 */
-	if (kbuf->image->type != KEXEC_TYPE_CRASH)
-		return -EADDRNOTAVAIL;
-
-	return kexec_locate_mem_hole(kbuf);
+	if (kbuf->image->type == KEXEC_TYPE_CRASH)
+		ret = walk_iomem_res_desc(crashk_res.desc,
+					  IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
+					  crashk_res.start, crashk_res.end,
+					  kbuf, kexec_locate_mem_hole_callback);
+	else
+		ret = walk_system_ram_excluding_child_res(0, ULONG_MAX, kbuf,
+							  kexec_locate_mem_hole_callback);
+
+	return ret == 1 ? 0 : -EADDRNOTAVAIL;
 }
-- 
2.30.2


* Re: [PATCH v2 0/5] arm64: Make kexec_file_load honor iomem reservations
  2021-05-31  9:57 ` Marc Zyngier
@ 2021-05-31 19:36   ` Ard Biesheuvel
From: Ard Biesheuvel @ 2021-05-31 19:36 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kexec, Linux ARM, Linux Kernel Mailing List, Catalin Marinas,
	Will Deacon, Mark Rutland, James Morse, Lorenzo Pieralisi,
	Hanjun Guo, Sudeep Holla, Eric Biederman, Bhupesh SHARMA,
	AKASHI Takahiro, Dave Young, Andrew Morton, Moritz Fischer,
	Android Kernel Team

On Mon, 31 May 2021 at 11:57, Marc Zyngier <maz@kernel.org> wrote:
>
> This series is a complete departure from the approach I initially sent
> almost a month ago[0]. Instead of trying to teach EFI, ACPI and other
> subsystem to use memblock, I've decided to stick with the iomem
> resource tree and use that exclusively for arm64.
>
> This means that my current approach is (despite what I initially
> replied to both Dave and Catalin) to provide an arm64-specific
> implementation of arch_kexec_locate_mem_hole() which walks the
> resource tree and excludes ranges of RAM that have been registered for
> any odd purpose. This is exactly what the userspace implementation
> does, and I don't really see a good reason to diverge from it.
>
> Again, this allows my Synquacer board to reliably use kexec_file_load
> with as little as 256M, something that would always fail before as it
> would overwrite most of the reserved tables.
>
> Although this series still targets 5.14, the initial patch is a
> -stable candidate, and disables non-kdump uses of kexec_file_load. I
> have limited it to 5.10, as earlier kernels will require a different,
> probably more invasive approach.
>
> Catalin, Ard: although this series has changed a bit compared to v1,
> I've kept your AB/RB tags. Should anything seem odd, please let me
> know and I'll drop them.
>

Fine with me.

> Thanks,
>
>         M.
>
> * From v1 [1]:
>   - Move the overlap exclusion into find_next_iomem_res()
>   - Handle child resource not overlapping with parent
>   - Provide walk_system_ram_excluding_child_res() as a top level
>     walker
>   - Simplify arch-specific code
>   - Add initial patch disabling non-crash kernels
>
> [0] https://lore.kernel.org/r/20210429133533.1750721-1-maz@kernel.org
> [1] https://lore.kernel.org/r/20210526190531.62751-1-maz@kernel.org
>
> Marc Zyngier (5):
>   arm64: kexec_file: Forbid non-crash kernels
>   kexec_file: Make locate_mem_hole_callback global
>   kernel/resource: Allow find_next_iomem_res() to exclude overlapping
>     child resources
>   kernel/resource: Introduce walk_system_ram_excluding_child_res()
>   arm64: kexec_image: Restore full kexec functionality
>
>  arch/arm64/kernel/kexec_image.c | 39 ++++++++++++++++
>  include/linux/ioport.h          |  3 ++
>  include/linux/kexec.h           |  1 +
>  kernel/kexec_file.c             |  6 +--
>  kernel/resource.c               | 82 +++++++++++++++++++++++++++++----
>  5 files changed, 119 insertions(+), 12 deletions(-)
>
> --
> 2.30.2
>

* Re: [PATCH v2 1/5] arm64: kexec_file: Forbid non-crash kernels
  2021-05-31  9:57   ` Marc Zyngier
  (?)
@ 2021-05-31 19:37     ` Ard Biesheuvel
  -1 siblings, 0 replies; 36+ messages in thread
From: Ard Biesheuvel @ 2021-05-31 19:37 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kexec, Linux ARM, Linux Kernel Mailing List, Catalin Marinas,
	Will Deacon, Mark Rutland, James Morse, Lorenzo Pieralisi,
	Hanjun Guo, Sudeep Holla, Eric Biederman, Bhupesh SHARMA,
	AKASHI Takahiro, Dave Young, Andrew Morton, Moritz Fischer,
	Android Kernel Team, # 3.4.x

On Mon, 31 May 2021 at 11:57, Marc Zyngier <maz@kernel.org> wrote:
>
> It has been reported that kexec_file doesn't really work on arm64.
> It completely ignores any of the existing reservations, which results
> in the secondary kernel being loaded where the GICv3 LPI tables live,
> or even corrupting the ACPI tables.
>
> Since only crash kernels are immune to this as they use a reserved
> memory region, disable the non-crash kernel use case. Further
> patches will try and restore the functionality.
>
> Reported-by: Moritz Fischer <mdf@kernel.org>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Cc: stable@vger.kernel.org # 5.10

Acked-by: Ard Biesheuvel <ardb@kernel.org>

... but do we really only need this in 5.10 and not earlier?

> ---
>  arch/arm64/kernel/kexec_image.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
>
> diff --git a/arch/arm64/kernel/kexec_image.c b/arch/arm64/kernel/kexec_image.c
> index 9ec34690e255..acf9cd251307 100644
> --- a/arch/arm64/kernel/kexec_image.c
> +++ b/arch/arm64/kernel/kexec_image.c
> @@ -145,3 +145,23 @@ const struct kexec_file_ops kexec_image_ops = {
>         .verify_sig = image_verify_sig,
>  #endif
>  };
> +
> +/**
> + * arch_kexec_locate_mem_hole - Find free memory to place the segments.
> + * @kbuf:                       Parameters for the memory search.
> + *
> + * On success, kbuf->mem will have the start address of the memory region found.
> + *
> + * Return: 0 on success, negative errno on error.
> + */
> +int arch_kexec_locate_mem_hole(struct kexec_buf *kbuf)
> +{
> +       /*
> +        * For the time being, kexec_file_load isn't reliable except
> +        * for crash kernel. Say sorry to the user.
> +        */
> +       if (kbuf->image->type != KEXEC_TYPE_CRASH)
> +               return -EADDRNOTAVAIL;
> +
> +       return kexec_locate_mem_hole(kbuf);
> +}
> --
> 2.30.2
>
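
The gate added by the quoted patch can be modelled outside the kernel. The following is a minimal user-space sketch, not the kernel code: `enum image_type`, `struct image`, `struct buf` and `generic_locate_mem_hole()` are hypothetical stand-ins for `KEXEC_TYPE_*`, `struct kimage`, `struct kexec_buf` and `kexec_locate_mem_hole()`, and the fixed address returned by the generic search is invented for illustration.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the kernel's image types (not the real definitions). */
enum image_type { IMAGE_DEFAULT, IMAGE_CRASH };

struct image {
	enum image_type type;
};

struct buf {
	const struct image *image;
	unsigned long mem;	/* start of the hole, filled in on success */
};

/* Stand-in for the generic search: pretend a hole is always found at a fixed address. */
static int generic_locate_mem_hole(struct buf *b)
{
	b->mem = 0x80000000UL;
	return 0;
}

/* Same shape as the quoted override: refuse anything that is not a crash image. */
static int arch_locate_mem_hole(struct buf *b)
{
	if (b->image->type != IMAGE_CRASH)
		return -EADDRNOTAVAIL;

	return generic_locate_mem_hole(b);
}

/* Usage: a crash image gets a hole, a regular image is rejected untouched. */
static void self_check(void)
{
	struct image crash = { IMAGE_CRASH }, regular = { IMAGE_DEFAULT };
	struct buf ok = { &crash, 0 }, refused = { &regular, 0 };

	assert(arch_locate_mem_hole(&ok) == 0);
	assert(arch_locate_mem_hole(&refused) == -EADDRNOTAVAIL);
	assert(refused.mem == 0);
}
```

The point of the sketch is only that the rejection happens before any search runs, so a refused request leaves the buffer untouched.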

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 1/5] arm64: kexec_file: Forbid non-crash kernels
  2021-05-31 19:37     ` Ard Biesheuvel
  (?)
@ 2021-06-01  8:36       ` Marc Zyngier
  -1 siblings, 0 replies; 36+ messages in thread
From: Marc Zyngier @ 2021-06-01  8:36 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: kexec, Linux ARM, Linux Kernel Mailing List, Catalin Marinas,
	Will Deacon, Mark Rutland, James Morse, Lorenzo Pieralisi,
	Hanjun Guo, Sudeep Holla, Eric Biederman, Bhupesh SHARMA,
	AKASHI Takahiro, Dave Young, Andrew Morton, Moritz Fischer,
	Android Kernel Team, # 3.4.x

On Mon, 31 May 2021 20:37:49 +0100,
Ard Biesheuvel <ardb@kernel.org> wrote:
> 
> On Mon, 31 May 2021 at 11:57, Marc Zyngier <maz@kernel.org> wrote:
> >
> > It has been reported that kexec_file doesn't really work on arm64.
> > It completely ignores any of the existing reservations, which results
> > in the secondary kernel being loaded where the GICv3 LPI tables live,
> > or even corrupting the ACPI tables.
> >
> > Since only crash kernels are immune to this as they use a reserved
> > memory region, disable the non-crash kernel use case. Further
> > patches will try and restore the functionality.
> >
> > Reported-by: Moritz Fischer <mdf@kernel.org>
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > Cc: stable@vger.kernel.org # 5.10
> 
> Acked-by: Ard Biesheuvel <ardb@kernel.org>
> 
> ... but do we really only need this in 5.10 and not earlier?

We *do* need something in earlier kernels (as mentioned in the cover
letter), but not this patch (arch_kexec_locate_mem_hole() doesn't exist
there, so there is nothing to override).

I guess that completely disabling CONFIG_KEXEC_FILE on arm64 is the
way to go for 5.4 and earlier, as I don't think there is any crash
kernel support there.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 1/5] arm64: kexec_file: Forbid non-crash kernels
  2021-05-31  9:57   ` Marc Zyngier
  (?)
@ 2021-06-04 16:20     ` James Morse
  -1 siblings, 0 replies; 36+ messages in thread
From: James Morse @ 2021-06-04 16:20 UTC (permalink / raw)
  To: Marc Zyngier, kexec, linux-arm-kernel, linux-kernel
  Cc: Catalin Marinas, Will Deacon, Ard Biesheuvel, Mark Rutland,
	Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla, Eric Biederman,
	Bhupesh SHARMA, AKASHI Takahiro, Dave Young, Andrew Morton,
	Moritz Fischer, kernel-team, stable

On 31/05/2021 10:57, Marc Zyngier wrote:
> It has been reported that kexec_file doesn't really work on arm64.
> It completely ignores any of the existing reservations, which results
> in the secondary kernel being loaded where the GICv3 LPI tables live,


> or even corrupting the ACPI tables.

I'd like to know how the ACPI tables bit happens.

ACPI tables should be in EFI_ACPI_RECLAIM_MEMORY or EFI_ACPI_MEMORY_NVS (which isn't
treated as usable).

EFI's reserve_regions() does this:
|	if (!is_usable_memory(md))
|		memblock_mark_nomap(paddr, size);
|
|	/* keep ACPI reclaim memory intact for kexec etc. */
|	if (md->type == EFI_ACPI_RECLAIM_MEMORY)
|		memblock_reserve(paddr, size);

which is called via efi_init(), and all those regions end up listed as reserved in
/proc/iomem. (this is why arm64 doesn't call acpi_reserve_initial_tables())

If your firmware puts ACPI tables in EFI_CONVENTIONAL_MEMORY, you have bigger problems
as the kernel could get relocated over the top of them during boot, and even if it
doesn't, nothing stops that memory being allocated for user-space.

Even acpi_table_upgrade() calls memblock_reserve() and happens early enough not to be a
problem.
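
The two checks quoted from reserve_regions() can be read as a small decision table. Below is a user-space sketch of that table, under stated assumptions: the enum is a hypothetical, reduced stand-in for the EFI memory types, and the choice of which types count as "usable" (conventional memory and ACPI reclaim memory here) is an assumption of this sketch rather than a copy of is_usable_memory().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced set of EFI memory types for illustration only. */
enum efi_type { CONVENTIONAL, ACPI_RECLAIM, ACPI_NVS };

struct verdict {
	bool nomap;	/* models memblock_mark_nomap() being called */
	bool reserved;	/* models memblock_reserve() being called */
};

/* Assumption: conventional and ACPI reclaim memory are usable, ACPI NVS is not. */
static bool usable(enum efi_type t)
{
	return t == CONVENTIONAL || t == ACPI_RECLAIM;
}

/* Same two independent checks as the quoted snippet. */
static struct verdict classify(enum efi_type t)
{
	struct verdict v = { false, false };

	if (!usable(t))
		v.nomap = true;

	/* keep ACPI reclaim memory intact for kexec etc. */
	if (t == ACPI_RECLAIM)
		v.reserved = true;

	return v;
}

/* Usage: only conventional memory ends up neither nomap nor reserved. */
static void self_check(void)
{
	assert(!classify(CONVENTIONAL).nomap && !classify(CONVENTIONAL).reserved);
	assert(classify(ACPI_RECLAIM).reserved);
	assert(classify(ACPI_NVS).nomap);
}
```

Under these assumptions, ACPI tables are only exposed to the kexec loader when firmware has placed them in conventional memory, which is the case described above as a firmware problem.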


Please share ... enjoyment, optional.

(boot with efi=debug and post the EFI memory map and the 'ACPI: FOO 0xphysicaladdress'
stuff at the top of the boot log)


Thanks,

James


> Since only crash kernels are immune to this as they use a reserved
> memory region, disable the non-crash kernel use case. Further
> patches will try and restore the functionality.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 0/5] arm64: Make kexec_file_load honor iomem reservations
  2021-05-31  9:57 ` Marc Zyngier
  (?)
@ 2021-06-04 16:20   ` James Morse
  -1 siblings, 0 replies; 36+ messages in thread
From: James Morse @ 2021-06-04 16:20 UTC (permalink / raw)
  To: Marc Zyngier, kexec, linux-arm-kernel, linux-kernel
  Cc: Catalin Marinas, Will Deacon, Ard Biesheuvel, Mark Rutland,
	Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla, Eric Biederman,
	Bhupesh SHARMA, AKASHI Takahiro, Dave Young, Andrew Morton,
	Moritz Fischer, kernel-team

Hi Marc,

On 31/05/2021 10:57, Marc Zyngier wrote:
> This series is a complete departure from the approach I initially sent
> almost a month ago[0]. Instead of trying to teach EFI, ACPI and other
> subsystem to use memblock, I've decided to stick with the iomem
> resource tree and use that exclusively for arm64.

> This means that my current approach is (despite what I initially
> replied to both Dave and Catalin) to provide an arm64-specific
> implementation of arch_kexec_locate_mem_hole() which walks the
> resource tree and excludes ranges of RAM that have been registered for
> any odd purpose. This is exactly what the userspace implementation
> does, and I don't really see a good reason to diverge from it.

Because in an ideal world we'd have only one 'is it reserved' list to check against.
Memblock has been extended before. The resource-list is overly stringy, and I'm not sure
we can shove everything in the resource list.
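
Whichever list ends up being authoritative, the search described in the quoted cover letter has the same shape: take a range of System RAM, skip the reserved ranges inside it, and return the first gap that fits. A minimal user-space sketch follows. It is not the series' code: `struct range` and `find_hole()` are hypothetical names, the reservations are assumed sorted, non-overlapping and inside the RAM range, the alignment is assumed to be a power of two, and the search runs bottom-up for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inclusive address range, [start, end]. */
struct range {
	uint64_t start, end;
};

/* Round x up to a multiple of a; a must be a power of two. */
static uint64_t align_up(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/*
 * Find the lowest aligned hole of 'size' bytes in 'ram' that avoids every
 * entry of 'rsv'. Returns 0 and sets *out on success, -1 if nothing fits.
 * Assumes ram.end is below UINT64_MAX.
 */
static int find_hole(struct range ram, const struct range *rsv, size_t n,
		     uint64_t size, uint64_t align, uint64_t *out)
{
	uint64_t cur = ram.start;

	for (size_t i = 0; i <= n; i++) {
		/* Exclusive end of the current gap: next reservation or end of RAM. */
		uint64_t limit = (i < n) ? rsv[i].start : ram.end + 1;
		uint64_t base = align_up(cur, align);

		if (base >= cur && base < limit && limit - base >= size) {
			*out = base;
			return 0;
		}

		/* Continue the search just past the reservation that closed this gap. */
		if (i < n && rsv[i].end + 1 > cur)
			cur = rsv[i].end + 1;
	}

	return -1;
}

/* Usage: 256M of RAM with two reserved child ranges near the bottom. */
static void self_check(void)
{
	struct range ram = { UINT64_C(0x80000000), UINT64_C(0x8fffffff) };
	struct range rsv[] = {
		{ UINT64_C(0x80000000), UINT64_C(0x800fffff) },
		{ UINT64_C(0x80200000), UINT64_C(0x80200fff) },
	};
	uint64_t hole = 0;

	/* A 2M-aligned 2M hole has to go above both reservations. */
	assert(find_hole(ram, rsv, 2, UINT64_C(0x200000), UINT64_C(0x200000), &hole) == 0);
	assert(hole == UINT64_C(0x80400000));
}
```

The sketch says nothing about when the reserved list is allowed to change, which is the timing concern raised below for reservations made after an image has been loaded.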

Kexec already has problems on arm64 with memory hotplug. Fixing this for regular kexec in
/proc/iomem was rejected, and memblock's memblock_is_hotpluggable() is broken because
free_low_memory_core_early() does this:
|	memblock_clear_hotplug(0, -1)

Once that has been unpicked, it's clear kexec_file_load() can use
memblock_is_hotpluggable(). (It's on the todo list, well, jira.)


I'd prefer to keep kexec using memblock because it _shouldn't_ change after boot. Having
an "I want to reserve this and make it persistent over kexec" call that can happen at any
time can't work if the kexec image has already been loaded.
Practically, once user-space has started, you can't have new things you want to reserve
over kexec.


I don't see how the ACPI tables can escape short of a firmware bug. Could someone with an
affected platform boot with efi=debug and post the EFI memory map and the 'ACPI: FOO
0xphysicaladdress' stuff at the top of the boot log?


efi_mem_reserve_persistent() has one caller for the GIC ITS stuff.

For the ITS, the reservations look like they are behind irqchip_init(), which is well
before the arch_initcall() that updates the resource tree from memblock. Your v1's first
patch should be sufficient.


> Again, this allows my Synquacer board to reliably use kexec_file_load
> with as little as 256M, something that would always fail before as it
> would overwrite most of the reserved tables.
> 
> Although this series still targets 5.14, the initial patch is a
> -stable candidate, and disables non-kdump uses of kexec_file_load. I
> have limited it to 5.10, as earlier kernels will require a different,
> probably more invasive approach.
> 
> Catalin, Ard: although this series has changed a bit compared to v1,
> I've kept your AB/RB tags. Should anything seem odd, please let me
> know and I'll drop them.


Thanks,

James


[0] I'm pretty sure this is enough. (Not tested)
diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index 4b7ee3fa9224..3ed45153ce7f 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -893,7 +893,7 @@ static int __init efi_memreserve_map_root(void)
        return 0;
 }

-static int efi_mem_reserve_iomem(phys_addr_t addr, u64 size)
+static int __efi_mem_reserve_iomem(phys_addr_t addr, u64 size)
 {
        struct resource *res, *parent;

@@ -911,6 +911,16 @@ static int efi_mem_reserve_iomem(phys_addr_t addr, u64 size)
        return parent ? request_resource(parent, res) : 0;
 }

+static int efi_mem_reserve_iomem(phys_addr_t addr, u64 size)
+{
+       int err = __efi_mem_reserve_iomem(addr, size);
+
+       if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK) && !err)
+               memblock_reserve(addr, size);
+
+       return err;
+}
+
 int __ref efi_mem_reserve_persistent(phys_addr_t addr, u64 size)
 {
        struct linux_efi_memreserve *rsv;

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 0/5] arm64: Make kexec_file_load honor iomem reservations
@ 2021-06-04 16:20   ` James Morse
  0 siblings, 0 replies; 36+ messages in thread
From: James Morse @ 2021-06-04 16:20 UTC (permalink / raw)
  To: Marc Zyngier, kexec, linux-arm-kernel, linux-kernel
  Cc: Catalin Marinas, Will Deacon, Ard Biesheuvel, Mark Rutland,
	Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla, Eric Biederman,
	Bhupesh SHARMA, AKASHI Takahiro, Dave Young, Andrew Morton,
	Moritz Fischer, kernel-team

Hi Marc,

On 31/05/2021 10:57, Marc Zyngier wrote:
> This series is a complete departure from the approach I initially sent
> almost a month ago[0]. Instead of trying to teach EFI, ACPI and other
> subsystem to use memblock, I've decided to stick with the iomem
> resource tree and use that exclusively for arm64.

> This means that my current approach is (despite what I initially
> replied to both Dave and Catalin) to provide an arm64-specific
> implementation of arch_kexec_locate_mem_hole() which walks the
> resource tree and excludes ranges of RAM that have been registered for
> any odd purpose. This is exactly what the userspace implementation
> does, and I don't really see a good reason to diverge from it.

Because in the ideal world we'd have only 'is it reserved' list to check against.
Memblock has been extended before. The resource-list is overly stringy, and I'm not sure
we can shove everything in the resource list.

Kexec already has problems on arm64 with memory hotplug. Fixing this for regular kexec in
/proc/iomem was rejected, and memblock's memblock_is_hotpluggable() is broken because
free_low_memory_core_early() does this:
|	memblock_clear_hotplug(0, -1)

Once that has been unpicked its clear kexec_file_load() can use
memblock_is_hotpluggable(). (its on the todo list, well, jira)


I'd prefer to keep kexec using memblock because it _shouldn't_ change after boot. Having
an "I want to reserve this and make it persistent over kexec" call that can happen at any
time can't work if the kexec image has already been loaded.
Practically, once user-space has started, you can't have new things you want to reserve
over kexec.


I don't see how the ACPI tables can escape short of a firmware bug. Could someone with an
affected platform boot with efi=debug and post the EFI memory map and the 'ACPI: FOO
0xphysicaladdress' stuff at the top of the boot log?


efi_mem_reserve_persistent() has one caller for the GIC ITS stuff.

For the ITS, the reservations look like they are behind irqchip_init(), which is well
before the arch_initcall() that updates the resource tree from memblock. Your v1's first
patch should be sufficient.


> Again, this allows my Synquacer board to reliably use kexec_file_load
> with as little as 256M, something that would always fail before as it
> would overwrite most of the reserved tables.
> 
> Although this series still targets 5.14, the initial patch is a
> -stable candidate, and disables non-kdump uses of kexec_file_load. I
> have limited it to 5.10, as earlier kernels will require a different,
> probably more invasive approach.
> 
> Catalin, Ard: although this series has changed a bit compared to v1,
> I've kept your AB/RB tags. Should anything seem odd, please let me
> know and I'll drop them.


Thanks,

James


[0] I'm pretty sure this is enough. (Not tested)
diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index 4b7ee3fa9224..3ed45153ce7f 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -893,7 +893,7 @@ static int __init efi_memreserve_map_root(void)
        return 0;
 }

-static int efi_mem_reserve_iomem(phys_addr_t addr, u64 size)
+static int __efi_mem_reserve_iomem(phys_addr_t addr, u64 size)
 {
        struct resource *res, *parent;

@@ -911,6 +911,16 @@ static int efi_mem_reserve_iomem(phys_addr_t addr, u64 size)
        return parent ? request_resource(parent, res) : 0;
 }

+static int efi_mem_reserve_iomem(phys_addr_t addr, u64 size)
+{
+       int err = __efi_mem_reserve_iomem(addr, size);
+
+       if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK) && !err)
+               memblock_reserve(addr, size);
+
+       return err;
+}
+
 int __ref efi_mem_reserve_persistent(phys_addr_t addr, u64 size)
 {
        struct linux_efi_memreserve *rsv;
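
For readers who want to see the shape of the wrapper outside the kernel
tree, below is a rough user-space sketch of the same pattern: record the
reservation in one list, and mirror it into a second list only when the
first step succeeded and a build-time switch is on. Everything here
(struct span, struct span_list, list_add(), reserve_both() and the
keep_memblock flag standing in for IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK))
is hypothetical and only models the control flow of the diff above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SPANS 8

/* Hypothetical model of a reserved region. */
struct span {
	uint64_t addr;
	uint64_t size;
};

/* Hypothetical stand-in for either the resource tree or memblock. */
struct span_list {
	struct span s[MAX_SPANS];
	size_t n;
};

/* Returns 0 on success, -1 when the fixed-size list is full. */
int list_add(struct span_list *l, uint64_t addr, uint64_t size)
{
	if (l->n == MAX_SPANS)
		return -1;
	l->s[l->n++] = (struct span){ addr, size };
	return 0;
}

/*
 * Same control flow as the wrapper in the diff: the primary list decides
 * the return value, and the mirror is updated only on success and only
 * when keep_memblock is set. The mirror's own result is ignored, as in
 * the diff.
 */
int reserve_both(struct span_list *iomem, struct span_list *memblock,
		 bool keep_memblock, uint64_t addr, uint64_t size)
{
	int err = list_add(iomem, addr, size);

	if (keep_memblock && !err)
		(void)list_add(memblock, addr, size);

	return err;
}
```

With this shape, anything that consults only the second list still sees
every reservation that was accepted into the first one.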

* Re: [PATCH v2 0/5] arm64: Make kexec_file_load honor iomem reservations
  2021-06-04 16:20   ` James Morse
  (?)
@ 2021-06-09 22:39     ` Moritz Fischer
  -1 siblings, 0 replies; 36+ messages in thread
From: Moritz Fischer @ 2021-06-09 22:39 UTC (permalink / raw)
  To: James Morse
  Cc: Marc Zyngier, kexec, linux-arm-kernel, linux-kernel,
	Catalin Marinas, Will Deacon, Ard Biesheuvel, Mark Rutland,
	Lorenzo Pieralisi, Hanjun Guo, Sudeep Holla, Eric Biederman,
	Bhupesh SHARMA, AKASHI Takahiro, Dave Young, Andrew Morton,
	Moritz Fischer, kernel-team

Hi James, Marc

On Fri, Jun 04, 2021 at 05:20:38PM +0100, James Morse wrote:
> Hi Marc,
> 
> On 31/05/2021 10:57, Marc Zyngier wrote:
> > This series is a complete departure from the approach I initially sent
> > almost a month ago[0]. Instead of trying to teach EFI, ACPI and other
> > subsystem to use memblock, I've decided to stick with the iomem
> > resource tree and use that exclusively for arm64.
> 
> > This means that my current approach is (despite what I initially
> > replied to both Dave and Catalin) to provide an arm64-specific
> > implementation of arch_kexec_locate_mem_hole() which walks the
> > resource tree and excludes ranges of RAM that have been registered for
> > any odd purpose. This is exactly what the userspace implementation
> > does, and I don't really see a good reason to diverge from it.
> 
> Because in the ideal world we'd have only one 'is it reserved' list to check against.
> Memblock has been extended before. The resource-list is overly stringy, and I'm not sure
> we can shove everything in the resource list.
> 
> Kexec already has problems on arm64 with memory hotplug. Fixing this for regular kexec in
> /proc/iomem was rejected, and memblock's memblock_is_hotpluggable() is broken because
> free_low_memory_core_early() does this:
> |	memblock_clear_hotplug(0, -1)
> 
> Once that has been unpicked it's clear kexec_file_load() can use
> memblock_is_hotpluggable(). (it's on the todo list, well, jira)
> 
> 
> I'd prefer to keep kexec using memblock because it _shouldn't_ change after boot. Having
> an "I want to reserve this and make it persistent over kexec" call that can happen at any
> time can't work if the kexec image has already been loaded.
> Practically, once user-space has started, you can't have new things you want to reserve
> over kexec.
> 
> 
> I don't see how the ACPI tables can escape short of a firmware bug. Could someone with an
> affected platform boot with efi=debug and post the EFI memory map and the 'ACPI: FOO
> 0xphysicaladdress' stuff at the top of the boot log?
> 
> 
> efi_mem_reserve_persistent() has one caller for the GIC ITS stuff.
> 
> For the ITS, the reservations look like they are behind irqchip_init(), which is well
> before the arch_initcall() that updates the resource tree from memblock. Your v1's first
> patch should be sufficient.
> 
> 
> > Again, this allows my Synquacer board to reliably use kexec_file_load
> > with as little as 256M, something that would always fail before as it
> > would overwrite most of the reserved tables.
> > 
> > Although this series still targets 5.14, the initial patch is a
> > -stable candidate, and disables non-kdump uses of kexec_file_load. I
> > have limited it to 5.10, as earlier kernels will require a different,
> > probably more invasive approach.
> > 
> > Catalin, Ard: although this series has changed a bit compared to v1,
> > I've kept your AB/RB tags. Should anything seem odd, please let me
> > know and I'll drop them.
> 
> 
> Thanks,
> 
> James
> 
> 
> [0] I'm pretty sure this is enough. (Not tested)
> diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
> index 4b7ee3fa9224..3ed45153ce7f 100644
> --- a/drivers/firmware/efi/efi.c
> +++ b/drivers/firmware/efi/efi.c
> @@ -893,7 +893,7 @@ static int __init efi_memreserve_map_root(void)
>         return 0;
>  }
> 
> -static int efi_mem_reserve_iomem(phys_addr_t addr, u64 size)
> +static int __efi_mem_reserve_iomem(phys_addr_t addr, u64 size)
>  {
>         struct resource *res, *parent;
> 
> @@ -911,6 +911,16 @@ static int efi_mem_reserve_iomem(phys_addr_t addr, u64 size)
>         return parent ? request_resource(parent, res) : 0;
>  }
> 
> +static int efi_mem_reserve_iomem(phys_addr_t addr, u64 size)
> +{
> +       int err = __efi_mem_reserve_iomem(addr, size);
> +
> +       if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK) && !err)
> +               memblock_reserve(addr, size);
> +
> +       return err;
> +}
> +
>  int __ref efi_mem_reserve_persistent(phys_addr_t addr, u64 size)
>  {
>         struct linux_efi_memreserve *rsv;

Sorry for the long radio silence. Just got around to testing this.

I can confirm that the above change James proposed does work on the
platform that the issue was first observed on.

Cheers,
Moritz

Thread overview: 36+ messages
2021-05-31  9:57 [PATCH v2 0/5] arm64: Make kexec_file_load honor iomem reservations Marc Zyngier
2021-05-31  9:57 ` [PATCH v2 1/5] arm64: kexec_file: Forbid non-crash kernels Marc Zyngier
2021-05-31 19:37   ` Ard Biesheuvel
2021-06-01  8:36     ` Marc Zyngier
2021-06-04 16:20   ` James Morse
2021-05-31  9:57 ` [PATCH v2 2/5] kexec_file: Make locate_mem_hole_callback global Marc Zyngier
2021-05-31  9:57 ` [PATCH v2 3/5] kernel/resource: Allow find_next_iomem_res() to exclude overlapping child resources Marc Zyngier
2021-05-31  9:57 ` [PATCH v2 4/5] kernel/resource: Introduce walk_system_ram_excluding_child_res() Marc Zyngier
2021-05-31  9:57 ` [PATCH v2 5/5] arm64: kexec_image: Restore full kexec functionnality Marc Zyngier
2021-05-31 19:36 ` [PATCH v2 0/5] arm64: Make kexec_file_load honor iomem reservations Ard Biesheuvel
2021-06-04 16:20 ` James Morse
2021-06-09 22:39   ` Moritz Fischer
