* [PATCH 0/2] Fix early boot OOM issues for some platforms
@ 2021-04-06 14:11 Hongyan Xia
  2021-04-06 14:11 ` [PATCH 1/2] Fix where the real mode interrupt vector ends Hongyan Xia
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Hongyan Xia @ 2021-04-06 14:11 UTC
  To: kexec; +Cc: horms, raphning, jgrall, hx242

From: Hongyan Xia <hongyxia@amazon.com>

We have observed a couple of cases where, after a successful kexec, the
crash kernel loaded in the 2nd kernel runs out of memory and
crashes. We narrowed the problem down to two issues:

1. When preparing the memory map, kexec excludes the Interrupt Vector
   Table (IVT). However, the end address it uses for the IVT is incorrect.
2. The incorrect IVT end address is not 1KiB aligned. When preparing the
   crashkernel memory map, kexec rejects unaligned memory chunks. On
   many x86 platforms this means the entire bottom 1MiB range is
   excluded from the crashkernel memory map, resulting in OOM when the
   crashkernel boots.

Patch 1 fixes issue 1, which is actually enough to eliminate the problem,
but we feel that a similar issue may happen again (e.g., with a weird BIOS
that reports an unaligned e820 map), so patch 2 also improves the handling
of unaligned memory.
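
To make the two issues concrete, here is a minimal sketch of the arithmetic
involved. It is not kexec-tools code: the constant and function names are
hypothetical and exist only for illustration. It relies on the real-mode IVT
holding 256 four-byte vectors, so the table ends at 0x400, and on the 1KiB
alignment requirement described above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants for illustration; not taken from kexec-tools. */
#define IVT_ENTRIES     256u   /* real-mode interrupt vectors */
#define IVT_ENTRY_SIZE  4u     /* each vector is a 16-bit segment:offset pair */
#define MEMMAP_ALIGN    1024u  /* alignment required for RAM chunks */

/* First byte after the real-mode IVT: 256 * 4 = 0x400. */
static uint64_t ivt_end(void)
{
	return (uint64_t)IVT_ENTRIES * IVT_ENTRY_SIZE;
}

/* True if both the start and the size of a chunk are 1KiB aligned. */
static bool chunk_is_aligned(uint64_t addr, uint64_t size)
{
	return (addr % MEMMAP_ALIGN) == 0 && (size % MEMMAP_ALIGN) == 0;
}
```

Assuming a low-memory chunk that ends at 0x9fc00 (an example value, not one
taken from this thread), a start of 0x100 fails the check while a start of
0x400 passes it.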

Hongyan Xia (2):
  Fix where the real mode interrupt vector ends
  Shrink segments to fit alignment instead of throwing them away

 kexec/arch/i386/crashdump-x86.c    | 15 ++++++++++++---
 kexec/arch/i386/kexec-x86-common.c | 10 ++++++++--
 2 files changed, 20 insertions(+), 5 deletions(-)

-- 
2.23.3


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

* [PATCH 1/2] Fix where the real mode interrupt vector ends
  2021-04-06 14:11 [PATCH 0/2] Fix early boot OOM issues for some platforms Hongyan Xia
@ 2021-04-06 14:11 ` Hongyan Xia
  2021-04-06 14:11 ` [PATCH 2/2] Shrink segments to fit alignment instead of throwing them away Hongyan Xia
  2021-04-07 19:28 ` [PATCH 0/2] Fix early boot OOM issues for some platforms Simon Horman
  2 siblings, 0 replies; 4+ messages in thread
From: Hongyan Xia @ 2021-04-06 14:11 UTC
  To: kexec; +Cc: horms, raphning, jgrall, hx242

From: Hongyan Xia <hongyxia@amazon.com>

The real mode IVT ends at 0x400, not 0x100. The code intentionally excludes
the IVT from RAM, so use the correct address.

Also, 0x100 is not 1K aligned and will be rejected by add_memmap(). We
have observed that after a multiboot2 kexec, the next kexec throws away
such unaligned chunks, losing memory for the kernel after next. In some
corner cases, this loss of memory can cause OOM during boot.

Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
---
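
As an illustration of the adjustment this patch makes, the following is a
simplified, standalone sketch rather than the actual kexec-tools code.
`struct ram_range` and `exclude_ivt()` are hypothetical stand-ins for the
real `struct memory_range` and the loop in get_memory_ranges(); only
`REALMODE_IVT_END` matches a name introduced by the patch.

```c
#include <stddef.h>

#define REALMODE_IVT_END 0x400ULL

/* Simplified stand-in for kexec-tools' struct memory_range (assumption). */
struct ram_range {
	unsigned long long start;
	unsigned long long end;
	int is_ram;
};

/*
 * Move the start of the first RAM range that begins inside the IVT up to
 * the end of the IVT. Returns the index of the adjusted range, or -1 if
 * no range needed adjusting.
 */
static int exclude_ivt(struct ram_range *ranges, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		if (ranges[i].is_ram && ranges[i].start < REALMODE_IVT_END) {
			ranges[i].start = REALMODE_IVT_END;
			return (int)i;
		}
	}
	return -1;
}
```

With the old value of 0x100 the adjusted start would not be a multiple of
1024; with 0x400 it is, so a later alignment check has nothing to reject.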
 kexec/arch/i386/kexec-x86-common.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/kexec/arch/i386/kexec-x86-common.c b/kexec/arch/i386/kexec-x86-common.c
index 9303704a0714..ffc95a9e43f8 100644
--- a/kexec/arch/i386/kexec-x86-common.c
+++ b/kexec/arch/i386/kexec-x86-common.c
@@ -48,6 +48,12 @@
 #define E820_PRAM         12
 #endif
 
+/*
+ * The real mode IVT ends at 0x400.
+ * See https://wiki.osdev.org/Interrupt_Vector_Table.
+ */
+#define REALMODE_IVT_END 0x400
+
 static struct memory_range memory_range[MAX_MEMORY_RANGES];
 
 /**
@@ -360,8 +366,8 @@ int get_memory_ranges(struct memory_range **range, int *ranges,
 	/* Don't report the interrupt table as ram */
 	for (i = 0; i < *ranges; i++) {
 		if ((*range)[i].type == RANGE_RAM &&
-				((*range)[i].start < 0x100)) {
-			(*range)[i].start = 0x100;
+				((*range)[i].start < REALMODE_IVT_END)) {
+			(*range)[i].start = REALMODE_IVT_END;
 			break;
 		}
 	}
-- 
2.23.3


* [PATCH 2/2] Shrink segments to fit alignment instead of throwing them away
  2021-04-06 14:11 [PATCH 0/2] Fix early boot OOM issues for some platforms Hongyan Xia
  2021-04-06 14:11 ` [PATCH 1/2] Fix where the real mode interrupt vector ends Hongyan Xia
@ 2021-04-06 14:11 ` Hongyan Xia
  2021-04-07 19:28 ` [PATCH 0/2] Fix early boot OOM issues for some platforms Simon Horman
  2 siblings, 0 replies; 4+ messages in thread
From: Hongyan Xia @ 2021-04-06 14:11 UTC
  To: kexec; +Cc: horms, raphning, jgrall, hx242

From: Hongyan Xia <hongyxia@amazon.com>

We risk throwing an entire large chunk away if it is just slightly
unaligned, which can then cause the crash kernel to run out of RAM. Keep
such chunks and shrink them to fit the alignment.

Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
---
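
The shrinking step can be sketched in isolation as follows. This is a
minimal illustration under stated assumptions, not the patched add_memmap():
the `ALIGN_UP`/`ALIGN_DOWN` macros and `shrink_to_alignment()` are
hypothetical names defined here, and the alignment is assumed to be a power
of two (1024 in the patch).

```c
#include <assert.h>
#include <stdbool.h>

/* Round x up / down to a multiple of a; a must be a power of two. */
#define ALIGN_UP(x, a)   (((x) + (a) - 1) & ~((a) - 1))
#define ALIGN_DOWN(x, a) ((x) & ~((a) - 1))

/*
 * Shrink the chunk [*addr, *addr + *size) inwards so that both ends are
 * aligned. Returns false, leaving the chunk untouched, if nothing is left
 * after shrinking.
 */
static bool shrink_to_alignment(unsigned long long *addr,
				unsigned long long *size,
				unsigned long long align)
{
	unsigned long long start, end;

	assert(align != 0 && (align & (align - 1)) == 0);

	start = ALIGN_UP(*addr, align);
	end = ALIGN_DOWN(*addr + *size, align);
	if (start >= end)
		return false;

	*addr = start;
	*size = end - start;
	return true;
}
```

For an example chunk 0x100 - 0x9fc00 (values chosen for illustration), the
result is 0x400 - 0x9fc00, so only 768 bytes are given up instead of the
whole chunk. A chunk that lies entirely inside one unaligned 1KiB block is
still dropped.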
 kexec/arch/i386/crashdump-x86.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/kexec/arch/i386/crashdump-x86.c b/kexec/arch/i386/crashdump-x86.c
index c79791f2b3e0..3fcb19ad76d6 100644
--- a/kexec/arch/i386/crashdump-x86.c
+++ b/kexec/arch/i386/crashdump-x86.c
@@ -475,9 +475,18 @@ static int add_memmap(struct memory_range *memmap_p, int *nr_memmap,
 	int i, j, nr_entries = 0, tidx = 0, align = 1024;
 	unsigned long long mstart, mend;
 
-	/* Do alignment check if it's RANGE_RAM */
-	if ((type == RANGE_RAM) && ((addr%align) || (size%align)))
-		return -1;
+	/* Shrink to 1KiB alignment if needed. */
+	if (type == RANGE_RAM && ((addr%align) || (size%align))) {
+		unsigned long long end = addr + size;
+
+		printf("%s: RAM chunk %#llx - %#llx unaligned\n", __func__, addr, end);
+		addr = _ALIGN_UP(addr, align);
+		end = _ALIGN_DOWN(end, align);
+		if (addr >= end)
+			return -1;
+		size = end - addr;
+		printf("%s: RAM chunk shrunk to %#llx - %#llx\n", __func__, addr, end);
+	}
 
 	/* Make sure at least one entry in list is free. */
 	for (i = 0; i < CRASH_MAX_MEMMAP_NR;  i++) {
-- 
2.23.3


* Re: [PATCH 0/2] Fix early boot OOM issues for some platforms
  2021-04-06 14:11 [PATCH 0/2] Fix early boot OOM issues for some platforms Hongyan Xia
  2021-04-06 14:11 ` [PATCH 1/2] Fix where the real mode interrupt vector ends Hongyan Xia
  2021-04-06 14:11 ` [PATCH 2/2] Shrink segments to fit alignment instead of throwing them away Hongyan Xia
@ 2021-04-07 19:28 ` Simon Horman
  2 siblings, 0 replies; 4+ messages in thread
From: Simon Horman @ 2021-04-07 19:28 UTC
  To: Hongyan Xia; +Cc: kexec, raphning, jgrall

On Tue, Apr 06, 2021 at 03:11:51PM +0100, Hongyan Xia wrote:
> From: Hongyan Xia <hongyxia@amazon.com>
> 
> We have observed a couple of cases where after a successful kexec, the
> crash kernel loaded in the 2nd kernel will run out of memory and
> crash. We narrowed down to two issues:
> 
> 1. when preparing the memory map, kexec excludes the Interrupt Vector
>    Table. However, the end address of IVT is incorrect.
> 2. The wrong end address of IVT is not 1KiB aligned. When preparing the
>    crashkernel, the memory map will reject unaligned memory chunks. On
>    many x86 platforms this means the entire bottom 1MiB range is
>    excluded from the crashkernel memory map, resulting in OOM when the
>    crashkernel boots.
> 
> Patch 1 fixes 1 which is actually enough to eliminate the issue but we
> feel that such issue may happen again (e.g., with a weird BIOS that has
> unaligned e820 map), so we also have patch 2 to improve the handling of
> unaligned memory.
> 
> Hongyan Xia (2):
>   Fix where the real mode interrupt vector ends
>   Shrink segments to fit alignment instead of throwing them away

Thanks, series applied.
