linux-kernel.vger.kernel.org archive mirror
* [PATCH 1/2] kdump/x86: crashkernel=X try to reserve below 896M first then below 4G and MAXMEM
@ 2018-04-27  9:00 dyoung
  2018-04-27  9:00 ` [PATCH 2/2] kdump: round up the total memory size to 128M for crashkernel reservation dyoung
  2018-04-27  9:14 ` [PATCH 1/2] kdump/x86: crashkernel=X try to reserve below 896M first then below 4G and MAXMEM Dave Young
  0 siblings, 2 replies; 6+ messages in thread
From: dyoung @ 2018-04-27  9:00 UTC
  To: kexec, linux-kernel; +Cc: bhe, yinghai, akpm, dyoung, vgoyal

[-- Attachment #1: x86-kdump-crashkernel-X-try-to-reserve-below-896M-fi.patch --]
[-- Type: text/plain, Size: 2057 bytes --]

Currently crashkernel=X fails if there is not enough memory in the low
region (below 896M) when trying to reserve a large amount of memory.  One
can use crashkernel=xM,high to reserve it in the high region (>4G), but it
is more convenient to improve crashkernel=X to:

 - First try to reserve X below 896M (for compatibility with old
   kexec-tools).
 - If that fails, try to reserve X below 4G (swiotlb needs to stay
   below 4G).
 - If that fails, try to reserve X from MAXMEM top down.

It's more transparent and user-friendly.
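
The three-step fallback above can be sketched in user space as follows.
This is only an illustration under stated assumptions, not the kernel
code: struct sketch_range, sketch_map, sketch_find_in_range() and
sketch_reserve() are hypothetical names, the free-memory map is invented,
and the 16M alignment and 64T upper bound merely stand in for CRASH_ALIGN
and MAXMEM.  Like memblock_find_in_range(), the helper returns 0 when no
suitable area is found.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical free physical range [start, end); not a kernel type. */
struct sketch_range {
	uint64_t start;
	uint64_t end;
};

#define SKETCH_ALIGN	(16ULL << 20)	/* assumed alignment (power of two) */
#define SKETCH_896M	(896ULL << 20)
#define SKETCH_4G	(1ULL << 32)
#define SKETCH_MAXMEM	(64ULL << 40)	/* assumed stand-in for MAXMEM */

/* Sample map, invented for illustration: 16M-256M, 1G-3G and 4G-8G free. */
static const struct sketch_range sketch_map[] = {
	{ 16ULL << 20, 256ULL << 20 },
	{ 1ULL << 30, 3ULL << 30 },
	{ 4ULL << 30, 8ULL << 30 },
};

/*
 * Find the highest aligned base such that [base, base + size) fits into
 * a free range clamped to [lo, hi).  Returns 0 when nothing fits.
 */
uint64_t sketch_find_in_range(uint64_t lo, uint64_t hi, uint64_t size)
{
	uint64_t best = 0;
	size_t i;

	for (i = 0; i < sizeof(sketch_map) / sizeof(sketch_map[0]); i++) {
		uint64_t s = sketch_map[i].start > lo ? sketch_map[i].start : lo;
		uint64_t e = sketch_map[i].end < hi ? sketch_map[i].end : hi;
		uint64_t base;

		if (e <= s || e - s < size)
			continue;
		base = (e - size) & ~(SKETCH_ALIGN - 1);	/* align down */
		if (base >= s && base > best)
			best = base;
	}
	return best;
}

/* Try below 896M first, then below 4G, then up to the assumed MAXMEM. */
uint64_t sketch_reserve(uint64_t size)
{
	static const uint64_t limit[] = {
		SKETCH_896M, SKETCH_4G, SKETCH_MAXMEM,
	};
	size_t i;

	for (i = 0; i < sizeof(limit) / sizeof(limit[0]); i++) {
		uint64_t base = sketch_find_in_range(SKETCH_ALIGN, limit[i],
						     size);

		if (base)
			return base;
	}
	return 0;
}
```

With this sample map a 128M request is satisfied below 896M, a 512M
request falls through to the range below 4G, and a 3G request is only
satisfied above 4G.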

If crashkernel is large and the reserved region lies beyond 896M, old
kexec-tools is not compatible with the new kernel, because old kexec-tools
cannot load a kernel into the high memory region.  There was an old
discussion about this:
https://lkml.org/lkml/2013/10/15/601

But in my tests the behavior is actually consistent.  If an old kernel
fails to reserve memory in the low area, kdump does not work because no
memory is reserved.  With this patch, if the new kernel successfully
reserves memory in the high area, old kexec-tools still fails to load the
kdump kernel (tested with 2.0.2).  Either way kdump does not work with old
kexec-tools, so this is acceptable and there is no need to worry about
compatibility.

Signed-off-by: Dave Young <dyoung@redhat.com>
---
 arch/x86/kernel/setup.c |   16 ++++++++++++++++
 1 file changed, 16 insertions(+)

--- linux-x86.orig/arch/x86/kernel/setup.c
+++ linux-x86/arch/x86/kernel/setup.c
@@ -545,6 +545,22 @@ static void __init reserve_crashkernel(v
 						    high ? CRASH_ADDR_HIGH_MAX
 							 : CRASH_ADDR_LOW_MAX,
 						    crash_size, CRASH_ALIGN);
+#ifdef CONFIG_X86_64
+		/*
+		 * crashkernel=X reserve below 896M fails? Try below 4G
+		 */
+		if (!high && !crash_base)
+			crash_base = memblock_find_in_range(CRASH_ALIGN,
+						(1ULL << 32),
+						crash_size, CRASH_ALIGN);
+		/*
+		 * crashkernel=X reserve below 4G fails? Try MAXMEM
+		 */
+		if (!high && !crash_base)
+			crash_base = memblock_find_in_range(CRASH_ALIGN,
+						CRASH_ADDR_HIGH_MAX,
+						crash_size, CRASH_ALIGN);
+#endif
 		if (!crash_base) {
 			pr_info("crashkernel reservation failed - No suitable area found.\n");
 			return;


* [PATCH 2/2] kdump: round up the total memory size to 128M for crashkernel reservation
  2018-04-27  9:00 [PATCH 1/2] kdump/x86: crashkernel=X try to reserve below 896M first then below 4G and MAXMEM dyoung
@ 2018-04-27  9:00 ` dyoung
  2018-04-27  9:14 ` [PATCH 1/2] kdump/x86: crashkernel=X try to reserve below 896M first then below 4G and MAXMEM Dave Young
  1 sibling, 0 replies; 6+ messages in thread
From: dyoung @ 2018-04-27  9:00 UTC
  To: kexec, linux-kernel; +Cc: bhe, yinghai, akpm, dyoung, vgoyal

[-- Attachment #1: kdump-crashkernel-roundup-total-mem.patch --]
[-- Type: text/plain, Size: 2184 bytes --]

On a machine with a 2G memory module, the total memory size the kernel
sees is usually slightly less than 2G.  The main reason is that the
BIOS/firmware reserves some areas, so it does not export all memory as
usable to Linux.

Test results for the total_mem value on an x86 KVM guest with 2G of memory:
UEFI boot with OVMF: 0x7ef10000
Legacy boot: 0x7ff7cc00
According to my tests this is also a problem on arm64 systems booted with
UEFI.

For example, with crashkernel=1G-2G:128M on a machine with 1G of memory,
we get a total size of 1023M from firmware, which does not fall into
1G-2G, so no memory is reserved.  The user will never know that, and it is
hard to let the user know the exact total value the kernel sees.

One option is to use DMI/SMBIOS to get the physical memory size, but that
is not reliable either: according to Prarit, hardware vendors sometimes
get this wrong.  Thus round the total size up to a multiple of 128M to
work around this problem.
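
The arithmetic can be checked with a small user-space sketch.  It is an
illustration only: sketch_roundup() and sketch_in_range() are hypothetical
helpers that mirror the usual round-up-to-a-multiple computation and the
"total >= start && total < end" range match, and SKETCH_SZ_128M is a local
stand-in for SZ_128M.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_SZ_128M	0x08000000ULL	/* local stand-in for SZ_128M */

/* Round x up to the next multiple of y (y must be non-zero). */
uint64_t sketch_roundup(uint64_t x, uint64_t y)
{
	return ((x + y - 1) / y) * y;
}

/* Range match as used for crashkernel=<start>-<end>:<size> entries. */
bool sketch_in_range(uint64_t total, uint64_t start, uint64_t end)
{
	return total >= start && total < end;
}
```

Both guest values quoted above (0x7ef10000 and 0x7ff7cc00) round up to
0x80000000, i.e. 2G.  A total of 1023M does not match 1G-2G as is, but
rounds up to 1G and then matches.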

Signed-off-by: Dave Young <dyoung@redhat.com>
---
 kernel/crash_core.c |   14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

--- linux-x86.orig/kernel/crash_core.c
+++ linux-x86/kernel/crash_core.c
@@ -9,6 +9,7 @@
 #include <linux/crash_core.h>
 #include <linux/utsname.h>
 #include <linux/vmalloc.h>
+#include <linux/sizes.h>
 
 #include <asm/page.h>
 #include <asm/sections.h>
@@ -41,6 +42,15 @@ static int __init parse_crashkernel_mem(
 					unsigned long long *crash_base)
 {
 	char *cur = cmdline, *tmp;
+	unsigned long long total_mem = system_ram;
+
+	/*
+	 * Firmware sometimes reserves some memory regions for its own use,
+	 * so we get less than the actual system memory size.
+	 * Work around this by rounding the total size up to a multiple of
+	 * 128M, which is enough for most test cases.
+	 */
+	total_mem = roundup(total_mem, SZ_128M);
 
 	/* for each entry of the comma-separated list */
 	do {
@@ -85,13 +95,13 @@ static int __init parse_crashkernel_mem(
 			return -EINVAL;
 		}
 		cur = tmp;
-		if (size >= system_ram) {
+		if (size >= total_mem) {
 			pr_warn("crashkernel: invalid size\n");
 			return -EINVAL;
 		}
 
 		/* match ? */
-		if (system_ram >= start && system_ram < end) {
+		if (total_mem >= start && total_mem < end) {
 			*crash_size = size;
 			break;
 		}


* Re: [PATCH 1/2] kdump/x86: crashkernel=X try to reserve below 896M first then below 4G and MAXMEM
  2018-04-27  9:00 [PATCH 1/2] kdump/x86: crashkernel=X try to reserve below 896M first then below 4G and MAXMEM dyoung
  2018-04-27  9:00 ` [PATCH 2/2] kdump: round up the total memory size to 128M for crashkernel reservation dyoung
@ 2018-04-27  9:14 ` Dave Young
  2018-04-27 23:28   ` Baoquan He
  2018-05-07  2:48   ` Dave Young
  1 sibling, 2 replies; 6+ messages in thread
From: Dave Young @ 2018-04-27  9:14 UTC
  To: kexec, linux-kernel; +Cc: bhe, yinghai, akpm, vgoyal

Hi,
 
This is a resend of the patches below:
http://lists.infradead.org/pipermail/kexec/2017-October/019569.html
 
I dropped the original patch 1 since Baoquan was not happy with it.
For patch 2 (the 1st patch in this series), Baoquan suggested an
improvement: creating a generic memblock iteration function.
But nobody has time to work on it for the time being.  According to an
offline discussion with him, that can be done in the future if someone
is interested.  We can go with the current kdump-only fixes.
 
Other than the above, the patches are the same.
 
Thanks
Dave


* Re: [PATCH 1/2] kdump/x86: crashkernel=X try to reserve below 896M first then below 4G and MAXMEM
  2018-04-27  9:14 ` [PATCH 1/2] kdump/x86: crashkernel=X try to reserve below 896M first then below 4G and MAXMEM Dave Young
@ 2018-04-27 23:28   ` Baoquan He
  2018-05-07  2:48   ` Dave Young
  1 sibling, 0 replies; 6+ messages in thread
From: Baoquan He @ 2018-04-27 23:28 UTC
  To: Dave Young; +Cc: kexec, linux-kernel, akpm, yinghai, vgoyal

On 04/27/18 at 05:14pm, Dave Young wrote:
> Hi,
>  
> This is a resend of below patches:
> http://lists.infradead.org/pipermail/kexec/2017-October/019569.html
>  
> I dropped the original patch 1 since Baoquan is not happy with it.
> For patch 2 (the 1st patch in this series), there is some improvement
> comment from Baoquan to create some generic memblock iteration function.
> But nobody has time to work on it for the time being.  According to
> offline discussion with him.  That can be done in the future if someone
> is interested.  We can go with the current kdump only fixes.
>  
> Other than above,  the patches are just same.

Thanks for working on this. Looks good to me.

ACK

Acked-by: Baoquan He <bhe@redhat.com>

Thanks
Baoquan



* Re: [PATCH 1/2] kdump/x86: crashkernel=X try to reserve below 896M first then below 4G and MAXMEM
  2018-04-27  9:14 ` [PATCH 1/2] kdump/x86: crashkernel=X try to reserve below 896M first then below 4G and MAXMEM Dave Young
  2018-04-27 23:28   ` Baoquan He
@ 2018-05-07  2:48   ` Dave Young
  2018-10-31  6:16     ` Pingfan Liu
  1 sibling, 1 reply; 6+ messages in thread
From: Dave Young @ 2018-05-07  2:48 UTC
  To: akpm; +Cc: bhe, yinghai, akpm, vgoyal, kexec, linux-kernel

On 04/27/18 at 05:14pm, Dave Young wrote:
> Hi,
>  
> This is a resend of below patches:
> http://lists.infradead.org/pipermail/kexec/2017-October/019569.html
>  
> I dropped the original patch 1 since Baoquan is not happy with it.
> For patch 2 (the 1st patch in this series), there is some improvement
> comment from Baoquan to create some generic memblock iteration function.
> But nobody has time to work on it for the time being.  According to
> offline discussion with him.  That can be done in the future if someone
> is interested.  We can go with the current kdump only fixes.
>  
> Other than above,  the patches are just same.

Hi Andrew, do you have any concerns about the patches?  They have been
used for a long time in the Red Hat kernel.  Since nobody has objected to
them, could you pick them up if there are no other concerns?

Thanks
Dave


* Re: [PATCH 1/2] kdump/x86: crashkernel=X try to reserve below 896M first then below 4G and MAXMEM
  2018-05-07  2:48   ` Dave Young
@ 2018-10-31  6:16     ` Pingfan Liu
  0 siblings, 0 replies; 6+ messages in thread
From: Pingfan Liu @ 2018-10-31  6:16 UTC
  To: Dave Young
  Cc: Andrew Morton, Baoquan He, yinghai, vgoyal, kexec, linux-kernel

Hi, I encountered a case with crashkernel=384M and KASLR enabled.
During testing, the system sometimes fails to reserve a region for the
crash kernel, although there is plenty of free space above 896MB.  This is
caused by the KASLR kernel truncating the candidate region.  It confuses
end users that crashkernel=X sometimes works and sometimes fails.
So can we have this patch to fix the issue?

Thanks,
Pingfan
On Mon, May 7, 2018 at 10:49 AM Dave Young <dyoung@redhat.com> wrote:
>
> On 04/27/18 at 05:14pm, Dave Young wrote:
> > Hi,
> >
> > This is a resend of below patches:
> > http://lists.infradead.org/pipermail/kexec/2017-October/019569.html
> >
> > I dropped the original patch 1 since Baoquan is not happy with it.
> > For patch 2 (the 1st patch in this series), there is some improvement
> > comment from Baoquan to create some generic memblock iteration function.
> > But nobody has time to work on it for the time being.  According to
> > offline discussion with him.  That can be done in the future if someone
> > is interested.  We can go with the current kdump only fixes.
> >
> > Other than above,  the patches are just same.
>
> Hi Andrew, do you have concerns about the patches?  It has been used for
> long time in Red Hat kernel, since people do not object them, could you
> pick them if no other concerns?
>
> Thanks
> Dave


Thread overview: 6+ messages
2018-04-27  9:00 [PATCH 1/2] kdump/x86: crashkernel=X try to reserve below 896M first then below 4G and MAXMEM dyoung
2018-04-27  9:00 ` [PATCH 2/2] kdump: round up the total memory size to 128M for crashkernel reservation dyoung
2018-04-27  9:14 ` [PATCH 1/2] kdump/x86: crashkernel=X try to reserve below 896M first then below 4G and MAXMEM Dave Young
2018-04-27 23:28   ` Baoquan He
2018-05-07  2:48   ` Dave Young
2018-10-31  6:16     ` Pingfan Liu
