* [PATCH 0/3] Make use of new memmap= kernel parameter syntax
@ 2013-01-22 15:02 Thomas Renninger
  2013-01-22 15:02 ` [PATCH 1/3] kexec: Split kernel_version() to also be able to pass a release string Thomas Renninger
                   ` (3 more replies)
  0 siblings, 4 replies; 35+ messages in thread
From: Thomas Renninger @ 2013-01-22 15:02 UTC (permalink / raw)
  To: horms; +Cc: yinghai, x86, kexec, vgoyal, hpa

Details were discussed on several lists (kexec, lkml, x86, etc.)
in a kernel thread with the subject:
[PATCH] x86 e820: only void usable memory areas in memmap=exactmap case

The patch that lets memmap= take several arguments is already queued
in the mm2 branch of linux-x86-tip.

There have been no objections to the introduction of memmap=resetusablemap
for several days after the details were discussed. The reposted patches
(I will send them now) will probably show up in the x86-tip kernel
tree soon as well.
The target for this feature to reach Linus' tree is 3.9-rc1.
Therefore these patches check for 3.9.0 and newer kernels. If such a
kernel is loaded via kexec -p, the new memmap= syntax is used.
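
The version gate described above can be sketched roughly as follows. This
is only an illustration, not the patch itself: KERNEL_VERSION() packs the
version the same way as the macro in kexec/kexec.h, while
wants_new_memmap_syntax() is a hypothetical helper name used here for
clarity.

```c
#include <assert.h>

/* One byte per component, as in kexec/kexec.h. */
#define KERNEL_VERSION(major, minor, patch) \
	(((major) << 16) | ((minor) << 8) | (patch))

/*
 * Hypothetical helper: returns non-zero if the kernel that is about to be
 * loaded is new enough (3.9.0 or later) to understand the new memmap= syntax.
 */
static int wants_new_memmap_syntax(long kv_to_load)
{
	assert(kv_to_load >= 0);	/* negative values signal a parse error */
	return kv_to_load >= KERNEL_VERSION(3, 9, 0);
}
```

Because each component occupies its own byte, a plain integer comparison
orders versions correctly as long as every component stays below 256.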

Please consider applying these to the kexec-tools repository.

Thanks,

    Thomas


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


* [PATCH 1/3] kexec: Split kernel_version() to also be able to pass a release string
  2013-01-22 15:02 [PATCH 0/3] Make use of new memmap= kernel parameter syntax Thomas Renninger
@ 2013-01-22 15:02 ` Thomas Renninger
  2013-01-22 15:02 ` [PATCH 2/3] kexec x86: Extract kernel version and convert it to KERNEL_VERSION() style Thomas Renninger
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 35+ messages in thread
From: Thomas Renninger @ 2013-01-22 15:02 UTC (permalink / raw)
  To: horms; +Cc: x86, kexec, hpa, yinghai, Thomas Renninger, vgoyal

No functional change, later patches need this.

Signed-off-by: Thomas Renninger <trenn@suse.de>
---
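
As a reading aid, here is a compact sketch of what parsing a release string
into KERNEL_VERSION() form amounts to. It is not the code from this patch:
parse_release() is a hypothetical name, and the checks are slightly
stricter than in the diff below (a component without digits is rejected).

```c
#include <assert.h>
#include <stdlib.h>

#define KERNEL_VERSION(major, minor, patch) \
	(((major) << 16) | ((minor) << 8) | (patch))

/*
 * Hypothetical stand-alone parser: accepts "major.minor.patch[suffix]",
 * e.g. "3.8.0-rc5-default+", and returns the packed version or -1.
 */
static long parse_release(const char *release)
{
	unsigned long v[3];
	char *end;
	int i;

	assert(release != NULL);
	for (i = 0; i < 3; i++) {
		v[i] = strtoul(release, &end, 10);
		if (end == release || v[i] >= 256)
			return -1;	/* no digits, or component too large */
		if (i < 2 && *end != '.')
			return -1;	/* components must be dot-separated */
		release = end + 1;	/* skip the separator */
	}
	return (long)KERNEL_VERSION(v[0], v[1], v[2]);
}
```

With such a parser, the string can come from uname() for the running
kernel or from any other source, which is the point of the split.
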
 kexec/arch/i386/crashdump-x86.c |    2 +-
 kexec/kernel_version.c          |   33 +++++++++++++++++++--------------
 kexec/kexec.h                   |    3 ++-
 3 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/kexec/arch/i386/crashdump-x86.c b/kexec/arch/i386/crashdump-x86.c
index 245402c..a900b03 100644
--- a/kexec/arch/i386/crashdump-x86.c
+++ b/kexec/arch/i386/crashdump-x86.c
@@ -63,7 +63,7 @@ static int get_kernel_page_offset(struct kexec_info *UNUSED(info),
 	int kv;
 
 	if (elf_info->machine == EM_X86_64) {
-		kv = kernel_version();
+		kv = kernel_version_running();
 		if (kv < 0)
 			return -1;
 
diff --git a/kexec/kernel_version.c b/kexec/kernel_version.c
index 079312b..e27a0b7 100644
--- a/kexec/kernel_version.c
+++ b/kexec/kernel_version.c
@@ -6,18 +6,15 @@
 #include <limits.h>
 #include <stdlib.h>
 
-long kernel_version(void)
+#define unsupported_release(str) \
+	fprintf(stderr, "Unsupported release string: %s\n", str);
+
+long kernel_version(char *release_str)
 {
-	struct utsname utsname;
 	unsigned long major, minor, patch;
 	char *p;
 
-	if (uname(&utsname) < 0) {
-		fprintf(stderr, "uname failed: %s\n", strerror(errno));
-		return -1;
-	}
-
-	p = utsname.release;
+	p = release_str;
 	major = strtoul(p, &p, 10);
 	if (major == ULONG_MAX) {
 		fprintf(stderr, "strtoul failed: %s\n", strerror(errno));
@@ -25,8 +22,7 @@ long kernel_version(void)
 	}
 
 	if (*p++ != '.') {
-		fprintf(stderr, "Unsupported utsname.release: %s\n",
-			utsname.release);
+		unsupported_release(release_str);
 		return -1;
 	}
 
@@ -37,8 +33,7 @@ long kernel_version(void)
 	}
 
 	if (*p++ != '.') {
-		fprintf(stderr, "Unsupported utsname.release: %s\n",
-			utsname.release);
+		unsupported_release(release_str);
 		return -1;
 	}
 
@@ -49,10 +44,20 @@ long kernel_version(void)
 	}
 
 	if (major >= 256 || minor >= 256 || patch >= 256) {
-		fprintf(stderr, "Unsupported utsname.release: %s\n",
-			utsname.release);
+		unsupported_release(release_str);
 		return -1;
 	}
 
 	return KERNEL_VERSION(major, minor, patch);
 }
+
+long kernel_version_running(void)
+{
+	struct utsname utsname;
+
+	if (uname(&utsname) < 0) {
+		fprintf(stderr, "uname failed: %s\n", strerror(errno));
+		return -1;
+	}
+	return kernel_version(utsname.release);
+}
diff --git a/kexec/kexec.h b/kexec/kexec.h
index 94c62c1..ecc75d9 100644
--- a/kexec/kexec.h
+++ b/kexec/kexec.h
@@ -154,7 +154,8 @@ long physical_arch(void);
 
 #define KERNEL_VERSION(major, minor, patch) \
 	(((major) << 16) | ((minor) << 8) | patch)
-long kernel_version(void);
+long kernel_version(char *release);
+long kernel_version_running(void);
 
 void usage(void);
 int get_memory_ranges(struct memory_range **range, int *ranges,
-- 
1.7.6.1



* [PATCH 2/3] kexec x86: Extract kernel version and convert it to KERNEL_VERSION() style
  2013-01-22 15:02 [PATCH 0/3] Make use of new memmap= kernel parameter syntax Thomas Renninger
  2013-01-22 15:02 ` [PATCH 1/3] kexec: Split kernel_version() to also be able to pass a release string Thomas Renninger
@ 2013-01-22 15:02 ` Thomas Renninger
  2013-01-22 15:02 ` [PATCH 3/3] kexec x86: Make kexec aware of new memmap= kernel parameter possibilities Thomas Renninger
  2013-01-30  4:31 ` [PATCH 0/3] Make use of new memmap= kernel parameter syntax Simon Horman
  3 siblings, 0 replies; 35+ messages in thread
From: Thomas Renninger @ 2013-01-22 15:02 UTC (permalink / raw)
  To: horms; +Cc: x86, kexec, hpa, yinghai, Thomas Renninger, vgoyal

The version is extracted from the kernel image that is going to be loaded.
This is needed if kexec wants to check for available features by kernel release
version.
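
A bounds-checked sketch of the extraction, for illustration only. It
assumes the x86 boot protocol layout, where the 16-bit kernel_version field
at offset 0x20E holds the offset of the version string minus 0x200;
bzimage_release() and the two constants are hypothetical names, not part of
kexec-tools.

```c
#include <assert.h>
#include <stddef.h>

#define KVER_FIELD_OFFSET 0x20E	/* 16-bit kernel_version field in the setup header */
#define KVER_STRING_BIAS  0x200	/* the field stores the string offset minus 0x200 */

/*
 * Hypothetical helper: copy the release part of the version string (up to
 * the first blank or NUL) from a bzImage into out. Returns 0 or -1.
 */
static int bzimage_release(const unsigned char *image, size_t len,
			   char *out, size_t outlen)
{
	size_t off, n;

	assert(image != NULL && out != NULL);
	if (outlen == 0 || len < KVER_FIELD_OFFSET + 2)
		return -1;
	/* The field is little-endian. */
	off = image[KVER_FIELD_OFFSET] |
	      ((size_t)image[KVER_FIELD_OFFSET + 1] << 8);
	if (off == 0)
		return -1;	/* header carries no version string */
	off += KVER_STRING_BIAS;
	if (off >= len)
		return -1;	/* pointer lies outside the image */

	/* Never read past the image or write past the output buffer. */
	for (n = 0; n + 1 < outlen && off + n < len; n++) {
		unsigned char c = image[off + n];

		if (c == '\0' || c == ' ')
			break;
		out[n] = (char)c;
	}
	out[n] = '\0';
	return 0;
}
```

The resulting string can then be handed to the release-string parser to
obtain a value comparable with KERNEL_VERSION().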

Signed-off-by: Thomas Renninger <trenn@suse.de>
---
 kexec/arch/i386/kexec-bzImage.c |   13 +++++++++++--
 kexec/arch/i386/kexec-x86.h     |    1 +
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/kexec/arch/i386/kexec-bzImage.c b/kexec/arch/i386/kexec-bzImage.c
index 0605909..0c8c6bc 100644
--- a/kexec/arch/i386/kexec-bzImage.c
+++ b/kexec/arch/i386/kexec-bzImage.c
@@ -40,6 +40,7 @@
 #include <arch/options.h>
 
 static const int probe_debug = 0;
+long kv_to_load;
 
 int bzImage_probe(const char *buf, off_t len)
 {
@@ -107,7 +108,7 @@ int do_bzImage_load(struct kexec_info *info,
 	struct x86_linux_header setup_header;
 	struct x86_linux_param_header *real_mode;
 	int setup_sects;
-	char *kernel_version;
+	char kernel_release[12];
 	size_t size;
 	int kern16_size;
 	unsigned long setup_base, setup_size;
@@ -131,7 +132,15 @@ int do_bzImage_load(struct kexec_info *info,
 	}
 
 	kern16_size = (setup_sects +1) *512;
-	kernel_version = ((char *)&setup_header) + 512 + setup_header.kver_addr;
+
+	memcpy(kernel_release, kernel +  setup_header.kver_addr + 512, 12);
+	kernel_release[11] = '\0';
+	kv_to_load = kernel_version(kernel_release);
+	if (kv_to_load < 0)
+		die("Invalid kernel version\n");
+	dbgprintf("Kernel release: %s in long format: 0x%lx\n",
+		  kernel_release, kv_to_load);
+
 	if (kernel_len < kern16_size) {
 		fprintf(stderr, "BzImage truncated?\n");
 		return -1;
diff --git a/kexec/arch/i386/kexec-x86.h b/kexec/arch/i386/kexec-x86.h
index 5aa2a46..db80879 100644
--- a/kexec/arch/i386/kexec-x86.h
+++ b/kexec/arch/i386/kexec-x86.h
@@ -11,6 +11,7 @@ enum coretype {
 
 extern unsigned char compat_x86_64[];
 extern uint32_t compat_x86_64_size, compat_x86_64_entry32;
+extern long kv_to_load;
 
 struct entry32_regs {
 	uint32_t eax;
-- 
1.7.6.1



* [PATCH 3/3] kexec x86: Make kexec aware of new memmap= kernel parameter possibilities
  2013-01-22 15:02 [PATCH 0/3] Make use of new memmap= kernel parameter syntax Thomas Renninger
  2013-01-22 15:02 ` [PATCH 1/3] kexec: Split kernel_version() to also be able to pass a release string Thomas Renninger
  2013-01-22 15:02 ` [PATCH 2/3] kexec x86: Extract kernel version and convert it to KERNEL_VERSION() style Thomas Renninger
@ 2013-01-22 15:02 ` Thomas Renninger
  2013-01-30  4:31 ` [PATCH 0/3] Make use of new memmap= kernel parameter syntax Simon Horman
  3 siblings, 0 replies; 35+ messages in thread
From: Thomas Renninger @ 2013-01-22 15:02 UTC (permalink / raw)
  To: horms; +Cc: x86, kexec, hpa, yinghai, Thomas Renninger, vgoyal

Recent kernels (3.9 and newer) are aware of
memmap=resetusablemap (instead of memmap=exactmap).
In this case the kdump kernel takes the unusable memory areas (reserved, ACPI,
NVS) from the e820 table that kexec passes. The table is built in
kexec/firmware_memmap.c (collected from /sys/firmware/memmap of the original
kernel).
Therefore reserved memory areas no longer need to be passed via memmap=x#y or
memmap=x$y.

From this kernel version on, it is also possible to pass several memmap=
options via one comma-separated memmap=resetusablemap,x@y,... kernel parameter.
This patch enables kexec to make use of it.
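
To make the difference between the two syntaxes concrete, here is a small
stand-alone sketch. It is not the code from the diff below (which uses
strcpy/strcat and skips ranges below a minimum size): struct usable_range
and build_memmap() are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical type: one usable range, in KiB. */
struct usable_range {
	unsigned long start_k;
	unsigned long size_k;
};

/*
 * Hypothetical helper: write the memmap= options for the usable ranges into
 * buf, in the old (space-separated) or new (comma-separated) syntax.
 * Returns 0 on success, -1 if buf is too small.
 */
static int build_memmap(char *buf, size_t buflen,
			const struct usable_range *r, int nr, int new_syntax)
{
	const char *head = new_syntax ? " memmap=resetusablemap"
				      : " memmap=exactmap";
	size_t used = strlen(head);
	int i, n;

	assert(buf != NULL && (r != NULL || nr == 0));
	if (used >= buflen)
		return -1;
	memcpy(buf, head, used + 1);

	for (i = 0; i < nr; i++) {
		n = snprintf(buf + used, buflen - used, "%s%luK@%luK",
			     new_syntax ? "," : " memmap=",
			     r[i].size_k, r[i].start_k);
		if (n < 0 || (size_t)n >= buflen - used)
			return -1;	/* would overflow the command line */
		used += (size_t)n;
	}
	return 0;
}
```

For the ranges 559K@64K and 261560K@638976K, the new syntax yields a single
parameter, memmap=resetusablemap,559K@64K,261560K@638976K, where the old
syntax needs three separate memmap= options.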

Signed-off-by: Thomas Renninger <trenn@suse.de>
---
 kexec/arch/i386/crashdump-x86.c |   46 ++++++++++++++++++++++++++++----------
 1 files changed, 34 insertions(+), 12 deletions(-)

diff --git a/kexec/arch/i386/crashdump-x86.c b/kexec/arch/i386/crashdump-x86.c
index a900b03..9bba8d1 100644
--- a/kexec/arch/i386/crashdump-x86.c
+++ b/kexec/arch/i386/crashdump-x86.c
@@ -676,6 +676,8 @@ static void ultoa(unsigned long i, char *str)
 	}
 }
 
+static int new_memmap_syntax;
+
 /* Adds the appropriate memmap= options to command line, indicating the
  * memory regions the new kernel can use to boot into. */
 static int cmdline_add_memmap(char *cmdline, struct memory_range *memmap_p)
@@ -684,8 +686,23 @@ static int cmdline_add_memmap(char *cmdline, struct memory_range *memmap_p)
 	unsigned long min_sizek = 100;
 	char str_mmap[256], str_tmp[20];
 
-	/* Exact map */
-	strcpy(str_mmap, " memmap=exactmap");
+	/*
+	 * kernel is aware of comma separated memmap=
+	 * and memmap=resetusablemap boot param, reserved memory
+	 * areas are taken over from original e820 boot map and must
+	 * not be passed anymore, only usable memory areas:
+	 * old syntax: memmap=exactmap memmap=x@y memmap=w@z memmap=k#f
+	 * new syntax: memmap=resetusablemap,x@y,w@z
+	 */
+	if (kv_to_load >= KERNEL_VERSION(3, 9, 0))
+		new_memmap_syntax = 1;
+
+	if (new_memmap_syntax)
+		strcpy(str_mmap, " memmap=resetusablemap");
+	else
+		/* Exact map */
+		strcpy(str_mmap, " memmap=exactmap");
+
 	len = strlen(str_mmap);
 	cmdlen = strlen(cmdline) + len;
 	if (cmdlen > (COMMAND_LINE_SIZE - 1))
@@ -704,7 +721,10 @@ static int cmdline_add_memmap(char *cmdline, struct memory_range *memmap_p)
 		 * up precious command line length. */
 		if ((endk - startk) < min_sizek)
 			continue;
-		strcpy (str_mmap, " memmap=");
+		if (new_memmap_syntax)
+			strcpy (str_mmap, ",");
+		else
+			strcpy (str_mmap, " memmap=");
 		ultoa((endk-startk), str_tmp);
 		strcat (str_mmap, str_tmp);
 		strcat (str_mmap, "K@");
@@ -1035,15 +1055,17 @@ int load_crashdump_segments(struct kexec_info *info, char* mod_cmdline,
 	cmdline_add_efi(mod_cmdline);
 	cmdline_add_elfcorehdr(mod_cmdline, elfcorehdr);
 
-	/* Inform second kernel about the presence of ACPI tables. */
-	for (i = 0; i < CRASH_MAX_MEMORY_RANGES; i++) {
-		unsigned long start, end;
-		if ( !( mem_range[i].type == RANGE_ACPI
-			|| mem_range[i].type == RANGE_ACPI_NVS) )
-			continue;
-		start = mem_range[i].start;
-		end = mem_range[i].end;
-		cmdline_add_memmap_acpi(mod_cmdline, start, end);
+	if (!new_memmap_syntax) {
+		/* Inform second kernel about the presence of ACPI tables. */
+		for (i = 0; i < CRASH_MAX_MEMORY_RANGES; i++) {
+			unsigned long start, end;
+			if ( !( mem_range[i].type == RANGE_ACPI
+				|| mem_range[i].type == RANGE_ACPI_NVS) )
+				continue;
+			start = mem_range[i].start;
+			end = mem_range[i].end;
+			cmdline_add_memmap_acpi(mod_cmdline, start, end);
+		}
 	}
 	return 0;
 }
-- 
1.7.6.1



* Re: [PATCH 0/3] Make use of new memmap= kernel parameter syntax
  2013-01-22 15:02 [PATCH 0/3] Make use of new memmap= kernel parameter syntax Thomas Renninger
                   ` (2 preceding siblings ...)
  2013-01-22 15:02 ` [PATCH 3/3] kexec x86: Make kexec aware of new memmap= kernel parameter possibilities Thomas Renninger
@ 2013-01-30  4:31 ` Simon Horman
  2013-01-30  5:40   ` H. Peter Anvin
  3 siblings, 1 reply; 35+ messages in thread
From: Simon Horman @ 2013-01-30  4:31 UTC (permalink / raw)
  To: Thomas Renninger; +Cc: yinghai, x86, kexec, vgoyal, hpa

On Tue, Jan 22, 2013 at 04:02:12PM +0100, Thomas Renninger wrote:
> Details were discussed on quite some lists (kexec, lkml, x86, etc)
> in a kernel thread with subject:
> [PATCH] x86 e820: only void usable memory areas in memmap=exactmap case
> 
> The patch that memmap= can take several arguments is already queued
> in linux-x86-tip in the mm2 branch.
> 
> There have been no objections on the introduction to memmap=resetusablemap
> for several days after discussing things in detail. The repost patches
> (will do that now) probably will show up in x86-tip mainline kernel
> tree pretty soon as well.
> Target for this feature to show up in Linus' tree is 3.9-rc1.
> Therefore these patches check for 3.9.0 and newer kernels. If such a
> kernel is tried to get loaded via kexec -p, the new memmap= syntax is used.
> 
> Please consider to apply these to the kexec-tools repository.

Hi Thomas,

could I check that the status of these patches is still as you describe above?


* Re: [PATCH 0/3] Make use of new memmap= kernel parameter syntax
  2013-01-30  4:31 ` [PATCH 0/3] Make use of new memmap= kernel parameter syntax Simon Horman
@ 2013-01-30  5:40   ` H. Peter Anvin
  2013-01-30  5:52     ` Simon Horman
  2013-01-30 16:03     ` Thomas Renninger
  0 siblings, 2 replies; 35+ messages in thread
From: H. Peter Anvin @ 2013-01-30  5:40 UTC (permalink / raw)
  To: Simon Horman, Thomas Renninger; +Cc: yinghai, x86, kexec, vgoyal

The right thing to do as I have discussed with the people involved is to modify the memory map data structure to have a new memory type ID for memory which is to be dumped.  That eliminates the need to put all this info into the command line.

Simon Horman <horms@verge.net.au> wrote:

>On Tue, Jan 22, 2013 at 04:02:12PM +0100, Thomas Renninger wrote:
>> Details were discussed on quite some lists (kexec, lkml, x86, etc)
>> in a kernel thread with subject:
>> [PATCH] x86 e820: only void usable memory areas in memmap=exactmap case
>> 
>> The patch that memmap= can take several arguments is already queued
>> in linux-x86-tip in the mm2 branch.
>> 
>> There have been no objections on the introduction to memmap=resetusablemap
>> for several days after discussing things in detail. The repost patches
>> (will do that now) probably will show up in x86-tip mainline kernel
>> tree pretty soon as well.
>> Target for this feature to show up in Linus' tree is 3.9-rc1.
>> Therefore these patches check for 3.9.0 and newer kernels. If such a
>> kernel is tried to get loaded via kexec -p, the new memmap= syntax is used.
>> 
>> Please consider to apply these to the kexec-tools repository.
>
>Hi Thomas,
>
>could I check that the status of these patches is still as you describe above?


* Re: [PATCH 0/3] Make use of new memmap= kernel parameter syntax
  2013-01-30  5:40   ` H. Peter Anvin
@ 2013-01-30  5:52     ` Simon Horman
  2013-01-30 16:03     ` Thomas Renninger
  1 sibling, 0 replies; 35+ messages in thread
From: Simon Horman @ 2013-01-30  5:52 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: yinghai, x86, kexec, Thomas Renninger, vgoyal

On Tue, Jan 29, 2013 at 11:40:46PM -0600, H. Peter Anvin wrote:
> The right thing to do as I have discussed with the people involved is to
> modify the memory map data structure to have a new memory type ID for
> memory which is to be dumped.  That eliminates the need to put all this
> info into the command line.

Thanks. I apologise for not following the conversation more closely.


* Re: [PATCH 0/3] Make use of new memmap= kernel parameter syntax
  2013-01-30  5:40   ` H. Peter Anvin
  2013-01-30  5:52     ` Simon Horman
@ 2013-01-30 16:03     ` Thomas Renninger
  2013-01-30 16:06       ` [PATCH 1/3] x86 e820: Check for exactmap appearance when parsing first memmap option Thomas Renninger
                         ` (4 more replies)
  1 sibling, 5 replies; 35+ messages in thread
From: Thomas Renninger @ 2013-01-30 16:03 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: yinghai, Simon Horman, kexec, x86, vgoyal

On Wednesday, January 30, 2013 06:40:46 AM H. Peter Anvin wrote:
> The right thing to do as I have discussed with the people involved is to
> modify the memory map data structure to have a new memory type ID for
> memory which is to be dumped.  That eliminates the need to put all this
> info into the command line.
As said before, after looking closely into this I am convinced that
passing a kexec-tools-modified e820 table makes things unnecessarily complex.

So I suggest still passing the original e820 table; kexec-tools then
only needs to pass the usable memory in which the kdump kernel
resides, via memmap=X@Y.
This:
  - heavily cleans up the unnecessary passing of reserved memory via memmap=
  - still provides a clean way of passing a valid e820 table through boot
    structures (no passing of e820 types made up by the Linux kernel)
  - fixes mmconf and possibly other bugs
  - provides the kdump kernel with all available info; it then knows about:
    the original e820 map (all reserved, usable, etc. as specified by the BIOS),
    the kdump kernel memory area, and the former (crashed kernel) usable memory.
  - keeps complexity as low as possible and in one place, and does not
    involve kexec-tools as another error source (passing a badly
    mangled e820 table, or not being able to consider things the kernel
    can when mangling).

That said, I will reply to this mail with two kernel patches
which implement this.
The 3 kexec-tools patches would be more or less the same; only the
string memmap=resetusablemap has to be replaced with
memmap=kdump_reserve_usable.
I will wait until the kernel patches are in x86-tip and queued for
3.9 and will then resend them.

Please consider applying the two kernel patches already.
But I can also resend them in a new thread.

   Thomas

Ah yes, here are the test results:
Kdump specific boot parameter appends before:
---------------------------------------------
memmap=exactmap memmap=559K@64K memmap=261560K@638976K elfcorehdr=900536K memmap=488K#3034988K memmap=3076K#3067744K memmap=2048K#3106804K memmap=676K#3110812K memmap=4K#3111488K memmap=476K#3111492K memmap=4K#3111968K memmap=8K#3111972K memmap=4K#3111980K memmap=4K#3111984K memmap=92K#3111988K memmap=564K#3112080K

Kdump specific boot parameter appends now:
------------------------------------------
memmap=kdump_reserve_usable,559K@64K,261560K@638976K elfcorehdr=900536K


Important parts of the serial console output of the modified kdump kernel:
--------------------------------------------------------------------------
[691036.954392] RIP  [<ffffffff812f368d>] sysrq_handle_crash+0xd/0x20
[691036.968140]  RSP <ffff88042473de90>
[691036.976113] CR2: 0000000000000000
[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Linux version 3.8.0-rc5-default+ (trenn@ett) (gcc version 4.5.1 20101208 [gcc-4_5-branch revision 167585] (SUSE Linux) ) #4 SMP Wed Jan 30 15:45:36 CET 2013
[    0.000000] Command line: root=/dev/disk/by-id/ata-Hitachi_HDS721016CLA382_JPAB40HM2KUK6B-part6 console=tty0 console=ttyS0,57600 sysrq_always_enabled panic=100 ignore_loglevel resume=/dev/disk/by-id/ata-Hitachi_HDS721016CLA382_JPAB40HM2KUK6B-part2 apic=verbose debug vga=normal elevator=deadline sysrq=yes reset_devices irqpoll maxcpus=1   memmap=kdump_reserve_usable,559K@64K,261560K@638976K elfcorehdr=900536K
[    0.000000] e820: BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000100-0x000000000009bbff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009bc00-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000b93dafff] usable
[    0.000000] BIOS-e820: [mem 0x00000000b93db000-0x00000000b9454fff] ACPI data
[    0.000000] BIOS-e820: [mem 0x00000000b9455000-0x00000000bb155fff] usable
[    0.000000] BIOS-e820: [mem 0x00000000bb156000-0x00000000bb166fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000bb167000-0x00000000bb3d7fff] usable
[    0.000000] BIOS-e820: [mem 0x00000000bb3d8000-0x00000000bb6d8fff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x00000000bb6d9000-0x00000000bd9fcfff] usable
[    0.000000] BIOS-e820: [mem 0x00000000bd9fd000-0x00000000bdbfcfff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x00000000bdbfd000-0x00000000bdcdcfff] usable
[    0.000000] BIOS-e820: [mem 0x00000000bdcdd000-0x00000000bdde6fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000bdde7000-0x00000000bde8ffff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x00000000bde90000-0x00000000bde90fff] ACPI data
[    0.000000] BIOS-e820: [mem 0x00000000bde91000-0x00000000bdf07fff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x00000000bdf08000-0x00000000bdf08fff] ACPI data
[    0.000000] BIOS-e820: [mem 0x00000000bdf09000-0x00000000bdf0afff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x00000000bdf0b000-0x00000000bdf0bfff] ACPI data
[    0.000000] BIOS-e820: [mem 0x00000000bdf0c000-0x00000000bdf0cfff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x00000000bdf0d000-0x00000000bdf23fff] ACPI data
[    0.000000] BIOS-e820: [mem 0x00000000bdf24000-0x00000000bdfb0fff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x00000000bdfb1000-0x00000000bdffffff] usable
[    0.000000] BIOS-e820: [mem 0x00000000be000000-0x00000000cfffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fed19000-0x00000000fed19fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000ffa20000-0x00000000ffffffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000083fffffff] usable
[    0.000000] debug: ignoring loglevel setting.
[    0.000000] e820: last_pfn = 0x840000 max_arch_pfn = 0x400000000
[    0.000000] e820: update [mem 0x00000000-0xfffffffffffffffe] usable ==> kdump reserved
[    0.000000] e820: remove [mem 0x00010000-0x0009bbff] 
[    0.000000] e820: remove [mem 0x27000000-0x36f6dfff] 
[    0.000000] NX (Execute Disable) protection: active
[    0.000000] e820: user-defined physical RAM map:
[    0.000000] user: [mem 0x0000000000000100-0x000000000000ffff] kdump reserved
[    0.000000] user: [mem 0x0000000000010000-0x000000000009bbff] usable
[    0.000000] user: [mem 0x000000000009bc00-0x000000000009ffff] reserved
[    0.000000] user: [mem 0x00000000000e0000-0x00000000000fffff] reserved
[    0.000000] user: [mem 0x0000000000100000-0x0000000026ffffff] kdump reserved
[    0.000000] user: [mem 0x0000000027000000-0x0000000036f6dfff] usable
[    0.000000] user: [mem 0x0000000036f6e000-0x00000000b93dafff] kdump reserved
[    0.000000] user: [mem 0x00000000b93db000-0x00000000b9454fff] ACPI data
[    0.000000] user: [mem 0x00000000b9455000-0x00000000bb155fff] kdump reserved
[    0.000000] user: [mem 0x00000000bb156000-0x00000000bb166fff] reserved
[    0.000000] user: [mem 0x00000000bb167000-0x00000000bb3d7fff] kdump reserved
[    0.000000] user: [mem 0x00000000bb3d8000-0x00000000bb6d8fff] ACPI NVS
[    0.000000] user: [mem 0x00000000bb6d9000-0x00000000bd9fcfff] kdump reserved
[    0.000000] user: [mem 0x00000000bd9fd000-0x00000000bdbfcfff] ACPI NVS
[    0.000000] user: [mem 0x00000000bdbfd000-0x00000000bdcdcfff] kdump reserved
[    0.000000] user: [mem 0x00000000bdcdd000-0x00000000bdde6fff] reserved
[    0.000000] user: [mem 0x00000000bdde7000-0x00000000bde8ffff] ACPI NVS
[    0.000000] user: [mem 0x00000000bde90000-0x00000000bde90fff] ACPI data
[    0.000000] user: [mem 0x00000000bde91000-0x00000000bdf07fff] ACPI NVS
[    0.000000] user: [mem 0x00000000bdf08000-0x00000000bdf08fff] ACPI data
[    0.000000] user: [mem 0x00000000bdf09000-0x00000000bdf0afff] ACPI NVS
[    0.000000] user: [mem 0x00000000bdf0b000-0x00000000bdf0bfff] ACPI data
[    0.000000] user: [mem 0x00000000bdf0c000-0x00000000bdf0cfff] ACPI NVS
[    0.000000] user: [mem 0x00000000bdf0d000-0x00000000bdf23fff] ACPI data
[    0.000000] user: [mem 0x00000000bdf24000-0x00000000bdfb0fff] ACPI NVS
[    0.000000] user: [mem 0x00000000bdfb1000-0x00000000bdffffff] kdump reserved
[    0.000000] user: [mem 0x00000000be000000-0x00000000cfffffff] reserved
[    0.000000] user: [mem 0x00000000fec00000-0x00000000fec00fff] reserved
[    0.000000] user: [mem 0x00000000fed19000-0x00000000fed19fff] reserved
[    0.000000] user: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved
[    0.000000] user: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
[    0.000000] user: [mem 0x00000000ffa20000-0x00000000ffffffff] reserved
[    0.000000] user: [mem 0x0000000100000000-0x000000083fffffff] kdump reserved
[    0.000000] SMBIOS 2.6 present.
[    0.000000] DMI: Intel Corporation S2600CP/S2600CP, BIOS SE5C600.86B.99.99.x040.111920110024 11/19/2011
[    0.000000] e820: update [mem 0x00000000-0x0000ffff] usable ==> reserved
[    0.000000] e820: remove [mem 0x000a0000-0x000fffff] usable
[    0.000000] No AGP bridge found
...
[    2.220298] ACPI: bus type pci registered
[    2.229486] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0xc0000000-0xcfffffff] (base 0xc0000000)
[    2.250283] PCI: MMCONFIG at [mem 0xc0000000-0xcfffffff] reserved in E820
[    2.349583] PCI: Using configuration type 1 for base access
...
Copying data                       : [100 %]
The dumpfile is saved to /root/abuild/dumps/2013-01-30-15:52/vmcore.


* [PATCH 1/3] x86 e820: Check for exactmap appearance when parsing first memmap option
  2013-01-30 16:03     ` Thomas Renninger
@ 2013-01-30 16:06       ` Thomas Renninger
  2013-01-30 16:09         ` H. Peter Anvin
  2013-01-30 16:08       ` [PATCH 2/3] x86: Introduce Linux kernel specific E820_RESERVED_KDUMP e820 memory range type Thomas Renninger
                         ` (3 subsequent siblings)
  4 siblings, 1 reply; 35+ messages in thread
From: Thomas Renninger @ 2013-01-30 16:06 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: yinghai, Simon Horman, kexec, x86, vgoyal

From: Yinghai Lu <yinghai@kernel.org>

memmap=exactmap throws away not only all original e820 areas but also all
areas the user has defined up to that point (through other memmap= parameters).
That means all memmap= boot parameters passed before a memmap=exactmap
parameter are ignored.
Without this fix, given:
memmap=x@y memmap=exactmap memmap=i#k
only i#k would be recognized.

This is wrong. With this fix the original e820 areas are thrown away only once,
when memmap=exactmap is found anywhere in the boot command line, and before
any other memmap= option is parsed.
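
The two behaviours can be compared with a toy model. This is not kernel
code: surviving_ranges() and its prescan flag are hypothetical, and the
model only counts how many user-defined ranges remain after all memmap=
values have been processed in command-line order.

```c
#include <assert.h>
#include <string.h>

/*
 * Toy model: opts holds the memmap= values in command-line order.
 * prescan == 0: exactmap wipes everything seen so far (current behaviour).
 * prescan == 1: exactmap is applied once, before any other option
 *               (behaviour with this patch), so later hits are no-ops.
 */
static int surviving_ranges(const char *const *opts, int nr, int prescan)
{
	int i, kept = 0;

	assert(opts != NULL);
	for (i = 0; i < nr; i++) {
		if (strcmp(opts[i], "exactmap") == 0) {
			if (!prescan)
				kept = 0;	/* earlier user ranges are lost */
			continue;
		}
		kept++;
	}
	return kept;
}
```

For memmap=x@y memmap=exactmap memmap=i#k the model keeps one range without
the prescan and two ranges with it.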

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Reviewed-by: Thomas Renninger <trenn@suse.de>
---
 arch/x86/kernel/e820.c |   12 ++++++++++++
 1 file changed, 12 insertions(+)

Index: linux-2.6/arch/x86/kernel/e820.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/e820.c
+++ linux-2.6/arch/x86/kernel/e820.c
@@ -835,6 +835,8 @@ static int __init parse_memopt(char *p)
 }
 early_param("mem", parse_memopt);
 
+static bool __initdata exactmap_parsed;
+
 static int __init parse_memmap_one(char *p)
 {
 	char *oldp;
@@ -844,6 +846,10 @@ static int __init parse_memmap_one(char
 		return -EINVAL;
 
 	if (!strncmp(p, "exactmap", 8)) {
+		if (exactmap_parsed)
+			return 0;
+
+		exactmap_parsed = true;
 #ifdef CONFIG_CRASH_DUMP
 		/*
 		 * If we are doing a crash dump, we still need to know
@@ -879,6 +885,12 @@ static int __init parse_memmap_one(char
 }
 static int __init parse_memmap_opt(char *str)
 {
+	char *p = boot_command_line;
+
+	p = strstr(p, "exactmap");
+	if (p)
+		parse_memmap_one("exactmap");
+
 	while (str) {
 		char *k = strchr(str, ',');
 



* [PATCH 2/3] x86: Introduce Linux kernel specific E820_RESERVED_KDUMP e820 memory range type
  2013-01-30 16:03     ` Thomas Renninger
  2013-01-30 16:06       ` [PATCH 1/3] x86 e820: Check for exactmap appearance when parsing first memmap option Thomas Renninger
@ 2013-01-30 16:08       ` Thomas Renninger
  2013-01-30 16:10       ` [PATCH 3/3] x86 e820: Introduce memmap=kdump_reserve_usable for kdump usage Thomas Renninger
                         ` (2 subsequent siblings)
  4 siblings, 0 replies; 35+ messages in thread
From: Thomas Renninger @ 2013-01-30 16:08 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: yinghai, Simon Horman, kexec, x86, vgoyal

This functionality will be picked up by later memmap= boot
parameter options. Memory originally declared usable (E820_RAM, used by the
production kernel that crashed) will get converted to the new
E820_RESERVED_KDUMP type.
The memory area where the kdump kernel resides (passed via memmap=X@Y
by kexec-tools) will be set usable again.
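
The intended mapping can be expressed per address, as in the sketch below.
It is a simplification for illustration and not how the kernel implements
it (the kernel rewrites whole ranges of the e820 table): kdump_view_type()
is a hypothetical name, the first four type values are the standard e820
ones, and 129 is the value proposed by this patch.

```c
#include <assert.h>

#define E820_RAM		1
#define E820_RESERVED		2
#define E820_ACPI		3
#define E820_NVS		4
#define E820_RESERVED_KDUMP	129	/* value proposed by this patch */

/*
 * Hypothetical helper: the type the kdump kernel should see for an address,
 * given its type in the original e820 map and the inclusive range
 * [kd_start, kd_end] passed via memmap=X@Y.
 */
static unsigned int kdump_view_type(unsigned int orig_type,
				    unsigned long long addr,
				    unsigned long long kd_start,
				    unsigned long long kd_end)
{
	assert(kd_start <= kd_end);
	if (orig_type != E820_RAM)
		return orig_type;	/* reserved/ACPI areas pass through */
	if (addr >= kd_start && addr <= kd_end)
		return E820_RAM;	/* memory the kdump kernel may use */
	return E820_RESERVED_KDUMP;	/* RAM of the crashed kernel */
}
```

With memmap=...,261560K@638976K the usable window is
0x27000000-0x36f6dfff; usable RAM outside of it becomes "kdump reserved",
while ACPI and reserved areas keep their original type.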

Also includes a tiny whitespace-to-tab cleanup.

Signed-off-by: Thomas Renninger <trenn@suse.de>
---
 arch/x86/include/uapi/asm/e820.h |    9 ++++++++-
 arch/x86/kernel/e820.c           |   14 +++++++++-----
 2 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/uapi/asm/e820.h b/arch/x86/include/uapi/asm/e820.h
index bbae024..2171ca5 100644
--- a/arch/x86/include/uapi/asm/e820.h
+++ b/arch/x86/include/uapi/asm/e820.h
@@ -45,7 +45,14 @@
  * included in the S3 integrity calculation and so should not include
  * any memory that BIOS might alter over the S3 transition
  */
-#define E820_RESERVED_KERN        128
+#define E820_RESERVED_KERN	128
+
+/*
+ * Kdump kernel will use this type for formerly usable memory
+ * the crashed kernel used (and as defined by the original e820 map).
+ * It will then only set the memory area it resides in to usable memory
+ */
+#define E820_RESERVED_KDUMP	129
 
 #ifndef __ASSEMBLY__
 #include <linux/types.h>
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index d32abea..e3c5b7f 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -149,6 +149,9 @@ static void __init e820_print_type(u32 type)
 	case E820_UNUSABLE:
 		printk(KERN_CONT "unusable");
 		break;
+	case E820_RESERVED_KDUMP:
+		printk(KERN_CONT "kdump reserved");
+		break;
 	default:
 		printk(KERN_CONT "type %u", type);
 		break;
@@ -911,11 +914,12 @@ static inline const char *e820_type_to_string(int e820_type)
 {
 	switch (e820_type) {
 	case E820_RESERVED_KERN:
-	case E820_RAM:	return "System RAM";
-	case E820_ACPI:	return "ACPI Tables";
-	case E820_NVS:	return "ACPI Non-volatile Storage";
-	case E820_UNUSABLE:	return "Unusable memory";
-	default:	return "reserved";
+	case E820_RAM:			return "System RAM";
+	case E820_ACPI:			return "ACPI Tables";
+	case E820_NVS:			return "ACPI Non-volatile Storage";
+	case E820_UNUSABLE:		return "Unusable memory";
+	case E820_RESERVED_KDUMP:	return "Kdump reserved";
+	default:			return "reserved";
 	}
 }
 

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/3] x86 e820: Check for exactmap appearance when parsing first memmap option
  2013-01-30 16:06       ` [PATCH 1/3] x86 e820: Check for exactmap appearance when parsing first memmap option Thomas Renninger
@ 2013-01-30 16:09         ` H. Peter Anvin
  0 siblings, 0 replies; 35+ messages in thread
From: H. Peter Anvin @ 2013-01-30 16:09 UTC (permalink / raw)
  To: Thomas Renninger; +Cc: yinghai, Simon Horman, kexec, x86, vgoyal

Once again, what is "wrong" about this... the semantics are consistent, and breaking them when long established makes no sense.

Thomas Renninger <trenn@suse.de> wrote:

>From: Yinghai Lu <yinghai@kernel.org>
>
>memmap=exactmap will throw away all original, but also until then
>user defined (through other provided memmap= parameters) areas.
>That means all memmap= boot parameters passed before a memmap=exactmap
>parameter are not recognized.
>Without this fix:
>memmap=x@y memmap=exactmap memmap=i#k
>only i#k would get recognized.
>
>This is wrong, this fix will only throw away all original e820 areas
>once
>when memmap=exactmap is found in the whole boot command line and before
>any other memmap= option is parsed.
>
>Signed-off-by: Yinghai Lu <yinghai@kernel.org>
>Reviewed-by: Thomas Renninger <trenn@suse.de>
>---
> arch/x86/kernel/e820.c |   12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
>Index: linux-2.6/arch/x86/kernel/e820.c
>===================================================================
>--- linux-2.6.orig/arch/x86/kernel/e820.c
>+++ linux-2.6/arch/x86/kernel/e820.c
>@@ -835,6 +835,8 @@ static int __init parse_memopt(char *p)
> }
> early_param("mem", parse_memopt);
> 
>+static bool __initdata exactmap_parsed;
>+
> static int __init parse_memmap_one(char *p)
> {
> 	char *oldp;
>@@ -844,6 +846,10 @@ static int __init parse_memmap_one(char
> 		return -EINVAL;
> 
> 	if (!strncmp(p, "exactmap", 8)) {
>+		if (exactmap_parsed)
>+			return 0;
>+
>+		exactmap_parsed = true;
> #ifdef CONFIG_CRASH_DUMP
> 		/*
> 		 * If we are doing a crash dump, we still need to know
>@@ -879,6 +885,12 @@ static int __init parse_memmap_one(char
> }
> static int __init parse_memmap_opt(char *str)
> {
>+	char *p = boot_command_line;
>+
>+	p = strstr(p, "exactmap");
>+	if (p)
>+		parse_memmap_one("exactmap");
>+
> 	while (str) {
> 		char *k = strchr(str, ',');
> 
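
The ordering problem described in the quoted commit message can be illustrated with a toy model. This is a minimal sketch and not kernel code: struct toy_map and both toy_parse_* functions are hypothetical names, and the map only counts entries instead of storing address ranges.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the e820 table: it only counts entries. */
struct toy_map {
    size_t nr;          /* number of entries currently in the map */
    int exactmap_done;  /* set once exactmap has been handled     */
};

/* In-order semantics: exactmap clears whatever is in the map when reached. */
static void toy_parse_in_order(struct toy_map *m,
                               const char *const *opts, size_t n)
{
    assert(m != NULL && opts != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(opts[i], "exactmap") == 0)
            m->nr = 0;   /* drops firmware AND earlier user entries */
        else
            m->nr++;     /* any other option adds one user entry */
    }
}

/* Proposed semantics: handle exactmap once, before any other option. */
static void toy_parse_exactmap_first(struct toy_map *m,
                                     const char *const *opts, size_t n)
{
    assert(m != NULL && opts != NULL);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(opts[i], "exactmap") == 0 && !m->exactmap_done) {
            m->nr = 0;   /* drops only the firmware entries */
            m->exactmap_done = 1;
        }
    }
    for (size_t i = 0; i < n; i++)
        if (strcmp(opts[i], "exactmap") != 0)
            m->nr++;
}
```

Starting from five firmware entries and the options x@y, exactmap, i#k, the in-order variant ends with one entry (only i#k), while the exactmap-first variant keeps both user-defined ranges.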

-- 
Sent from my mobile phone. Please excuse brevity and lack of formatting.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH 3/3] x86 e820: Introduce memmap=kdump_reserve_usable for kdump usage
  2013-01-30 16:03     ` Thomas Renninger
  2013-01-30 16:06       ` [PATCH 1/3] x86 e820: Check for exactmap appearance when parsing first memmap option Thomas Renninger
  2013-01-30 16:08       ` [PATCH 2/3] x86: Introduce Linux kernel specific E820_RESERVED_KDUMP e820 memory range type Thomas Renninger
@ 2013-01-30 16:10       ` Thomas Renninger
  2013-01-30 16:10       ` [PATCH 0/3] Make use of new memmap= kernel parameter syntax H. Peter Anvin
  2013-01-30 16:13       ` [PATCH 0/3] Cleanup kdump memmap= passing and e820 usage Thomas Renninger
  4 siblings, 0 replies; 35+ messages in thread
From: Thomas Renninger @ 2013-01-30 16:10 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: yinghai, Simon Horman, kexec, x86, vgoyal

kdump voided the whole original e820 map and then only partially
rebuilt it via memmap= options passed in the kdump boot parameters.

But this is conceptually wrong. All original memory ranges that are
declared reserved, ACPI data/NVS, or otherwise not usable must stay
the same and be honored by the kdump kernel.

Therefore memmap=kdump_reserve_usable is introduced.
kdump passes this option, and the kernel converts originally usable
memory to the Linux kernel specific kdump reserved e820 type.
kdump passes the usable memory ranges in which the kdump kernel
resides via memmap=X@Y parameter(s).

This preserves all e820 information, so the kdump kernel can look
it up and be sure it is valid.
It also removes the unnecessary passing of reserved memory areas as
memmap= boot parameters by kexec-tools. This information is already
passed via boot structures.

This also fixes mmconf (extended PCI config access) and
possibly other kernel parts which rely on remapped memory being
in e820 areas declared reserved or ACPI (data/NVS).

Signed-off-by: Thomas Renninger <trenn@suse.de>
---
 Documentation/kernel-parameters.txt |    9 +++++++++
 arch/x86/kernel/e820.c              |   22 ++++++++++++++++++++--
 2 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index da0e077..f38375a 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1518,6 +1518,15 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			BIOS output or other requirements. See the memmap=nn@ss
 			option description.
 
+	memmap=kdump_reserve_usable
+			[KNL,X86] Convert usable memory areas into a special
+			Linux kernel specific kdump reserved type.
+			Usable memory areas the kdump kernel resides in have
+			to be passed via memmap=X@Y parameters which overrides
+			these areas again to be usable.
+			This boot parameter is intended for kdump usage.
+			Passing exactmap overrules kdump_reserve_usable.
+
 	memmap=nn[KMG]@ss[KMG]
 			[KNL] Force usage of a specific region of memory
 			Region of memory to be used, from ss to ss+nn.
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 7c6a72e..9c00bfa 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -839,6 +839,7 @@ static int __init parse_memopt(char *p)
 early_param("mem", parse_memopt);
 
 static bool __initdata exactmap_parsed;
+static bool __initdata kdump_res_parsed;
 
 static int __init parse_memmap_one(char *p)
 {
@@ -848,7 +849,8 @@ static int __init parse_memmap_one(char *p)
 	if (!p)
 		return -EINVAL;
 
-	if (!strncmp(p, "exactmap", 8)) {
+	if (!strncmp(p, "exactmap", 8) ||
+	    !strncmp(p, "kdump_reserve_usable", 20)) {
 		if (exactmap_parsed)
 			return 0;
 
@@ -861,7 +863,13 @@ static int __init parse_memmap_one(char *p)
 		 */
 		saved_max_pfn = e820_end_of_ram_pfn();
 #endif
-		e820.nr_map = 0;
+		if (!strncmp(p, "kdump_reserve_usable", 20)) {
+			/* remove all old E820_RAM ranges */
+			e820_update_range(0, ULLONG_MAX, E820_RAM,
+					  E820_RESERVED_KDUMP);
+			kdump_res_parsed = true;
+		} else
+			e820.nr_map = 0;
 		userdef = 1;
 		return 0;
 	}
@@ -874,6 +882,11 @@ static int __init parse_memmap_one(char *p)
 	userdef = 1;
 	if (*p == '@') {
 		start_at = memparse(p+1, &p);
+		if (kdump_res_parsed) {
+			/* Remove old reserved so new ram could take over. */
+			e820_remove_range(start_at, mem_size,
+					  E820_RESERVED_KDUMP, 0);
+		}
 		e820_add_region(start_at, mem_size, E820_RAM);
 	} else if (*p == '#') {
 		start_at = memparse(p+1, &p);
@@ -893,6 +906,11 @@ static int __init parse_memmap_opt(char *str)
 	p = strstr(p, "exactmap");
 	if (p)
 		parse_memmap_one("exactmap");
+	else {
+		p = strstr(boot_command_line, "kdump_reserve_usable");
+		if (p)
+			parse_memmap_one("kdump_reserve_usable");
+	}
 
 	while (str) {
 		char *k = strchr(str, ',');
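
The two steps this patch performs (retype all usable memory, then make the memmap=X@Y ranges usable again) can be sketched outside the kernel. This is a simplified model under stated assumptions: struct map, map_retype_all(), map_set_type() and map_type_at() are hypothetical names, ranges are half-open, the table is left unsorted, and only the part of X@Y that overlaps kdump reserved memory is retyped, whereas the patch removes the overlap and then adds the whole X@Y range as RAM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* T_KDUMP uses the value 129 proposed in this thread. */
enum { T_RAM = 1, T_RESERVED = 2, T_KDUMP = 129 };

#define MAX_RANGES 32

struct range { uint64_t start, end; uint32_t type; };  /* [start, end) */
struct map   { struct range r[MAX_RANGES]; size_t nr; };

/* Step 1: every range of type 'from' becomes type 'to'. */
static void map_retype_all(struct map *m, uint32_t from, uint32_t to)
{
    assert(m != NULL);
    for (size_t i = 0; i < m->nr; i++)
        if (m->r[i].type == from)
            m->r[i].type = to;
}

/*
 * Step 2: inside [start, start + size), turn memory of old_type into
 * new_type, splitting ranges where needed. Returns -1 if the table is full.
 */
static int map_set_type(struct map *m, uint64_t start, uint64_t size,
                        uint32_t old_type, uint32_t new_type)
{
    assert(m != NULL && size > 0 && start + size > start);
    uint64_t end = start + size;
    size_t n = m->nr;  /* pieces appended below need no second look */

    for (size_t i = 0; i < n; i++) {
        struct range *cur = &m->r[i];
        if (cur->type != old_type)
            continue;
        uint64_t lo = cur->start > start ? cur->start : start;
        uint64_t hi = cur->end < end ? cur->end : end;
        if (lo >= hi)
            continue;  /* no overlap with the requested window */

        struct range left  = { cur->start, lo, old_type };
        struct range right = { hi, cur->end, old_type };
        size_t extra = (left.start < left.end) + (right.start < right.end);
        if (m->nr + extra > MAX_RANGES)
            return -1;

        /* The overlapping middle piece takes the new type in place. */
        cur->start = lo;
        cur->end = hi;
        cur->type = new_type;
        if (left.start < left.end)
            m->r[m->nr++] = left;
        if (right.start < right.end)
            m->r[m->nr++] = right;
    }
    return 0;
}

/* Lookup helper: type of the range containing addr, or 0 if there is none. */
static uint32_t map_type_at(const struct map *m, uint64_t addr)
{
    assert(m != NULL);
    for (size_t i = 0; i < m->nr; i++)
        if (addr >= m->r[i].start && addr < m->r[i].end)
            return m->r[i].type;
    return 0;
}
```

Calling map_retype_all(&m, T_RAM, T_KDUMP) followed by one map_set_type(&m, start, size, T_KDUMP, T_RAM) per memmap=X@Y range mirrors the order in which the patch handles kdump_reserve_usable and the usable ranges.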

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/3] Make use of new memmap= kernel parameter syntax
  2013-01-30 16:03     ` Thomas Renninger
                         ` (2 preceding siblings ...)
  2013-01-30 16:10       ` [PATCH 3/3] x86 e820: Introduce memmap=kdump_reserve_usable for kdump usage Thomas Renninger
@ 2013-01-30 16:10       ` H. Peter Anvin
  2013-01-30 16:13       ` [PATCH 0/3] Cleanup kdump memmap= passing and e820 usage Thomas Renninger
  4 siblings, 0 replies; 35+ messages in thread
From: H. Peter Anvin @ 2013-01-30 16:10 UTC (permalink / raw)
  To: Thomas Renninger; +Cc: yinghai, Simon Horman, kexec, x86, vgoyal

I don't see why this is "unnecessarily complex" compared to this stuff...

Thomas Renninger <trenn@suse.de> wrote:

>On Wednesday, January 30, 2013 06:40:46 AM H. Peter Anvin wrote:
>> The right thing to do as I have discussed with the people involved is
>to
>> modify the memory map data structure to have a new memory type ID for
>> memory which is to be dumped.  That eliminates the need to put all
>this
>> info into the command line.
>As said before..., after looking closely into this I am convinced that
>passing a kexec-tools modified e820 table makes things unnecessary
>complex.
>
>So I suggest to still pass the original e820 table and kexec-tools
>only needs to pass the memory to use (usable) where the kdump kernel
>resides in via memmap=X@Y.
>This:
> - heavily cleans up the unnecesary reserved memory passing via memmap=
>- still provides a clean way of passing a valid e820 table through boot
>    structures (no Linux kernel made up e820 type passing)
>  - fixes mmconf and possible other bugs
>- provides the kdump kernel with all availabe info, it then knows
>about:
>    original e820 map (all reserved, usable, etc. as specified by BIOS)
>    kdump kernel memory area, former (crashed kernel) usable memory.
>  - Keeps complexity as low as possible and at one place and does not
>    involve kexec-tools as another error source (passing a badly
>    mangled e820 table or not being able to consider stuff the kernel
>    can when mangeling).
>
>Having that said I will reply to this mail with two kernel patches
>which implement this.
>The 3 kexec-tools patches would be more or less the same, only the
>string memmap=resetusable has to be exchanged to
>memmap=kdump_reserve_usable
>I wait until the kernel patches are in x86-tip and queued for
>3.9 and will resend them then.
>
>Please consider to apply the two kernel patches already.
>But I can also resend in a new thread.
>
>   Thomas
>
>Ah yes, here the test results:
>Kdump specific boot parameter appends before:
>---------------------------------------------
>memmap=exactmap memmap=559K@64K memmap=261560K@638976K
>elfcorehdr=900536K memmap=488K#3034988K memmap=3076K#3067744K
>memmap=2048K#3106804K memmap=676K#3110812K 
>memmap=4K#3111488K memmap=476K#3111492K memmap=4K#3111968K
>memmap=8K#3111972K memmap=4K#3111980K memmap=4K#3111984K
>memmap=92K#3111988K memmap=564K#3112080K
>
>Kdump specific boot parameter appends now:
>------------------------------------------
>memmap=kdump_reserve_usable,559K@64K,261560K@638976K elfcorehdr=900536K
>
>
>important parts of the serial console output of the
>---------------------------------------------------
>modified kdump kernel:
>----------------------
>[691036.954392] RIP  [<ffffffff812f368d>] sysrq_handle_crash+0xd/0x20
>[691036.968140]  RSP <ffff88042473de90>
>[691036.976113] CR2: 0000000000000000
>[    0.000000] Initializing cgroup subsys cpuset
>[    0.000000] Initializing cgroup subsys cpu
>[    0.000000] Linux version 3.8.0-rc5-default+ (trenn@ett) (gcc
>version 4.5.1 20101208 [gcc-4_5-branch revision 167585] (SUSE Linux) )
>#4 SMP Wed Jan 30 15:45:36 
>CET 2013
>[    0.000000] Command line:
>root=/dev/disk/by-id/ata-Hitachi_HDS721016CLA382_JPAB40HM2KUK6B-part6
>console=tty0 console=ttyS0,57600 sysrq_always_enabled panic=100 
>ignore_loglevel
>resume=/dev/disk/by-id/ata-Hitachi_HDS721016CLA382_JPAB40HM2KUK6B-part2
>apic=verbose debug vga=normal elevator=deadline sysrq=yes reset_devices
>
>irqpoll maxcpus=1  
>memmap=kdump_reserve_usable,559K@64K,261560K@638976K elfcorehdr=900536K
>[    0.000000] e820: BIOS-provided physical RAM map:
>[    0.000000] BIOS-e820: [mem 0x0000000000000100-0x000000000009bbff]
>usable
>[    0.000000] BIOS-e820: [mem 0x000000000009bc00-0x000000000009ffff]
>reserved
>[    0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff]
>reserved
>[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000b93dafff]
>usable
>[    0.000000] BIOS-e820: [mem 0x00000000b93db000-0x00000000b9454fff]
>ACPI data
>[    0.000000] BIOS-e820: [mem 0x00000000b9455000-0x00000000bb155fff]
>usable
>[    0.000000] BIOS-e820: [mem 0x00000000bb156000-0x00000000bb166fff]
>reserved
>[    0.000000] BIOS-e820: [mem 0x00000000bb167000-0x00000000bb3d7fff]
>usable
>[    0.000000] BIOS-e820: [mem 0x00000000bb3d8000-0x00000000bb6d8fff]
>ACPI NVS
>[    0.000000] BIOS-e820: [mem 0x00000000bb6d9000-0x00000000bd9fcfff]
>usable
>[    0.000000] BIOS-e820: [mem 0x00000000bd9fd000-0x00000000bdbfcfff]
>ACPI NVS
>[    0.000000] BIOS-e820: [mem 0x00000000bdbfd000-0x00000000bdcdcfff]
>usable
>[    0.000000] BIOS-e820: [mem 0x00000000bdcdd000-0x00000000bdde6fff]
>reserved
>[    0.000000] BIOS-e820: [mem 0x00000000bdde7000-0x00000000bde8ffff]
>ACPI NVS
>[    0.000000] BIOS-e820: [mem 0x00000000bde90000-0x00000000bde90fff]
>ACPI data
>[    0.000000] BIOS-e820: [mem 0x00000000bde91000-0x00000000bdf07fff]
>ACPI NVS
>[    0.000000] BIOS-e820: [mem 0x00000000bdf08000-0x00000000bdf08fff]
>ACPI data
>[    0.000000] BIOS-e820: [mem 0x00000000bdf09000-0x00000000bdf0afff]
>ACPI NVS
>[    0.000000] BIOS-e820: [mem 0x00000000bdf0b000-0x00000000bdf0bfff]
>ACPI data
>[    0.000000] BIOS-e820: [mem 0x00000000bdf0c000-0x00000000bdf0cfff]
>ACPI NVS
>[    0.000000] BIOS-e820: [mem 0x00000000bdf0d000-0x00000000bdf23fff]
>ACPI data
>[    0.000000] BIOS-e820: [mem 0x00000000bdf24000-0x00000000bdfb0fff]
>ACPI NVS
>[    0.000000] BIOS-e820: [mem 0x00000000bdfb1000-0x00000000bdffffff]
>usable
>[    0.000000] BIOS-e820: [mem 0x00000000be000000-0x00000000cfffffff]
>reserved
>[    0.000000] BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff]
>reserved
>[    0.000000] BIOS-e820: [mem 0x00000000fed19000-0x00000000fed19fff]
>reserved
>[    0.000000] BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff]
>reserved
>[    0.000000] BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff]
>reserved
>[    0.000000] BIOS-e820: [mem 0x00000000ffa20000-0x00000000ffffffff]
>reserved
>[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000083fffffff]
>usable
>[    0.000000] debug: ignoring loglevel setting.
>[    0.000000] e820: last_pfn = 0x840000 max_arch_pfn = 0x400000000
>[    0.000000] e820: update [mem 0x00000000-0xfffffffffffffffe] usable
>==> kdump reserved
>[    0.000000] e820: remove [mem 0x00010000-0x0009bbff] 
>[    0.000000] e820: remove [mem 0x27000000-0x36f6dfff] 
>[    0.000000] NX (Execute Disable) protection: active
>[    0.000000] e820: user-defined physical RAM map:
>[    0.000000] user: [mem 0x0000000000000100-0x000000000000ffff] kdump
>reserved
>[    0.000000] user: [mem 0x0000000000010000-0x000000000009bbff] usable
>[    0.000000] user: [mem 0x000000000009bc00-0x000000000009ffff]
>reserved
>[    0.000000] user: [mem 0x00000000000e0000-0x00000000000fffff]
>reserved
>[    0.000000] user: [mem 0x0000000000100000-0x0000000026ffffff] kdump
>reserved
>[    0.000000] user: [mem 0x0000000027000000-0x0000000036f6dfff] usable
>[    0.000000] user: [mem 0x0000000036f6e000-0x00000000b93dafff] kdump
>reserved
>[    0.000000] user: [mem 0x00000000b93db000-0x00000000b9454fff] ACPI
>data
>[    0.000000] user: [mem 0x00000000b9455000-0x00000000bb155fff] kdump
>reserved
>[    0.000000] user: [mem 0x00000000bb156000-0x00000000bb166fff]
>reserved
>[    0.000000] user: [mem 0x00000000bb167000-0x00000000bb3d7fff] kdump
>reserved
>[    0.000000] user: [mem 0x00000000bb3d8000-0x00000000bb6d8fff] ACPI
>NVS
>[    0.000000] user: [mem 0x00000000bb6d9000-0x00000000bd9fcfff] kdump
>reserved
>[    0.000000] user: [mem 0x00000000bd9fd000-0x00000000bdbfcfff] ACPI
>NVS
>[    0.000000] user: [mem 0x00000000bdbfd000-0x00000000bdcdcfff] kdump
>reserved
>[    0.000000] user: [mem 0x00000000bdcdd000-0x00000000bdde6fff]
>reserved
>[    0.000000] user: [mem 0x00000000bdde7000-0x00000000bde8ffff] ACPI
>NVS
>[    0.000000] user: [mem 0x00000000bde90000-0x00000000bde90fff] ACPI
>data
>[    0.000000] user: [mem 0x00000000bde91000-0x00000000bdf07fff] ACPI
>NVS
>[    0.000000] user: [mem 0x00000000bdf08000-0x00000000bdf08fff] ACPI
>data
>[    0.000000] user: [mem 0x00000000bdf09000-0x00000000bdf0afff] ACPI
>NVS
>[    0.000000] user: [mem 0x00000000bdf0b000-0x00000000bdf0bfff] ACPI
>data
>[    0.000000] user: [mem 0x00000000bdf0c000-0x00000000bdf0cfff] ACPI
>NVS
>[    0.000000] user: [mem 0x00000000bdf0d000-0x00000000bdf23fff] ACPI
>data
>[    0.000000] user: [mem 0x00000000bdf24000-0x00000000bdfb0fff] ACPI
>NVS
>[    0.000000] user: [mem 0x00000000bdfb1000-0x00000000bdffffff] kdump
>reserved
>[    0.000000] user: [mem 0x00000000be000000-0x00000000cfffffff]
>reserved
>[    0.000000] user: [mem 0x00000000fec00000-0x00000000fec00fff]
>reserved
>[    0.000000] user: [mem 0x00000000fed19000-0x00000000fed19fff]
>reserved
>[    0.000000] user: [mem 0x00000000fed1c000-0x00000000fed1ffff]
>reserved
>[    0.000000] user: [mem 0x00000000fee00000-0x00000000fee00fff]
>reserved
>[    0.000000] user: [mem 0x00000000ffa20000-0x00000000ffffffff]
>reserved
>[    0.000000] user: [mem 0x0000000100000000-0x000000083fffffff] kdump
>reserved
>[    0.000000] SMBIOS 2.6 present.
>[    0.000000] DMI: Intel Corporation S2600CP/S2600CP, BIOS
>SE5C600.86B.99.99.x040.111920110024 11/19/2011
>[    0.000000] e820: update [mem 0x00000000-0x0000ffff] usable ==>
>reserved
>[    0.000000] e820: remove [mem 0x000a0000-0x000fffff] usable
>[    0.000000] No AGP bridge found
>...
>[    2.220298] ACPI: bus type pci registered
>[    2.229486] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem
>0xc0000000-0xcfffffff] (base 0xc0000000)
>[    2.250283] PCI: MMCONFIG at [mem 0xc0000000-0xcfffffff] reserved in
>E820
>[    2.349583] PCI: Using configuration type 1 for base access
>...
>Copying data                       : [100 %]
>The dumpfile is saved to /root/abuild/dumps/2013-01-30-15:52/vmcore.
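
The two usable ranges on the quoted command line (559K@64K and 261560K@638976K) correspond to the two "e820: remove" lines in the quoted log. The arithmetic can be checked with a small stand-alone parser. This is a sketch only: parse_size() and parse_usable_range() are hypothetical helpers that imitate the K/M/G suffix handling of the memmap= option and are not the kernel's own parser.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a number with an optional K, M or G suffix (powers of 1024). */
static uint64_t parse_size(const char *s, const char **end)
{
    assert(s != NULL && end != NULL);
    char *e;
    uint64_t v = strtoull(s, &e, 0);

    switch (*e) {
    case 'G': case 'g':
        v <<= 10;
        /* fall through */
    case 'M': case 'm':
        v <<= 10;
        /* fall through */
    case 'K': case 'k':
        v <<= 10;
        e++;  /* consume the suffix character */
        break;
    default:
        break;
    }
    *end = e;
    return v;
}

/*
 * Parse "nn[KMG]@ss[KMG]" and report the first and last byte address.
 * Returns 0 on success, -1 if the string does not have that form.
 */
static int parse_usable_range(const char *opt, uint64_t *first, uint64_t *last)
{
    assert(opt != NULL && first != NULL && last != NULL);
    const char *p;
    uint64_t size = parse_size(opt, &p);
    if (p == opt || size == 0 || *p != '@')
        return -1;

    const char *q;
    uint64_t start = parse_size(p + 1, &q);
    if (q == p + 1)
        return -1;  /* no start address after '@' */

    *first = start;
    *last = start + size - 1;
    return 0;
}
```

For "559K@64K" this yields 0x10000 to 0x9bbff, and for "261560K@638976K" it yields 0x27000000 to 0x36f6dfff, matching the ranges shown as usable in the user-defined map.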

-- 
Sent from my mobile phone. Please excuse brevity and lack of formatting.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH 0/3] Cleanup kdump memmap= passing and e820 usage
  2013-01-30 16:03     ` Thomas Renninger
                         ` (3 preceding siblings ...)
  2013-01-30 16:10       ` [PATCH 0/3] Make use of new memmap= kernel parameter syntax H. Peter Anvin
@ 2013-01-30 16:13       ` Thomas Renninger
  2013-01-30 16:16         ` H. Peter Anvin
  4 siblings, 1 reply; 35+ messages in thread
From: Thomas Renninger @ 2013-01-30 16:13 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: yinghai, Simon Horman, kexec, x86, vgoyal

The patches just sent are based on the latest x86-tip origin/mm2
branch (3.8.0-rc5).

Please consider applying them to x86-tip.
I will then resend the kexec-tools adjustments to the kexec
list with the same people in CC as in this one.

Thanks,

   Thomas

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/3] Cleanup kdump memmap= passing and e820 usage
  2013-01-30 16:13       ` [PATCH 0/3] Cleanup kdump memmap= passing and e820 usage Thomas Renninger
@ 2013-01-30 16:16         ` H. Peter Anvin
  2013-01-30 16:39           ` Thomas Renninger
  0 siblings, 1 reply; 35+ messages in thread
From: H. Peter Anvin @ 2013-01-30 16:16 UTC (permalink / raw)
  To: Thomas Renninger; +Cc: yinghai, Simon Horman, kexec, x86, vgoyal

I am NAKing 1/3 and think you seriously need to explain your design choice for the rest.  "Unnecessarily complex" is not an explanation, it is a cop-out.

Thomas Renninger <trenn@suse.de> wrote:

>The sent patches are against latest x86-tip origin/mm2 branch
>(3.8.0-rc5) based.
>
>Please consider to apply them to x86-tip.
>I will then resend the kexec-tools adjustings to the kexec
>list with the same people in CC as in this one.
>
>Thanks,
>
>   Thomas

-- 
Sent from my mobile phone. Please excuse brevity and lack of formatting.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/3] Cleanup kdump memmap= passing and e820 usage
  2013-01-30 16:16         ` H. Peter Anvin
@ 2013-01-30 16:39           ` Thomas Renninger
  2013-01-30 16:52             ` H. Peter Anvin
  0 siblings, 1 reply; 35+ messages in thread
From: Thomas Renninger @ 2013-01-30 16:39 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: yinghai, Simon Horman, kexec, x86, vgoyal

On Wednesday, January 30, 2013 05:16:22 PM H. Peter Anvin wrote:
> I am NAKing 1/3 and think you seriously need to explain your design
> choice for the rest.  "Unnecessarily complex" is not an explanation, it
> is a cop-out.

Ok, 1/3 should not go in because it changes boot parameter processing
that has worked this way for years. No problem.

For the rest, I explain the advantages here:

> >This:
> > - heavily cleans up the unnecesary reserved memory passing via memmap=
> > - still provides a clean way of passing a valid e820 table through
> >   boot structures (no Linux kernel made up e820 type passing)
> > - Keeps complexity as low as possible and at one place and does not
> >   involve kexec-tools as another error source (passing a badly
> >   mangled e820 table or not being able to consider stuff the kernel
> >   can when mangeling).

If for some reason the e820 table in the kdump case needs to be
touched again, I am pretty sure you do not want to dig through
kexec-tools code.
Also, you won't be able to fix or work around things in kexec-tools
the way you can in the kernel.

I also do not think it's a good idea to pass an unspecified e820 type
made up by the Linux kernel through the public boot interface.

So, looking at it from the other side, passing a modified e820 table
only has disadvantages. The only advantage I can see is that
memmap=..,X@Y,W@Z does not need to be passed. But this is rather
short and static now, and no longer grows with the number of e820
reserved entries from the BIOS. It is also still obvious where it
comes from and why it gets passed; passing things hidden in a
modified e820 boot structure is not a good idea.

Again, please consider taking these (after rebasing without 1/3).
If not, I guess you have to explain to me the advantages of passing
a mangled e820 table that I am overlooking. If they do not convince me,
I suggest we still take this, or someone else has to touch the
kexec-tools parts.

   Thomas

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/3] Cleanup kdump memmap= passing and e820 usage
  2013-01-30 16:39           ` Thomas Renninger
@ 2013-01-30 16:52             ` H. Peter Anvin
  2013-01-30 17:41               ` Yinghai Lu
  2013-01-30 18:52               ` Eric W. Biederman
  0 siblings, 2 replies; 35+ messages in thread
From: H. Peter Anvin @ 2013-01-30 16:52 UTC (permalink / raw)
  To: Thomas Renninger
  Cc: x86, kexec, Simon Horman, Eric W. Biederman, yinghai, vgoyal

On 01/30/2013 08:39 AM, Thomas Renninger wrote:
>
>>> This:
>>> - heavily cleans up the unnecesary reserved memory passing via memmap=
>>> - still provides a clean way of passing a valid e820 table through
>>>    boot structures (no Linux kernel made up e820 type passing)
>>> - Keeps complexity as low as possible and at one place and does not
>>>    involve kexec-tools as another error source (passing a badly
>>>    mangled e820 table or not being able to consider stuff the kernel
>>>    can when mangeling).
>
> If for some reason the e820 table in kdump case needs to be
> touched again, I am pretty sure you do not want to look up
> kexec-tools code.
> Also you won't be able to fix/workaround things in kexec-tools
> the way you can in the kernel.
>

Say what?  The kernel is an open source tool, so is kexec, and a *lot* 
of people have dependencies on specific kernel versions which does not

> I also do not think it's a good idea to pass an unspecified,
> Linux kernel made up e820 type through the public boot interface.

It's our interface.  We specify it.  On the other hand, we may want to 
make these negative numbers rather than starting at 128 to avoid future 
collisions with real memory types (not that they are growing very fast.)
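
What "negative numbers" could mean for the type field can be sketched as follows, assuming the field is an unsigned 32-bit value as in the kernel's e820 entry structure. Both helper names and the counting-down scheme are hypothetical; nothing in this thread fixes concrete values.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical scheme: the n-th Linux-private type counts down from
 * (uint32_t)-1, so n = 0 gives 0xffffffff, n = 1 gives 0xfffffffe, ...
 */
static uint32_t linux_private_type(uint32_t n)
{
    assert(n < 0x80000000u);  /* keep the result in the "negative" half */
    return UINT32_MAX - n;
}

/* A type is treated as Linux-private if its top bit is set. */
static int is_linux_private_type(uint32_t type)
{
    return type >= 0x80000000u;
}
```

Under this assumption, firmware-defined types that keep growing upward from small values would have to cover about two billion IDs before colliding, while values such as 128 or 129 sit much closer to the firmware range.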

> So looking from the other side, passing a modified e820 table
> only has disadvantages. The only advantage I can see that
> memmap=..,X@Y,W@Z needs not to be passed. But this is rather
> short and static now and not huge depending on the e820 reserved
> entries from the BIOS (and still obvious where it comes from and why
> it gets passed. Passing things hidden in a modified e820 boot structure
> is not a good idea).

The command line is fundamentally a human-oriented interface and its 
semantics change over time (considering the built-in command line stuff, 
for example.)  It is thus fragile.

The e820 map is fundamentally what you care about, and it has to be
passed correctly anyway -- or your changes are utterly broken.  The
modifications that have to be performed (from RAM to KDUMP) are trivial.

I have to admit to being rather confused as to the separation of various 
bits of kdump between the host kernel and various user-space components, 
but the whole use of the command line to pass the memory map seems just 
broken in light of everything that can go wrong.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/3] Cleanup kdump memmap= passing and e820 usage
  2013-01-30 16:52             ` H. Peter Anvin
@ 2013-01-30 17:41               ` Yinghai Lu
  2013-01-30 18:52               ` Eric W. Biederman
  1 sibling, 0 replies; 35+ messages in thread
From: Yinghai Lu @ 2013-01-30 17:41 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: x86, kexec, Simon Horman, Eric W. Biederman, Thomas Renninger, vgoyal

On Wed, Jan 30, 2013 at 8:52 AM, H. Peter Anvin <hpa@zytor.com> wrote:
> The e820 map is fundamentally what you care about, and it has to be passed
> correctly anyway -- or your changes are utterly broken.  The modifications
> that have to be performed (from RAM to KDUMP) is trivial.
>
> I have to admit to being rather confused as to the separation of various
> bits of kdump between the host kernel and various user-space components, but
> the whole use of the command line to pass the memory map seems just broken
> in light of everything that can go wrong.

Thomas,

Can you try to work out a patch for kexec-tools so that kdump changes
all RAM to KDUMP_RESERVED and only makes the "Crash kernel" range from
/proc/iomem RAM type?

Then the only kernel patch needed for kdump is one that checks the
KDUMP_RESERVED and RAM types to get saved_max_pfn.
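
The kernel-side check suggested here could look roughly like the following sketch. It assumes 4 KiB pages and the type value 129 from the proposed patch. struct entry and end_pfn_of_types() are hypothetical names, not existing kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12  /* assumption: 4 KiB pages */

/* T_KDUMP uses the value 129 proposed in this thread. */
enum { T_RAM = 1, T_RESERVED = 2, T_KDUMP = 129 };

struct entry { uint64_t addr, size; uint32_t type; };

/*
 * Highest end page frame number over all entries whose type is t1 or t2.
 * With t1 = T_RAM and t2 = T_KDUMP this covers the memory of the crashed
 * kernel as well as the memory the kdump kernel itself runs in.
 */
static uint64_t end_pfn_of_types(const struct entry *map, size_t n,
                                 uint32_t t1, uint32_t t2)
{
    assert(map != NULL || n == 0);
    uint64_t last = 0;

    for (size_t i = 0; i < n; i++) {
        if (map[i].type != t1 && map[i].type != t2)
            continue;
        uint64_t end_pfn = (map[i].addr + map[i].size) >> PAGE_SHIFT;
        if (end_pfn > last)
            last = end_pfn;
    }
    return last;
}
```

Applied to the map from the test log earlier in the thread, with the range ending at 0x83fffffff typed as kdump reserved, checking both types gives 0x840000, the same value as the reported last_pfn. Checking only RAM would stop at the end of the kdump kernel's own range.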

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/3] Cleanup kdump memmap= passing and e820 usage
  2013-01-30 16:52             ` H. Peter Anvin
  2013-01-30 17:41               ` Yinghai Lu
@ 2013-01-30 18:52               ` Eric W. Biederman
  2013-01-30 21:38                 ` H. Peter Anvin
  1 sibling, 1 reply; 35+ messages in thread
From: Eric W. Biederman @ 2013-01-30 18:52 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: x86, kexec, Simon Horman, yinghai, Thomas Renninger, vgoyal

"H. Peter Anvin" <hpa@zytor.com> writes:

> I have to admit to being rather confused as to the separation of various bits of
> kdump between the host kernel and various user-space components, but the whole
> use of the command line to pass the memory map seems just broken in light of
> everything that can go wrong.

It certainly seems time to look at this design decision and see if it
still makes sense.

The original idea was to pass the e820 map in the e820 variables, and to
slightly override that map to report which memory it was safe for the
kdump kernel to run in.  That leaves the kernel with the knowledge
of where everything actually is, and that we just don't happen to be
using all of the memory.

Now something seems to have gone wrong with that strategy, as we wound
up needing to play games with ACPI and GART addresses.

I see one of two very basic options going forward.
- Pass a kernel command line option that just changes the kernel's idea
  of which memory it can touch (and we can remove all of the other options).
- Modify the e820 map we pass to the kdump kernel and don't bother to
  pass anything on the command line.

Eric

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/3] Cleanup kdump memmap= passing and e820 usage
  2013-01-30 18:52               ` Eric W. Biederman
@ 2013-01-30 21:38                 ` H. Peter Anvin
  2013-01-30 21:57                   ` Eric W. Biederman
  0 siblings, 1 reply; 35+ messages in thread
From: H. Peter Anvin @ 2013-01-30 21:38 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: x86, kexec, Simon Horman, yinghai, Thomas Renninger, vgoyal

On 01/30/2013 10:52 AM, Eric W. Biederman wrote:
> "H. Peter Anvin" <hpa@zytor.com> writes:
> 
>> I have to admit to being rather confused as to the separation of various bits of
>> kdump between the host kernel and various user-space components, but the whole
>> use of the command line to pass the memory map seems just broken in light of
>> everything that can go wrong.
> 
> It certainly seems time to look at this design decision and see if it
> still makes sense.
> 
> The original idea was to pass the e820 map in the e820 variables, and to
> slightly override that map to report which memory it was safe for the
> kdump kernel to run in.  Leaving the kernel with the knowledge that
> of where everything actually is, and that we just don't happen to be
> using all of the memory.
> 
> Now something seems to have gone wrong with that strategy as we wound
> up needing to play games with acpi and gart addresses.
> 
> I see one of two very basic options going forward.
> - Pass a kernel command line option that just changes the kernels idea
>   of which memory it can touch (and we can remove all of the other options).
> - Modify the e820 map we pass to the kdump kernel and don't bother to
>   pass anything on the command line.
> 

Yes, those seem to be the options, and we're currently discussing which one.

The second seems to make more sense to me.  The kexec tools build the
memory map anyway, and it makes sense to me at least to just build a
memory map with the appropriate regions marked as a dumpable type.

	-hpa




^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/3] Cleanup kdump memmap= passing and e820 usage
  2013-01-30 21:38                 ` H. Peter Anvin
@ 2013-01-30 21:57                   ` Eric W. Biederman
  2013-01-30 22:10                     ` H. Peter Anvin
  0 siblings, 1 reply; 35+ messages in thread
From: Eric W. Biederman @ 2013-01-30 21:57 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: x86, kexec, Simon Horman, yinghai, Thomas Renninger, vgoyal

"H. Peter Anvin" <hpa@zytor.com> writes:

> On 01/30/2013 10:52 AM, Eric W. Biederman wrote:
>> "H. Peter Anvin" <hpa@zytor.com> writes:
>> 
>>> I have to admit to being rather confused as to the separation of various bits of
>>> kdump between the host kernel and various user-space components, but the whole
>>> use of the command line to pass the memory map seems just broken in light of
>>> everything that can go wrong.
>> 
>> It certainly seems time to look at this design decision and see if it
>> still makes sense.
>> 
>> The original idea was to pass the e820 map in the e820 variables, and to
>> slightly override that map to report which memory it was safe for the
>> kdump kernel to run in.  Leaving the kernel with the knowledge
>> of where everything actually is, and that we just don't happen to be
>> using all of the memory.
>> 
>> Now something seems to have gone wrong with that strategy as we wound
>> up needing to play games with acpi and gart addresses.
>> 
>> I see one of two very basic options going forward.
>> - Pass a kernel command line option that just changes the kernel's idea
>>   of which memory it can touch (and we can remove all of the other options).
>> - Modify the e820 map we pass to the kdump kernel and don't bother to
>>   pass anything on the command line.
>> 
>
> Yes, those seem to be the options, and we're currently discussing which one.
>
> The second seems to make more sense to me.  The kexec tools build the
> memory map anyway, and it makes sense to me at least to just build a
> memory map with the appropriate regions marked as a dumpable type.

This dumpable type doesn't make sense to me.  Are you suggesting making
regions that are memory but that we should not use a special memory
type?

I think I would prefer to call that new type RESERVED_MEM or
RESERVED_CACHABLE.  Being more specific is fine but dumpable certainly
doesn't bring to mind what we are saying.  Especially since we already
communicate which areas were memory to the last kernel in an
architecture generic format.

Eric


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/3] Cleanup kdump memmap= passing and e820 usage
  2013-01-30 21:57                   ` Eric W. Biederman
@ 2013-01-30 22:10                     ` H. Peter Anvin
  2013-01-30 22:29                       ` Eric W. Biederman
  0 siblings, 1 reply; 35+ messages in thread
From: H. Peter Anvin @ 2013-01-30 22:10 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: x86, kexec, Simon Horman, yinghai, Thomas Renninger, vgoyal

On 01/30/2013 01:57 PM, Eric W. Biederman wrote:
>>
>> Yes, those seem to be the options, and we're currently discussing which one.
>>
>> The second seems to make more sense to me.  The kexec tools build the
>> memory map anyway, and it makes sense to me at least to just build a
>> memory map with the appropriate regions marked as a dumpable type.
> 
> This dumpable type doesn't make sense to me.  Are you suggesting making
> regions that are memory but that we should not use a special memory
> type?

Yes.

> I think I would prefer to call that new type RESERVED_MEM or
> RESERVED_CACHABLE.  Being more specific is fine but dumpable certainly
> doesn't bring to mind what we are saying.  Especially since we already
> communicate which areas were memory to the last kernel in an
> architecture generic format.

I was thinking that marking them differently might help debugging, at
least, but yes, we can have a RESERVED_MEM type.

However, Thomas does have a point that the current use of fairly small
positive values for Linux-defined types is a bad idea.  We should use
negative types, or at least something north of 0x40000000 or so.

	-hpa



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/3] Cleanup kdump memmap= passing and e820 usage
  2013-01-30 22:10                     ` H. Peter Anvin
@ 2013-01-30 22:29                       ` Eric W. Biederman
  2013-01-30 22:41                         ` H. Peter Anvin
  2013-01-31  0:15                         ` Thomas Renninger
  0 siblings, 2 replies; 35+ messages in thread
From: Eric W. Biederman @ 2013-01-30 22:29 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: x86, kexec, Simon Horman, yinghai, Thomas Renninger, vgoyal

"H. Peter Anvin" <hpa@zytor.com> writes:

> On 01/30/2013 01:57 PM, Eric W. Biederman wrote:
>>>
>>> Yes, those seem to be the options, and we're currently discussing which one.
>>>
>>> The second seems to make more sense to me.  The kexec tools build the
>>> memory map anyway, and it makes sense to me at least to just build a
>>> memory map with the appropriate regions marked as a dumpable type.
>> 
>> This dumpable type doesn't make sense to me.  Are you suggesting making
>> regions that are memory but that we should not use a special memory
>> type?
>
> Yes.
>
>> I think I would prefer to call that new type RESERVED_MEM or
>> RESERVED_CACHABLE.  Being more specific is fine but dumpable certainly
>> doesn't bring to mind what we are saying.  Especially since we already
>> communicate which areas were memory to the last kernel in an
>> architecture generic format.
>
> I was thinking that marking them differently might help debugging, at
> least, but yes, we can have a RESERVED_MEM type.
>
> However, Thomas does have a point that the current use of fairly small
> positive values for Linux-defined types is a bad idea.  We should use
> negative types, or at least something north of 0x40000000 or so.

Yes.  It doesn't much matter in the kernel but when it becomes part of
the ABI it is a real issue.

Since old kernels treat any value they don't understand as reserved
passing a modified e820 map seems reasonable to me once we have reserved
a special linux value for it.
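The fallback described here can be sketched in a few lines of C. This is only an illustration, not kernel code: the struct, the helper name e820_entry_is_usable() and the EXAMPLE_LINUX_TYPE value are made up for the example; the type values 1 through 5 are the standard e820/ACPI ones.

```c
#include <stdbool.h>
#include <stdint.h>

/* Type values as defined by the e820 / ACPI address range interface. */
#define E820_RAM       1u
#define E820_RESERVED  2u
#define E820_ACPI      3u
#define E820_NVS       4u
#define E820_UNUSABLE  5u

/* Hypothetical Linux-specific value, picked only for this example. */
#define EXAMPLE_LINUX_TYPE 0x40000001u

/* Simplified stand-in for one e820 map entry. */
struct e820_entry_sketch {
	uint64_t addr;
	uint64_t size;
	uint32_t type;
};

/*
 * Only E820_RAM counts as usable memory. Every other value, including one
 * this code has never heard of, ends up "not usable", i.e. effectively
 * reserved. That is the property backward compatibility relies on.
 */
bool e820_entry_is_usable(const struct e820_entry_sketch *e)
{
	return e->type == E820_RAM;
}
```

With this shape, a kernel that predates a new type needs no changes to stay away from regions marked with it.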

Eric



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/3] Cleanup kdump memmap= passing and e820 usage
  2013-01-30 22:29                       ` Eric W. Biederman
@ 2013-01-30 22:41                         ` H. Peter Anvin
  2013-01-30 22:49                           ` Yinghai Lu
  2013-01-31  0:15                         ` Thomas Renninger
  1 sibling, 1 reply; 35+ messages in thread
From: H. Peter Anvin @ 2013-01-30 22:41 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: x86, kexec, Simon Horman, yinghai, Thomas Renninger, vgoyal

On 01/30/2013 02:29 PM, Eric W. Biederman wrote:
>>
>>> I think I would prefer to call that new type RESERVED_MEM or
>>> RESERVED_CACHABLE.  Being more specific is fine but dumpable certainly
>>> doesn't bring to mind what we are saying.  Especially since we already
>>> communicate which areas were memory to the last kernel in an
>>> architecture generic format.
>>
>> I was thinking that marking them differently might help debugging, at
>> least, but yes, we can have a RESERVED_MEM type.
>>
>> However, Thomas does have a point that the current use of fairly small
>> positive values for Linux-defined types is a bad idea.  We should use
>> negative types, or at least something north of 0x40000000 or so.
> 
> Yes.  It doesn't much matter in the kernel but when it becomes part of
> the ABI it is a real issue.
> 
> Since old kernels treat any value they don't understand as reserved
> passing a modified e820 map seems reasonable to me once we have reserved
> a special linux value for it.
> 

Just to prevent the possible funnies (including collisions with -errno)
that might be caused by negative numbers, I suggest we assign
Linux-specific values starting at some huge but still positive value
like 2000000000 -- that way we avoid any possible uses of negative errno
values internally in the kernel.
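One way to read the concern, as a rough sketch: if a 32-bit type value ever lands in a signed int, anything above INT32_MAX turns negative and could be mistaken for an -errno return. The helper name and the _SKETCH macro below are invented for illustration; only the value 2000000000 comes from this mail.

```c
#include <stdbool.h>
#include <stdint.h>

/* Base value suggested above for Linux-specific e820 types. */
#define LINUX_E820_BASE_SKETCH 2000000000u

/*
 * True if a 32-bit type value is still positive when it is stored in a
 * signed 32-bit int, so it cannot be confused with a negative errno.
 */
bool e820_type_stays_positive(uint32_t type)
{
	return type <= (uint32_t)INT32_MAX;
}
```

2000000000 is below INT32_MAX (2147483647), so it passes this check, while a "negative" type such as (uint32_t)-22 does not.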

The bigger question is if we need a separate value from the current
E820_RESERVED_KERN.  Since it is always easier to have multiple values
with the same semantics than it is to have too few, I would still prefer
we added a new E820_RESERVED_KDUMP, which would then be 2000000001.

What do you think?

	-hpa



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/3] Cleanup kdump memmap= passing and e820 usage
  2013-01-30 22:41                         ` H. Peter Anvin
@ 2013-01-30 22:49                           ` Yinghai Lu
  0 siblings, 0 replies; 35+ messages in thread
From: Yinghai Lu @ 2013-01-30 22:49 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: x86, kexec, Simon Horman, Eric W. Biederman, Thomas Renninger, vgoyal

On Wed, Jan 30, 2013 at 2:41 PM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 01/30/2013 02:29 PM, Eric W. Biederman wrote:
> The bigger question is if we need a separate value from the current
> E820_RESERVED_KERN.  Since it is always easier to have multiple values
> with the same semantics than it is to have too few, I would still prefer
> we added a new E820_RESERVED_KDUMP, which would then be 2000000001.

Currently, for E820_RESERVED_KERN: while filling memblock.memory, it is
treated as E820_RAM, and memblock.reserved already has entries for those
regions.

For E820_RESERVED_KDUMP, it looks like the only use is for the kernel to
find saved_max_pfn.

So we may have to separate them.


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/3] Cleanup kdump memmap= passing and e820 usage
  2013-01-30 22:29                       ` Eric W. Biederman
  2013-01-30 22:41                         ` H. Peter Anvin
@ 2013-01-31  0:15                         ` Thomas Renninger
  2013-01-31  0:18                           ` H. Peter Anvin
  2013-02-06 15:23                           ` Thomas Renninger
  1 sibling, 2 replies; 35+ messages in thread
From: Thomas Renninger @ 2013-01-31  0:15 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: x86, kexec, Simon Horman, H. Peter Anvin, yinghai, vgoyal

On Wednesday, January 30, 2013 02:29:04 PM Eric W. Biederman wrote:
> "H. Peter Anvin" <hpa@zytor.com> writes:
> > On 01/30/2013 01:57 PM, Eric W. Biederman wrote:
> >>> Yes, those seem to be the options, and we're currently discussing which
> >>> one.
> >>> 
> >>> The second seems to make more sense to me.  The kexec tools build the
> >>> memory map anyway, and it makes sense to me at least to just build a
> >>> memory map with the appropriate regions marked as a dumpable type.
> >> 
> >> This dumpable type doesn't make sense to me.  Are you suggesting making
> >> regions that are memory but that we should not use a special memory
> >> type?
> > 
> > Yes.
> > 
> >> I think I would prefer to call that new type RESERVED_MEM or
> >> RESERVED_CACHABLE.  Being more specific is fine but dumpable certainly
> >> doesn't bring to mind what we are saying.  Especially since we already
> >> communicate which areas were memory to the last kernel in an
> >> architecture generic format.
> > 
> > I was thinking that marking them differently might help debugging, at
> > least, but yes, we can have a RESERVED_MEM type.
> > 
> > However, Thomas does have a point that the current use of fairly small
> > positive values for Linux-defined types is a bad idea.  We should use
> > negative types, or at least something north of 0x40000000 or so.
> 
> Yes.  It doesn't much matter in the kernel but when it becomes part of
> the ABI it is a real issue.
That's one point (a self-made-up e820 type is better kept
kernel-internal).
 
> Since old kernels treat any value they don't understand as reserved
> passing a modified e820 map seems reasonable to me once we have reserved
> a special linux value for it.
The other one: Why should several instances modify the e820 table
if this is not necessary?

I guess both ways are a huge enhancement compared to what we have now.
Which approach to finally take should not matter that much, but because
of the above I still prefer to go this way:
- Pass a kernel command line option that just changes the kernel's idea
  of which memory it can touch

Whether the value(s) of these types should be raised is a different
discussion. If they should still get bigger values, this can be
addressed by a separate patch, now or later, using whatever values you
would like to see.

   Thomas


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/3] Cleanup kdump memmap= passing and e820 usage
  2013-01-31  0:15                         ` Thomas Renninger
@ 2013-01-31  0:18                           ` H. Peter Anvin
  2013-01-31  9:11                             ` Thomas Renninger
  2013-02-06 15:23                           ` Thomas Renninger
  1 sibling, 1 reply; 35+ messages in thread
From: H. Peter Anvin @ 2013-01-31  0:18 UTC (permalink / raw)
  To: Thomas Renninger
  Cc: x86, kexec, Simon Horman, Eric W. Biederman, yinghai, vgoyal

On 01/30/2013 04:15 PM, Thomas Renninger wrote:
> 
> I guess both ways are a huge enhancement compared to what we have now.
> Which approach to finally take should not matter that much, but because
> of the above I still prefer to go this way:
> - Pass a kernel command line option that just changes the kernel's idea
>   of which memory it can touch
> 

The kernel command line is a human-oriented data structure with limited
size and fairly complex semantics.

	-hpa



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/3] Cleanup kdump memmap= passing and e820 usage
  2013-01-31  0:18                           ` H. Peter Anvin
@ 2013-01-31  9:11                             ` Thomas Renninger
  0 siblings, 0 replies; 35+ messages in thread
From: Thomas Renninger @ 2013-01-31  9:11 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: x86, kexec, Simon Horman, Eric W. Biederman, yinghai, vgoyal

On Thursday, January 31, 2013 01:18:29 AM H. Peter Anvin wrote:
> On 01/30/2013 04:15 PM, Thomas Renninger wrote:
> > 
> > I guess both ways are a huge enhancement compared to what we have now.
> > Which approach to finally take should not matter that much, but
> > because of the above I still prefer to go this way:
> > - Pass a kernel command line option that just changes the kernel's idea
> >   of which memory it can touch
> > 
> 
> The kernel command line is a human-oriented data structure with limited
> size
Size doesn't matter. Before (passing all reserved areas) it surely did.
Now the passed area(s) should be of static size (2 areas), and with
Yinghai's comma-separated memmap= extension the needed size is cut even
further.
The command line size got extended some time ago, and as soon as a distro
hits the limit for whatever reason, I expect it can get extended
again. But this will certainly not be because of this kdump option.
The resume= param could be cut out by kexec-tools, fwiw:
resume=/dev/disk/by-id/ata-Hitachi_HDS721016CLA382_JPAB40HM2KUK6B-part2

> and fairly complex semantics.
This interface has worked for quite some time and always will.

The above may be valid arguments, but the reasons for passing the kdump
memory area via boot parameter outweigh these. The recent
discussion about:
> Just to prevent the possible funnies (including collisions with -errno)
> that might be caused by negative numbers,
underlines this.
To be honest, I do not understand in detail when or why this could happen.
But it seems obvious to me that if this self-made-up e820 type can be
kept kernel-internal, it should be.

So if there isn't another really strong argument against it, I'd
like to resend my work rebased without 1/3.

If the e820 type values should still be modified to whatever value,
this should certainly go in separately with a good changelog
explaining why (which I cannot make up).

   Thomas


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/3] Cleanup kdump memmap= passing and e820 usage
  2013-01-31  0:15                         ` Thomas Renninger
  2013-01-31  0:18                           ` H. Peter Anvin
@ 2013-02-06 15:23                           ` Thomas Renninger
  2013-02-06 23:04                             ` Eric W. Biederman
  1 sibling, 1 reply; 35+ messages in thread
From: Thomas Renninger @ 2013-02-06 15:23 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: x86, kexec, Simon Horman, H. Peter Anvin, yinghai, vgoyal

On Thursday, January 31, 2013 01:15:34 AM Thomas Renninger wrote:
> On Wednesday, January 30, 2013 02:29:04 PM Eric W. Biederman wrote:
> > "H. Peter Anvin" <hpa@zytor.com> writes:
> > > On 01/30/2013 01:57 PM, Eric W. Biederman wrote:
> > >>> Yes, those seem to be the options, and we're currently discussing which
> > >>> one.
> > >>> 
> > >>> The second seems to make more sense to me.  The kexec tools build the
> > >>> memory map anyway, and it makes sense to me at least to just build a
> > >>> memory map with the appropriate regions marked as a dumpable type.
> > >> 
> > >> This dumpable type doesn't make sense to me.  Are you suggesting making
> > >> regions that are memory but that we should not use a special memory
> > >> type?
> > > 
> > > Yes.
> > > 
> > >> I think I would prefer to call that new type RESERVED_MEM or
> > >> RESERVED_CACHABLE.  Being more specific is fine but dumpable certainly
> > >> doesn't bring to mind what we are saying.  Especially since we already
> > >> communicate which areas were memory to the last kernel in an
> > >> architecture generic format.
> > > 
> > > I was thinking that marking them differently might help debugging, at
> > > least, but yes, we can have a RESERVED_MEM type.
> > > 
> > > However, Thomas does have a point that the current use of fairly small
> > > positive values for Linux-defined types is a bad idea.  We should use
> > > negative types, or at least something north of 0x40000000 or so.
> > 
> > Yes.  It doesn't much matter in the kernel but when it becomes part of
> > the ABI it is a real issue.
> That's one point (a self-made-up e820 type is better kept
> kernel-internal).

There is another important reason why the command line approach
should be preferred:
Backward compatibility and the ability to backport the whole stuff to
fix mmconf in kdump, which would be nice for example for SLES11.

kexec-tools can detect the kernel version of the kernel which is loaded
as kdump/crash kernel. If its version is:
"$MAINLINE_VERSION_THE_CHANGE_GETS_INTRODUCED"
or newer, things are fine.
But if the kernel version is older, there is no way for kexec-tools to
find out whether the older kernel has the feature backported.
That's bad!
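To make the problem concrete, here is a rough sketch of a pure version check. It is not the actual kexec-tools code; the helper name and the _SKETCH macro are invented, and the 3.9.0 cut-off is the one mentioned in the cover letter of the original series.

```c
/* Same encoding as the kernel's KERNEL_VERSION() macro. */
#ifndef KERNEL_VERSION
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))
#endif

/* Assumed cut-off: the first mainline version carrying the new syntax. */
#define MEMMAP_FEATURE_VERSION_SKETCH KERNEL_VERSION(3, 9, 0)

/* Returns 1 if the version number alone says the new syntax is usable. */
int version_allows_new_memmap(long version)
{
	return version >= MEMMAP_FEATURE_VERSION_SKETCH;
}
```

A distribution kernel reporting, say, 3.0.13 would be rejected by this check even if the feature had been backported to it, because the version number carries no information about backports.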

In case of the command line approach, kexec-tools can pass the whole memmap=
mess as passed before, plus the new format: memmap=kdump_reserve_usable,X@Y.
In older kernels the newly formatted string will get passed to:
memparse("kdump_reserve_usable,X@Y")
and the memmap early_param function will return with -EINVAL:
        mem_size = memparse(p, &p);
        if (p == oldp)
                return -EINVAL;

Ok, the kdump kernel which does not have the stuff backported would issue a:
printk(KERN_WARNING "Malformed early option '%s'\n", param);
which could get ignored. I guess this is fine compared to any other backport
nightmare approach.
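The rejection path can be tried out in user space with a small stand-in. This is a sketch only: strtoull() plays the role of the kernel's memparse() (size suffixes such as K/M/G are ignored here), and the function name is made up.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * User-space imitation of the check quoted above. If the string does not
 * start with a number, nothing is consumed, the end pointer equals the
 * start pointer, and the option is rejected with -EINVAL.
 */
int parse_memmap_size_sketch(const char *p, unsigned long long *size)
{
	const char *oldp = p;
	char *end;

	*size = strtoull(p, &end, 0);
	if (end == oldp)	/* no digits consumed: not a size at all */
		return -EINVAL;
	return 0;
}
```

Calling it with "kdump_reserve_usable,X@Y" yields -EINVAL, while a classic argument that begins with a number is accepted.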

   Thomas


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/3] Cleanup kdump memmap= passing and e820 usage
  2013-02-06 15:23                           ` Thomas Renninger
@ 2013-02-06 23:04                             ` Eric W. Biederman
  2013-02-06 23:11                               ` H. Peter Anvin
  0 siblings, 1 reply; 35+ messages in thread
From: Eric W. Biederman @ 2013-02-06 23:04 UTC (permalink / raw)
  To: Thomas Renninger
  Cc: x86, kexec, Simon Horman, H. Peter Anvin, yinghai, vgoyal

Thomas Renninger <trenn@suse.de> writes:

> On Thursday, January 31, 2013 01:15:34 AM Thomas Renninger wrote:
>> On Wednesday, January 30, 2013 02:29:04 PM Eric W. Biederman wrote:
>> > "H. Peter Anvin" <hpa@zytor.com> writes:
>> > > On 01/30/2013 01:57 PM, Eric W. Biederman wrote:
>> > >>> Yes, those seem to be the options, and we're currently discussing which
>> > >>> one.
>> > >>> 
>> > >>> The second seems to make more sense to me.  The kexec tools build the
>> > >>> memory map anyway, and it makes sense to me at least to just build a
>> > >>> memory map with the appropriate regions marked as a dumpable type.
>> > >> 
>> > >> This dumpable type doesn't make sense to me.  Are you suggesting making
>> > >> regions that are memory but that we should not use a special memory
>> > >> type?
>> > > 
>> > > Yes.
>> > > 
>> > >> I think I would prefer to call that new type RESERVED_MEM or
>> > >> RESERVED_CACHABLE.  Being more specific is fine but dumpable certainly
>> > >> doesn't bring to mind what we are saying.  Especially since we already
>> > >> communicate which areas were memory to the last kernel in an
>> > >> architecture generic format.
>> > > 
>> > > I was thinking that marking them differently might help debugging, at
>> > > least, but yes, we can have a RESERVED_MEM type.
>> > > 
>> > > However, Thomas does have a point that the current use of fairly small
>> > > positive values for Linux-defined types is a bad idea.  We should use
>> > > negative types, or at least something north of 0x40000000 or so.
>> > 
>> > Yes.  It doesn't much matter in the kernel but when it becomes part of
>> > the ABI it is a real issue.
>> That's one point (self made up e820 type should better be kept kernel
>> internal).
>
> There is another important point, why the command line approach
> should be preferred:
> Backward compatibility and the ability to backport the whole stuff to
> fix mmconf in kdump which would be nice for example for SLES11.

Backward compatibility argues for editing the e820 map because we can do
that at any time, with no dependencies on any kernel changes.  Only
the E820_RAM type will be treated as RAM.  Any unrecognized e820 type
will be treated as reserved.  The code has always been like that.

A new reserved value would be nice for communicating to the kernel which
areas are really RAM that it isn't allowed to touch, but it is unnecessary
at this point.  Even by just marking the memory regions we don't use as
E820_RESERVED we match what is currently being done.

Since a new reserved value has not been selected, let me suggest
0x6b646d70, aka "kdmp" in ASCII.
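For anyone who wants to check the encoding, a small sketch that unpacks such a value into its four bytes, most significant first (the helper name is invented for illustration):

```c
#include <stdint.h>

/* Unpack a 32-bit value into four characters, most significant byte first. */
void e820_type_to_tag(uint32_t type, char out[5])
{
	out[0] = (char)((type >> 24) & 0xff);
	out[1] = (char)((type >> 16) & 0xff);
	out[2] = (char)((type >> 8) & 0xff);
	out[3] = (char)(type & 0xff);
	out[4] = '\0';
}
```

Feeding it 0x6b646d70 gives the bytes 0x6b 0x64 0x6d 0x70, which spell "kdmp".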

For backwards compatibility I prefer editing the e820 map in
/sbin/kexec.


My real preference would be to define a command line option that will
work on all architectures that implement kdump, as the crashkernel option
does.  Unfortunately it looks like that ship has sailed, and there isn't
enough desire to fix this to come up with a generic option that will
work on more than just x86.  But if we could get past the kernel
versioning and figure out an arch-generic solution it might be worth it.

> kexec-tools can detect the kernel version of the kernel which is loaded
> as kdump/crash kernel. If its version is:
> "$MAINLINE_VERSION_THE_CHANGE_GETS_INTRODUCED"
> or newer, things are fine.
> But if the kernel version is older, there is no way for kexec-tools to
> find out whether the older kernel may have the feature included.
> That's bad!

That is totally unnecessary for the e820 map because anything
unrecognized is treated as reserved, and for the sufficiently paranoid
we don't need to use a new memory type.

> In case of the command line approach, kexec-tools can pass the whole memmap=
> mess as passed before, plus the new format: memmap=kdump_reserve_usable,X@Y.
> In older kernels the newly formatted string will get passed to:
> memparse("kdump_reserve_usable,X@Y")
> and the memmap early_param function will return with -EINVAL:
>         mem_size = memparse(p, &p);
>         if (p == oldp)
>                 return -EINVAL;

Except that is totally silly as we will be giving the kernel conflicting
directions which will be horribly ugly to parse.   We have to look at
either the kernel version or the boot protocol and decide if we can use
the new thing.

> Ok, the kdump kernel which does not have the stuff backported would issue a:
> printk(KERN_WARNING "Malformed early option '%s'\n", param);
> which could get ignored. I guess this is fine compared to any other backport
> nightmare approach.

The existing e820 handling for unknown types is much much better.  It
just treats them as reserved and goes about its merry way.

Eric



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/3] Cleanup kdump memmap= passing and e820 usage
  2013-02-06 23:04                             ` Eric W. Biederman
@ 2013-02-06 23:11                               ` H. Peter Anvin
  2013-02-06 23:39                                 ` Eric W. Biederman
  0 siblings, 1 reply; 35+ messages in thread
From: H. Peter Anvin @ 2013-02-06 23:11 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: x86, kexec, Simon Horman, yinghai, Thomas Renninger, vgoyal

On 02/06/2013 03:04 PM, Eric W. Biederman wrote:
>>
>> There is another important point, why the command line approach
>> should be preferred:
>> Backward compatibility and the ability to backport the whole stuff to
>> fix mmconf in kdump which would be nice for example for SLES11.
> 
> Backward compatibility argues for editing the e820 map because we can do
> that at any time, with no dependencies on any kernel changes.  Only
> the E820_RAM type will be treated as RAM.  Any unrecognized e820 type
> will be treated as reserved.  The code has always been like that.
> 
> A new reserved value would be nice to communicate to the kernel areas
> that are really ram but it isn't allowed to touch but is unnecessary at
> this point.  Even with just marking memory regions we don't use as
> E820_RESERVED we match what is currently being done.
> 
> Since a new reserved value has not been selected, let me suggest
> 0x6b646d70, aka "kdmp" in ASCII.
> 

I (somewhat) would like to keep the reserved numbers in a small(ish)
range, which argues against that specific constant.  I kind of like
0x6bxxxxxx ("k") though, it has some flair to it.

> For backwards compatibility I prefer editing the e820 map in
> /sbin/kexec.
> 
> 
> My real preference would be to define a command line option that will
> work on all architectures that implement kdump, as the crashkernel option
> does.  Unfortunately it looks like that ship has sailed, and there isn't
> enough desire to fix this to come up with a generic option that will
> work on more than just x86.  But if we could get past the kernel
> versioning and figure out an arch-generic solution it might be worth it.
> 

What would that option look like?

>> kexec-tools can detect the kernel version of the kernel which is loaded
>> as kdump/crash kernel. If its version is:
>> "$MAINLINE_VERSION_THE_CHANGE_GETS_INTRODUCED"
>> or newer, things are fine.
>> But if the kernel version is older, there is no way for kexec-tools to
>> find out whether the older kernel may have the feature included.
>> That's bad!
> 
> That is totally unnecessary for the e820 map because anything
> unrecognized is treated as reserved, and for the sufficiently paranoid
> we don't need to use a new memory type.

The only issue is if kdump needs the memory it is going to dump to be
mapped; we don't map reserved memory anymore unless explicitly requested
via ioremap().  Does it?

> The existing e820 handling for unknown types is much much better.  It
> just treats them as reserved and goes about its merry way.

It sounds like this is the way to go.

	-hpa



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 0/3] Cleanup kdump memmap= passing and e820 usage
  2013-02-06 23:11                               ` H. Peter Anvin
@ 2013-02-06 23:39                                 ` Eric W. Biederman
  2013-02-08 20:08                                   ` Thomas Renninger
  0 siblings, 1 reply; 35+ messages in thread
From: Eric W. Biederman @ 2013-02-06 23:39 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: x86, kexec, Simon Horman, yinghai, Thomas Renninger, vgoyal

"H. Peter Anvin" <hpa@zytor.com> writes:

> On 02/06/2013 03:04 PM, Eric W. Biederman wrote:
>>>
>>> There is another important point, why the command line approach
>>> should be preferred:
>>> Backward compatibility and the ability to backport the whole stuff to
>>> fix mmconf in kdump which would be nice for example for SLES11.
>> 
>> Backward compatibility argues for editing the e820 map because we can do
>> that at any time, with no dependencies on any kernel changes.  Only
>> the E820_RAM type will be treated as RAM.  Any unrecognized e820 type
>> will be treated as reserved.  The code has always been like that.
>> 
>> A new reserved value would be nice to communicate to the kernel areas
>> that are really ram but it isn't allowed to touch but is unnecessary at
>> this point.  Even with just marking memory regions we don't use as
>> E820_RESERVED we match what is currently being done.
>> 
>> Since a new reserved value has not been selected, let me suggest
>> 0x6b646d70, aka "kdmp" in ASCII.
>> 
>
> I (somewhat) would like to keep the reserved numbers in a small(ish)
> range which argue against that specific constant.  I kind of like
> 0x6bxxxxxx ("k") though, it has some flair to it.

Well, if someone doesn't reserve such a constant in a well-known place, the
historical solution is to pick a random number and hope you don't
collide with someone else's random number.  We are pretty close to that
right now with the e820 map.

And coming up sometime soonish is the question of how we do this for the
EFI memory map.

We do need to regenerate the map in /sbin/kexec though to handle
the case of memory hotplug (which necessitates reloading our crash
kernel).

>> For backwards compatibility I prefer editing the e820 map in
>> /sbin/kexec.
>> 
>> 
>> My real preference would be to define a command line option that will
>> work on all architectures that implement kdump, as the crashkernel option
>> does.  Unfortunately it looks like that ship has sailed, and there isn't
>> enough desire to fix this to come up with a generic option that will
>> work on more than just x86.  But if we could get past the kernel
>> versioning and figure out an arch-generic solution it might be worth it.
>> 
>
> What would that option look like?

Probably something like "usemem=<size>@<addr>,..."
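
To make that syntax concrete, here is a minimal sketch of how such an
option could be parsed.  No "usemem=" option exists in the kernel or in
kexec-tools; the names parse_size(), parse_usemem() and struct mem_range
are made up for illustration, and the K/M/G suffix handling is only
modelled on what the kernel's memparse() accepts.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* One usable range taken from a hypothetical "usemem=" option. */
struct mem_range {
	unsigned long long start;
	unsigned long long size;
};

/* Parse a number with an optional K/M/G suffix, similar to memparse(). */
static unsigned long long parse_size(const char *s, char **end)
{
	unsigned long long val = strtoull(s, end, 0);

	if (*end == s)
		return 0;	/* no digits: leave *end == s so the caller sees it */

	switch (toupper((unsigned char)**end)) {
	case 'G':
		val <<= 10;
		/* fall through */
	case 'M':
		val <<= 10;
		/* fall through */
	case 'K':
		val <<= 10;
		(*end)++;
		break;
	default:
		break;
	}
	return val;
}

/*
 * Parse "<size>@<addr>[,<size>@<addr>...]" into at most max ranges.
 * Returns the number of ranges parsed, or -1 on a syntax error.
 */
static int parse_usemem(const char *arg, struct mem_range *out, int max)
{
	int n = 0;

	assert(arg && out && max > 0);
	while (*arg) {
		char *end;

		if (n == max)
			return -1;
		out[n].size = parse_size(arg, &end);
		if (end == arg || *end != '@')
			return -1;
		arg = end + 1;
		out[n].start = parse_size(arg, &end);
		if (end == arg)
			return -1;
		n++;
		if (*end == ',')
			arg = end + 1;
		else if (*end == '\0')
			arg = end;
		else
			return -1;
	}
	return n;
}
```

With this sketch, "64M@16M,0x1000@0x2000" yields two ranges, while a
string without an '@' is rejected.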

>>> kexec-tools can detect the kernel version of the kernel which is loaded
>>> as kdump/crash kernel. If its version is:
>>> "$MAINLINE_VERSION_THE_CHANGE_GETS_INTRODUCED"
>>> or newer, things are fine.
>>> But if the kernel version is older, there is no way for kexec-tools to
>>> find out whether the older kernel may have the feature included.
>>> That's bad!
>> 
>> That is totally unnecessary for the e820 map because anything
>> unrecognized is treated as reserved, and for the sufficiently paranoid
>> we don't need to use a new memory type.
>
> The only issue is if kdump needs the memory it is going to dump to be
> mapped; we don't map reserved memory anymore unless explicitly requested
> via ioremap().  Does it?

I don't think that it makes sense for the memory to be permanently
mapped.  Even at 4MB per terabyte with 2M pages, for the bigger systems
that becomes a noticeable amount of our memory to reserve for kdump.
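
The 4MB figure can be checked with a little arithmetic, assuming the
usual 8-byte x86-64 page table entries and counting only the PMD level
(the levels above it add just a few more pages).  The helper name
pmd_table_bytes() and the two macros are made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define PMD_PAGE_SIZE	(2ULL << 20)	/* one 2M page per PMD entry */
#define PT_ENTRY_SIZE	8ULL		/* bytes per x86-64 page table entry */

/* Bytes of PMD-level page tables needed to map mem_bytes with 2M pages. */
static uint64_t pmd_table_bytes(uint64_t mem_bytes)
{
	assert(mem_bytes % PMD_PAGE_SIZE == 0);
	return mem_bytes / PMD_PAGE_SIZE * PT_ENTRY_SIZE;
}
```

One terabyte is 2^19 pages of 2M each, so at 8 bytes per entry the
tables take 2^22 bytes, i.e. 4MB.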

In the general picture we do need to track the memory so that we
remember how the memory should be cached or we run into the possibility
of getting the caching bits set into an inconsistent state.

There is presently work to modify /dev/oldmem and /proc/vmcore so
that they are mmapable, so that userspace can control how much is
ioremapped at once.  Currently, on the larger systems, there is a
major performance problem with mapping a single page at a time and
copying that to userspace.

>> The existing e820 handling for unknown type is much much better.  It
>> just treats them as reserved and goes about its merry way.
>
> It sounds like this is the way to go.

It certainly looks good.  We still need someone with the time to write
the patch and test it.

Eric


* Re: [PATCH 0/3] Cleanup kdump memmap= passing and e820 usage
  2013-02-06 23:39                                 ` Eric W. Biederman
@ 2013-02-08 20:08                                   ` Thomas Renninger
  2013-02-08 20:25                                     ` Eric W. Biederman
  0 siblings, 1 reply; 35+ messages in thread
From: Thomas Renninger @ 2013-02-08 20:08 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: x86, kexec, Simon Horman, H. Peter Anvin, yinghai, vgoyal

On Wednesday, February 06, 2013 03:39:50 PM Eric W. Biederman wrote:
> "H. Peter Anvin" <hpa@zytor.com> writes:
...
> >> My real preference would be to define a command line option that will
> >> work on all architectures that implement kdump, as the crashkernel option
> >> does.  Unfortunately it looks like that ship has sailed, and there isn't
> >> enough desire to fix this to come up with a generic option that will
> >> work on more than just x86.  But if we could get past the kernel
> >> versioning and figure out an arch-generic solution it might be worth it.
> > 
> > What would that option look like?
> 
> Probably something like "usemem=<size>@<addr>,..."
If the e820 table approach is taken, x86 would not need any such
parameter at all anymore, right?
All the memmap= stuff could vanish; only the elfcorehdr= param would remain.
 
...
> >> The existing e820 handling for unknown type is much much better.  It
> >> just treats them as reserved and goes about its merry way.
If the new kdump type is treated as reserved and things work out,
I agree that this would be the most elegant approach, especially
for backporting etc.
In a kernel which has the patch/functionality backported, I would then
do it like this:
  - If the special kdump e820 type shows up, all memmap options from
    memmap=exactmap on are ignored and the e820 table passed by
    kexec-tools is used just as it is.
    -> This would still allow e820 modifications through memmap=
    if passed manually for debugging; they just have to show up before
    the ones generated by kexec-tools. Anyway, I will also send a patch
    showing how I think this can be backported and still work with old
    and new kexec-tools versions.

> > It sounds like this is the way to go.
> 
> It certainly looks good.  We still need someone with the time to write
> the patch and test it.

I will try to find time early next week to put some code together and
give it some initial testing, but I cannot promise anything.

Thanks, everybody, for helping to find the best solution,

  Thomas


* Re: [PATCH 0/3] Cleanup kdump memmap= passing and e820 usage
  2013-02-08 20:08                                   ` Thomas Renninger
@ 2013-02-08 20:25                                     ` Eric W. Biederman
  2013-02-08 20:56                                       ` Thomas Renninger
  0 siblings, 1 reply; 35+ messages in thread
From: Eric W. Biederman @ 2013-02-08 20:25 UTC (permalink / raw)
  To: Thomas Renninger
  Cc: x86, kexec, Simon Horman, H. Peter Anvin, yinghai, vgoyal

Thomas Renninger <trenn@suse.de> writes:

> On Wednesday, February 06, 2013 03:39:50 PM Eric W. Biederman wrote:
>> "H. Peter Anvin" <hpa@zytor.com> writes:

>> >> The existing e820 handling for unknown type is much much better.  It
>> >> just treats them as reserved and goes about its merry way.
> If the new kdump type is treated as reserved and things work out,
> I agree that this would be the most elegant approach, especially also
> for backporting etc.

Only kexec-tools needs the functionality, so no kernel-level backporting
is necessary.

> In a kernel which has the patch/functionality backported I would do
> it like this then:
>   - If the special kdump e820 type shows up, all memmap options from
>     memmap=exactmap on are ignored and the kexec-tools passed
>     e820 table is used just as it is.
>     -> This would still allow e820 modifications through memmap=
>     if passed manually for debugging, they just have to show up before
>     the kexec-tools generated ones. Anyway, I will also send a patch
>     how I think this can be backported and still work with old and new
>     kexec-tools versions.

Way overcomplicated.  kexec-tools can just stop passing the memmap=
options entirely for every kernel.

We have not actually identified a use that the kernel would make of the
reserved areas.  Coming up with a new mapping type is just hedging our
bets in case there is a reason we want to know what is actually RAM at
some point in the future.
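
A rough, untested sketch of the map editing this implies in /sbin/kexec:
RAM inside the crash kernel region stays E820_RAM, and every other RAM
range is retyped so the crash kernel leaves it alone.  The function name
e820_for_crash_kernel() and the E820_KDUMP define are made up for
illustration (0x6b646d70 is only the value floated earlier in this
thread, nothing has been allocated), and the struct is a simplified
stand-in for the packed boot-protocol entry.

```c
#include <assert.h>
#include <stdint.h>

#define E820_RAM	1
#define E820_RESERVED	2
/* Hypothetical type from this thread ("kdmp" in ASCII); not allocated. */
#define E820_KDUMP	0x6b646d70u

/* Simplified; the real boot-protocol structure is packed. */
struct e820entry {
	uint64_t addr;
	uint64_t size;
	uint32_t type;
};

/*
 * Build the e820 map for the crash kernel: RAM inside
 * [crash_start, crash_end) stays E820_RAM, RAM outside it gets
 * hidden_type, and all other entries are copied unchanged.
 * Returns the number of entries written, or -1 if out is too small.
 */
static int e820_for_crash_kernel(const struct e820entry *in, int n_in,
				 struct e820entry *out, int max_out,
				 uint64_t crash_start, uint64_t crash_end,
				 uint32_t hidden_type)
{
	int n = 0;

	assert(crash_start < crash_end);
	for (int i = 0; i < n_in; i++) {
		uint64_t start = in[i].addr;
		uint64_t end = start + in[i].size;
		/* Cut points: a RAM entry splits into at most three pieces. */
		uint64_t cut[4] = { start, crash_start, crash_end, end };

		if (in[i].type != E820_RAM) {
			if (n == max_out)
				return -1;
			out[n++] = in[i];
			continue;
		}
		/* Clamp the two inner cut points into [start, end]. */
		for (int k = 1; k < 3; k++) {
			if (cut[k] < start)
				cut[k] = start;
			if (cut[k] > end)
				cut[k] = end;
		}
		for (int k = 0; k < 3; k++) {
			if (cut[k] == cut[k + 1])
				continue;	/* empty piece */
			if (n == max_out)
				return -1;
			out[n].addr = cut[k];
			out[n].size = cut[k + 1] - cut[k];
			/* Only the middle piece lies inside the crash region. */
			out[n].type = (k == 1) ? E820_RAM : hidden_type;
			n++;
		}
	}
	return n;
}
```

Passing E820_RESERVED as hidden_type gives the conservative variant that
needs no new type at all; a kernel that does not recognize the type
treats such ranges as reserved anyway.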

>> > It sounds like this is the way to go.
>> 
>> It certainly looks good.  We still need someone with the time to write
>> the patch and test it.
>
> I try to find time for this early next week to code something together and
> already give it some testing, but I cannot promise anything.
>
> Thanks everybody for the help to find the best solution,

Eric


* Re: [PATCH 0/3] Cleanup kdump memmap= passing and e820 usage
  2013-02-08 20:25                                     ` Eric W. Biederman
@ 2013-02-08 20:56                                       ` Thomas Renninger
  0 siblings, 0 replies; 35+ messages in thread
From: Thomas Renninger @ 2013-02-08 20:56 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: x86, kexec, Simon Horman, H. Peter Anvin, yinghai, vgoyal

On Friday, February 08, 2013 12:25:14 PM Eric W. Biederman wrote:
> Thomas Renninger <trenn@suse.de> writes:
...
> > In a kernel which has the patch/functionality backported I would do
> > it like this then:
> >   - If the special kdump e820 type shows up, all memmap options from
> >     memmap=exactmap on are ignored and the kexec-tools passed
> >     e820 table is used just as it is.
> >     -> This would still allow e820 modifications through memmap=
> >     if passed manually for debugging, they just have to show up before
> >     the kexec-tools generated ones. Anyway, I will also send a patch
> >     how I think this can be backported and still work with old and new
> >     kexec-tools versions.
> 
> Way over complicated.  kexec-tools can just stop passing the memmap=
> options entirely for every kernel.
Ah yes, sure. This is really nice.
I hope things work out like that; I'll report back next week.

Have a nice weekend,

  Thomas


end of thread, other threads:[~2013-02-08 20:56 UTC | newest]

Thread overview: 35+ messages
2013-01-22 15:02 [PATCH 0/3] Make use of new memmap= kernel parameter syntax Thomas Renninger
2013-01-22 15:02 ` [PATCH 1/3] kexec: Split kernel_version() to also be able to pass a release string Thomas Renninger
2013-01-22 15:02 ` [PATCH 2/3] kexec x86: Extract kernel version and convert it to KERNEL_VERSION() style Thomas Renninger
2013-01-22 15:02 ` [PATCH 3/3] kexec x86: Make kexec aware of new memmap= kernel parameter possibilities Thomas Renninger
2013-01-30  4:31 ` [PATCH 0/3] Make use of new memmap= kernel parameter syntax Simon Horman
2013-01-30  5:40   ` H. Peter Anvin
2013-01-30  5:52     ` Simon Horman
2013-01-30 16:03     ` Thomas Renninger
2013-01-30 16:06       ` [PATCH 1/3] x86 e820: Check for exactmap appearance when parsing first memmap option Thomas Renninger
2013-01-30 16:09         ` H. Peter Anvin
2013-01-30 16:08       ` [PATCH 2/3] x86: Introduce Linux kernel specific E820_RESERVED_KDUMP e820 memory range type Thomas Renninger
2013-01-30 16:10       ` [PATCH 3/3] x86 e820: Introduce memmap=kdump_reserve_usable for kdump usage Thomas Renninger
2013-01-30 16:10       ` [PATCH 0/3] Make use of new memmap= kernel parameter syntax H. Peter Anvin
2013-01-30 16:13       ` [PATCH 0/3] Cleanup kdump memmap= passing and e820 usage Thomas Renninger
2013-01-30 16:16         ` H. Peter Anvin
2013-01-30 16:39           ` Thomas Renninger
2013-01-30 16:52             ` H. Peter Anvin
2013-01-30 17:41               ` Yinghai Lu
2013-01-30 18:52               ` Eric W. Biederman
2013-01-30 21:38                 ` H. Peter Anvin
2013-01-30 21:57                   ` Eric W. Biederman
2013-01-30 22:10                     ` H. Peter Anvin
2013-01-30 22:29                       ` Eric W. Biederman
2013-01-30 22:41                         ` H. Peter Anvin
2013-01-30 22:49                           ` Yinghai Lu
2013-01-31  0:15                         ` Thomas Renninger
2013-01-31  0:18                           ` H. Peter Anvin
2013-01-31  9:11                             ` Thomas Renninger
2013-02-06 15:23                           ` Thomas Renninger
2013-02-06 23:04                             ` Eric W. Biederman
2013-02-06 23:11                               ` H. Peter Anvin
2013-02-06 23:39                                 ` Eric W. Biederman
2013-02-08 20:08                                   ` Thomas Renninger
2013-02-08 20:25                                     ` Eric W. Biederman
2013-02-08 20:56                                       ` Thomas Renninger
