All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yinghai Lu <yinghai@kernel.org>
To: Thomas Renninger <trenn@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>,
	MUNEDA Takahiro <muneda.takahiro@jp.fujitsu.com>,
	Takao Indoh <indou.takao@jp.fujitsu.com>,
	linux-pci@vger.kernel.org, x86@kernel.org,
	linux-kernel@vger.kernel.org, andi@firstfloor.org,
	tokunaga.keiich@jp.fujitsu.com, kexec@lists.infradead.org,
	hbabu@us.ibm.com, mingo@redhat.com, ddutile@redhat.com,
	vgoyal@redhat.com, ishii.hironobu@jp.fujitsu.com,
	bhelgaas@google.com, tglx@linutronix.de, khalid@gonehiking.org,
	horms@verge.net.au
Subject: Re: [PATCH] x86 e820: only void usable memory areas in memmap=exactmap case
Date: Sun, 13 Jan 2013 18:43:46 -0800	[thread overview]
Message-ID: <CAE9FiQV+W21WcZuuWJYJKXhN4sf0mi=iAFgjX644c3EytPXptw@mail.gmail.com> (raw)
In-Reply-To: <2741365.qqWvlU8HGl@hammer82.arch.suse.de>

[-- Attachment #1: Type: text/plain, Size: 3628 bytes --]

On Sun, Jan 13, 2013 at 6:08 PM, Thomas Renninger <trenn@suse.de> wrote:
> On Saturday, January 12, 2013 09:07:12 AM Yinghai Lu wrote:
>> On Sat, Jan 12, 2013 at 3:31 AM, Thomas Renninger <trenn@suse.de> wrote:
>> >         memmap=exactmap [KNL,X86] Enable setting of an exact
>> >
>> > -                       E820 memory map, as specified by the user.
>> > -                       Such memmap=exactmap lines can be constructed
>> > based on -                       BIOS output or other requirements. See
>> > the memmap=nn@ss -                       option description.
>> > +                       E820 usable memory map, as specified by the user.
>> > +                       All unusable (reserved, ACPI, NVS,...) ranges from
>> > the +                       original e820 table are preserved.
>> > +                       But the usable memory regions from the original
>> > e820 +                       table are removed.
>> > +                       This parameter is explicitly for kdump usage:
>> > +                       The memory the kdump kernel is allowed to use must
>> > +                       be passed via below memmap=nn[KMG]@ss[KMG] param.
>> > +                       All reserved regions the kernel may use for
>> > ioremapping +                       and similar are still considered.
>> > +
>> > +       memmap=voidmap  [KNL,X86] Do not use any e820 ranges from BIOS or
>> > +                       bootloader. Instead you have to pass regions via
>> > +                       below memmap= options.
>>
>> I would suggest to keep memmap=exactmap meaning not changed, and add
>> memmap=exactusablemap
>> instead.
> I disagree.
> I would like to change memmap=exactmap behavior.
>
> Why:
> 1)
> This is a sever bug (for kdump/kexec). I could imagine quite a lot
> kdump related bugs get silently solved by this fix.
> I expect we agree that the change, however it looks like in the end, should
> still get in in this kernel round?
> With my approach, I would also suggest to spread this to stable kernels.
> -> No need to update/patch kexec-tools, things will just work as they should.
>
> 2)
> I would introduce 2 new memmap= options. However they look like, for example:
> =void_usable_map
> =void_orig_map
> and deprecated exactmap= via
> printk(KERN_INFO "exactmap usage changed and is deprecated\n");
> but still fix it.
> Latest kexec-tools will just use memmap=void_usable_map, old ones
> are still fixed via stable kernel updates.

everyone could understand it straightforward:
exactmap:  memmap will be specified, and it should be honored without
considering old any memmap.
exactusablemap: will make sure only old ram range get removed, and
specified usable ranges will become final usable
ranges in the final memmap. so we need to remove overlapping to old
reserved ranges.

aka: exact means EXACT ...

>
>
>> kexec-tools could be updated to support exactusablemap with
>> kernelversion checking for kdump.
> No, please not. It will be a maintenance/compatibility issue that will remain
> for years or ever in kdump and/or the kernel and it's not necessary.

I don't see there is any problem with it.

those just some kind of improvement without considering kdump.
because kdump/scripts already does good job to provide right exactmap
with usable and acpi reserved areas.
for mmconf, some system that range reserved in e820, and some have that in ACPI.
and those systems will have that mmconf enabled in kdumped kernel.

attached is what I like to have with exactusablemap, but maybe is not
needed, and we can just stay with exactmap...

ps: we don't need to add e820_remove_type...

Yinghai

[-- Attachment #2: e820_exactusablemap.patch --]
[-- Type: application/octet-stream, Size: 1819 bytes --]

---
 arch/x86/kernel/e820.c |   20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

Index: linux-2.6/arch/x86/kernel/e820.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/e820.c
+++ linux-2.6/arch/x86/kernel/e820.c
@@ -836,6 +836,7 @@ static int __init parse_memopt(char *p)
 early_param("mem", parse_memopt);
 
 static bool __initdata exactmap_parsed;
+static bool __initdata exactusablemap_parsed;
 
 static int __init parse_memmap_one(char *p)
 {
@@ -845,7 +846,7 @@ static int __init parse_memmap_one(char
 	if (!p)
 		return -EINVAL;
 
-	if (!strncmp(p, "exactmap", 8)) {
+	if (!strncmp(p, "exactmap", 8) || !strncmp(p, "exactusablemap", 14)) {
 		if (exactmap_parsed)
 			return 0;
 
@@ -858,6 +859,13 @@ static int __init parse_memmap_one(char
 		 */
 		saved_max_pfn = e820_end_of_ram_pfn();
 #endif
+		if (!strncmp(p, "exactusablemap", 14)) {
+			/* remove all old E820_RAM ranges */
+			e820_remove_range(0, ULLONG_MAX, E820_RAM, 1);
+			exactusablemap_parsed = true;
+			userdef = 1;
+			return 0;
+		}
 		e820.nr_map = 0;
 		userdef = 1;
 		return 0;
@@ -871,6 +879,11 @@ static int __init parse_memmap_one(char
 	userdef = 1;
 	if (*p == '@') {
 		start_at = memparse(p+1, &p);
+		if (exactusablemap_parsed) {
+			/* remove all range with other types */
+			e820_remove_range(start_at, mem_size,
+						 E820_RAM, 0);
+		}
 		e820_add_region(start_at, mem_size, E820_RAM);
 	} else if (*p == '#') {
 		start_at = memparse(p+1, &p);
@@ -890,6 +903,11 @@ static int __init parse_memmap_opt(char
 	p = strstr(p, "exactmap");
 	if (p)
 		parse_memmap_one("exactmap");
+	else {
+		p = strstr(boot_command_line, "exactusablemap");
+		if (p)
+			parse_memmap_one("exactusablemap");
+	}
 
 	while (str) {
 		char *k = strchr(str, ',');

WARNING: multiple messages have this Message-ID (diff)
From: Yinghai Lu <yinghai@kernel.org>
To: Thomas Renninger <trenn@suse.de>
Cc: horms@verge.net.au, Takao Indoh <indou.takao@jp.fujitsu.com>,
	MUNEDA Takahiro <muneda.takahiro@jp.fujitsu.com>,
	tokunaga.keiich@jp.fujitsu.com, linux-pci@vger.kernel.org,
	x86@kernel.org, kexec@lists.infradead.org,
	linux-kernel@vger.kernel.org, hbabu@us.ibm.com,
	andi@firstfloor.org, ddutile@redhat.com,
	ishii.hironobu@jp.fujitsu.com, "H. Peter Anvin" <hpa@zytor.com>,
	bhelgaas@google.com, tglx@linutronix.de, mingo@redhat.com,
	vgoyal@redhat.com, khalid@gonehiking.org
Subject: Re: [PATCH] x86 e820: only void usable memory areas in memmap=exactmap case
Date: Sun, 13 Jan 2013 18:43:46 -0800	[thread overview]
Message-ID: <CAE9FiQV+W21WcZuuWJYJKXhN4sf0mi=iAFgjX644c3EytPXptw@mail.gmail.com> (raw)
In-Reply-To: <2741365.qqWvlU8HGl@hammer82.arch.suse.de>

[-- Attachment #1: Type: text/plain, Size: 3628 bytes --]

On Sun, Jan 13, 2013 at 6:08 PM, Thomas Renninger <trenn@suse.de> wrote:
> On Saturday, January 12, 2013 09:07:12 AM Yinghai Lu wrote:
>> On Sat, Jan 12, 2013 at 3:31 AM, Thomas Renninger <trenn@suse.de> wrote:
>> >         memmap=exactmap [KNL,X86] Enable setting of an exact
>> >
>> > -                       E820 memory map, as specified by the user.
>> > -                       Such memmap=exactmap lines can be constructed
>> > based on -                       BIOS output or other requirements. See
>> > the memmap=nn@ss -                       option description.
>> > +                       E820 usable memory map, as specified by the user.
>> > +                       All unusable (reserved, ACPI, NVS,...) ranges from
>> > the +                       original e820 table are preserved.
>> > +                       But the usable memory regions from the original
>> > e820 +                       table are removed.
>> > +                       This parameter is explicitly for kdump usage:
>> > +                       The memory the kdump kernel is allowed to use must
>> > +                       be passed via below memmap=nn[KMG]@ss[KMG] param.
>> > +                       All reserved regions the kernel may use for
>> > ioremapping +                       and similar are still considered.
>> > +
>> > +       memmap=voidmap  [KNL,X86] Do not use any e820 ranges from BIOS or
>> > +                       bootloader. Instead you have to pass regions via
>> > +                       below memmap= options.
>>
>> I would suggest to keep memmap=exactmap meaning not changed, and add
>> memmap=exactusablemap
>> instead.
> I disagree.
> I would like to change memmap=exactmap behavior.
>
> Why:
> 1)
> This is a sever bug (for kdump/kexec). I could imagine quite a lot
> kdump related bugs get silently solved by this fix.
> I expect we agree that the change, however it looks like in the end, should
> still get in in this kernel round?
> With my approach, I would also suggest to spread this to stable kernels.
> -> No need to update/patch kexec-tools, things will just work as they should.
>
> 2)
> I would introduce 2 new memmap= options. However they look like, for example:
> =void_usable_map
> =void_orig_map
> and deprecated exactmap= via
> printk(KERN_INFO "exactmap usage changed and is deprecated\n");
> but still fix it.
> Latest kexec-tools will just use memmap=void_usable_map, old ones
> are still fixed via stable kernel updates.

everyone could understand it straightforward:
exactmap:  memmap will be specified, and it should be honored without
considering old any memmap.
exactusablemap: will make sure only old ram range get removed, and
specified usable ranges will become final usable
ranges in the final memmap. so we need to remove overlapping to old
reserved ranges.

aka: exact means EXACT ...

>
>
>> kexec-tools could be updated to support exactusablemap with
>> kernelversion checking for kdump.
> No, please not. It will be a maintenance/compatibility issue that will remain
> for years or ever in kdump and/or the kernel and it's not necessary.

I don't see there is any problem with it.

those just some kind of improvement without considering kdump.
because kdump/scripts already does good job to provide right exactmap
with usable and acpi reserved areas.
for mmconf, some system that range reserved in e820, and some have that in ACPI.
and those systems will have that mmconf enabled in kdumped kernel.

attached is what I like to have with exactusablemap, but maybe is not
needed, and we can just stay with exactmap...

ps: we don't need to add e820_remove_type...

Yinghai

[-- Attachment #2: e820_exactusablemap.patch --]
[-- Type: application/octet-stream, Size: 1819 bytes --]

---
 arch/x86/kernel/e820.c |   20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

Index: linux-2.6/arch/x86/kernel/e820.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/e820.c
+++ linux-2.6/arch/x86/kernel/e820.c
@@ -836,6 +836,7 @@ static int __init parse_memopt(char *p)
 early_param("mem", parse_memopt);
 
 static bool __initdata exactmap_parsed;
+static bool __initdata exactusablemap_parsed;
 
 static int __init parse_memmap_one(char *p)
 {
@@ -845,7 +846,7 @@ static int __init parse_memmap_one(char
 	if (!p)
 		return -EINVAL;
 
-	if (!strncmp(p, "exactmap", 8)) {
+	if (!strncmp(p, "exactmap", 8) || !strncmp(p, "exactusablemap", 14)) {
 		if (exactmap_parsed)
 			return 0;
 
@@ -858,6 +859,13 @@ static int __init parse_memmap_one(char
 		 */
 		saved_max_pfn = e820_end_of_ram_pfn();
 #endif
+		if (!strncmp(p, "exactusablemap", 14)) {
+			/* remove all old E820_RAM ranges */
+			e820_remove_range(0, ULLONG_MAX, E820_RAM, 1);
+			exactusablemap_parsed = true;
+			userdef = 1;
+			return 0;
+		}
 		e820.nr_map = 0;
 		userdef = 1;
 		return 0;
@@ -871,6 +879,11 @@ static int __init parse_memmap_one(char
 	userdef = 1;
 	if (*p == '@') {
 		start_at = memparse(p+1, &p);
+		if (exactusablemap_parsed) {
+			/* remove all range with other types */
+			e820_remove_range(start_at, mem_size,
+						 E820_RAM, 0);
+		}
 		e820_add_region(start_at, mem_size, E820_RAM);
 	} else if (*p == '#') {
 		start_at = memparse(p+1, &p);
@@ -890,6 +903,11 @@ static int __init parse_memmap_opt(char
 	p = strstr(p, "exactmap");
 	if (p)
 		parse_memmap_one("exactmap");
+	else {
+		p = strstr(boot_command_line, "exactusablemap");
+		if (p)
+			parse_memmap_one("exactusablemap");
+	}
 
 	while (str) {
 		char *k = strchr(str, ',');

[-- Attachment #3: Type: text/plain, Size: 143 bytes --]

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

  reply	other threads:[~2013-01-14  2:43 UTC|newest]

Thread overview: 98+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-11-27  0:42 [PATCH v7 0/5] Reset PCIe devices to address DMA problem on kdump with iommu Takao Indoh
2012-11-27  0:42 ` Takao Indoh
2012-11-27  0:42 ` [PATCH v7 1/5] x86, pci: add dummy pci device for early stage Takao Indoh
2012-11-27  0:42   ` Takao Indoh
2012-11-27  0:42 ` [PATCH v7 2/5] PCI: Define the maximum number of PCI function Takao Indoh
2012-11-27  0:42   ` Takao Indoh
2012-11-27  0:42 ` [PATCH v7 3/5] Make reset_devices available at early stage Takao Indoh
2012-11-27  0:42   ` Takao Indoh
2012-11-27  0:43 ` [PATCH v7 4/5] x86, pci: Reset PCIe devices at boot time Takao Indoh
2012-11-27  0:43   ` Takao Indoh
2012-11-27  0:43 ` [PATCH v7 5/5] x86, pci: Enable PCI INTx when MSI is disabled Takao Indoh
2012-11-27  0:43   ` Takao Indoh
2012-11-30 15:49 ` [PATCH v7 0/5] Reset PCIe devices to address DMA problem on kdump with iommu MUNEDA Takahiro
2012-11-30 15:49   ` MUNEDA Takahiro
2012-12-21 16:19   ` Yinghai Lu
2012-12-21 16:19     ` Yinghai Lu
2013-01-07 19:09     ` Thomas Renninger
2013-01-07 19:09       ` Thomas Renninger
2013-01-07 20:16       ` Yinghai Lu
2013-01-07 20:16         ` Yinghai Lu
2013-01-08  0:42         ` Thomas Renninger
2013-01-08  0:42           ` Thomas Renninger
2013-01-08  3:04           ` Yinghai Lu
2013-01-08  3:04             ` Yinghai Lu
2013-01-08 16:47             ` [PATCH] Only reset e820 once, even with multiple memmap=exactmap params Thomas Renninger
2013-01-08 16:47               ` Thomas Renninger
2013-01-08 17:19               ` Yinghai Lu
2013-01-08 17:19                 ` Yinghai Lu
2013-01-10  3:21                 ` Thomas Renninger
2013-01-10  3:21                   ` Thomas Renninger
2013-01-10 14:26                   ` Vivek Goyal
2013-01-10 14:26                     ` Vivek Goyal
2013-01-10 16:53                     ` Yinghai Lu
2013-01-10 16:53                       ` Yinghai Lu
2013-01-10 17:01                       ` Vivek Goyal
2013-01-10 17:01                         ` Vivek Goyal
2013-01-10 17:11                         ` Yinghai Lu
2013-01-10 17:11                           ` Yinghai Lu
2013-01-10 23:34                   ` Yinghai Lu
2013-01-11 12:33                     ` [PATCH] x86 e820: only void usable memory areas in memmap=exactmap case Thomas Renninger
2013-01-11 12:33                       ` Thomas Renninger
2013-01-11 16:16                       ` Yinghai Lu
2013-01-11 16:16                         ` Yinghai Lu
2013-01-11 18:24                         ` Thomas Renninger
2013-01-11 18:24                           ` Thomas Renninger
2013-01-11 19:59                           ` Yinghai Lu
2013-01-11 19:59                             ` Yinghai Lu
2013-01-11 20:06                             ` H. Peter Anvin
2013-01-11 20:06                               ` H. Peter Anvin
2013-01-11 21:09                               ` Yinghai Lu
2013-01-11 21:09                                 ` Yinghai Lu
2013-01-11 22:16                                 ` H. Peter Anvin
2013-01-11 22:16                                   ` H. Peter Anvin
2013-01-12 11:31                                   ` Thomas Renninger
2013-01-12 11:31                                     ` Thomas Renninger
2013-01-12 17:07                                     ` Yinghai Lu
2013-01-12 17:07                                       ` Yinghai Lu
2013-01-14  2:08                                       ` Thomas Renninger
2013-01-14  2:08                                         ` Thomas Renninger
2013-01-14  2:43                                         ` Yinghai Lu [this message]
2013-01-14  2:43                                           ` Yinghai Lu
2013-01-14 15:05                                           ` Thomas Renninger
2013-01-14 15:05                                             ` Thomas Renninger
2013-01-14 19:04                                             ` Yinghai Lu
2013-01-14 19:04                                               ` Yinghai Lu
2013-01-15  0:54                                               ` Thomas Renninger
2013-01-15  0:54                                                 ` Thomas Renninger
2013-01-15  4:45                                                 ` Yinghai Lu
2013-01-15  4:45                                                   ` Yinghai Lu
2013-01-22 15:21                                                   ` Thomas Renninger
2013-01-22 15:21                                                     ` Thomas Renninger
2013-01-08 16:50         ` [PATCH v7 0/5] Reset PCIe devices to address DMA problem on kdump with iommu Thomas Renninger
2013-01-08 16:50           ` Thomas Renninger
2013-01-08 17:27           ` Yinghai Lu
2013-01-08 17:27             ` Yinghai Lu
2013-01-09  2:32             ` Thomas Renninger
2013-01-09  2:32               ` Thomas Renninger
2013-01-09  4:39               ` Takao Indoh
2013-01-09  4:39                 ` Takao Indoh
2013-01-21  1:11       ` Takao Indoh
2013-01-21  1:11         ` Takao Indoh
2013-01-23  0:47         ` Thomas Renninger
2013-01-23  0:47           ` Thomas Renninger
2013-01-24  0:23           ` Takao Indoh
2013-01-24  0:23             ` Takao Indoh
2013-01-29  1:14             ` Thomas Renninger
2013-01-29  1:14               ` Thomas Renninger
2013-01-30  5:01               ` Takao Indoh
2013-01-30  5:01                 ` Takao Indoh
2013-03-04  0:56           ` Takao Indoh
2013-03-04  0:56             ` Takao Indoh
2013-03-04 22:00             ` Don Dutile
2013-03-04 22:00               ` Don Dutile
2013-03-05  0:56               ` Takao Indoh
2013-03-05  0:56                 ` Takao Indoh
2012-12-21  9:59 ` oliver yang
2012-12-21 10:37   ` Takao Indoh
2012-12-21 10:37     ` Takao Indoh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAE9FiQV+W21WcZuuWJYJKXhN4sf0mi=iAFgjX644c3EytPXptw@mail.gmail.com' \
    --to=yinghai@kernel.org \
    --cc=andi@firstfloor.org \
    --cc=bhelgaas@google.com \
    --cc=ddutile@redhat.com \
    --cc=hbabu@us.ibm.com \
    --cc=horms@verge.net.au \
    --cc=hpa@zytor.com \
    --cc=indou.takao@jp.fujitsu.com \
    --cc=ishii.hironobu@jp.fujitsu.com \
    --cc=kexec@lists.infradead.org \
    --cc=khalid@gonehiking.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=muneda.takahiro@jp.fujitsu.com \
    --cc=tglx@linutronix.de \
    --cc=tokunaga.keiich@jp.fujitsu.com \
    --cc=trenn@suse.de \
    --cc=vgoyal@redhat.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.