linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] x86: fix kaslr and memmap collision
@ 2016-11-22  0:22 Dave Jiang
  2016-11-22  8:47 ` Ingo Molnar
  0 siblings, 1 reply; 13+ messages in thread
From: Dave Jiang @ 2016-11-22  0:22 UTC (permalink / raw)
  To: tglx, mingo, hpa; +Cc: dan.j.williams, x86, david, linux-kernel, linux-nvdimm

CONFIG_RANDOMIZE_BASE relocates the kernel to a random base address.
However it does not take into account the memmap= parameter passed in from
the kernel commandline. This results in the kernel sometimes being put in
the middle of the user memmap. Check has been added in the kaslr in order
to avoid the region marked by memmap.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 arch/x86/boot/boot.h             |    2 ++
 arch/x86/boot/compressed/kaslr.c |   45 ++++++++++++++++++++++++++++++++++++++
 arch/x86/boot/string.c           |   25 +++++++++++++++++++++
 3 files changed, 72 insertions(+)

diff --git a/arch/x86/boot/boot.h b/arch/x86/boot/boot.h
index e5612f3..0d5fe5b 100644
--- a/arch/x86/boot/boot.h
+++ b/arch/x86/boot/boot.h
@@ -332,6 +332,8 @@ int strncmp(const char *cs, const char *ct, size_t count);
 size_t strnlen(const char *s, size_t maxlen);
 unsigned int atou(const char *s);
 unsigned long long simple_strtoull(const char *cp, char **endp, unsigned int base);
+unsigned long simple_strtoul(const char *cp, char **endp, unsigned int base);
+long simple_strtol(const char *cp, char **endp, unsigned int base);
 size_t strlen(const char *s);
 
 /* tty.c */
diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index a66854d..6fb8f1ec 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -11,6 +11,7 @@
  */
 #include "misc.h"
 #include "error.h"
+#include "../boot.h"
 
 #include <generated/compile.h>
 #include <linux/module.h>
@@ -61,6 +62,7 @@ enum mem_avoid_index {
 	MEM_AVOID_INITRD,
 	MEM_AVOID_CMDLINE,
 	MEM_AVOID_BOOTPARAMS,
+	MEM_AVOID_MEMMAP,
 	MEM_AVOID_MAX,
 };
 
@@ -77,6 +79,37 @@ static bool mem_overlaps(struct mem_vector *one, struct mem_vector *two)
 	return true;
 }
 
+#include "../../../../lib/cmdline.c"
+
+static int
+parse_memmap(char *p, unsigned long long *start, unsigned long long *size)
+{
+	char *oldp;
+
+	if (!p)
+		return -EINVAL;
+
+	/* we don't care about this option here */
+	if (!strncmp(p, "exactmap", 8))
+		return -EINVAL;
+
+	oldp = p;
+	*size = memparse(p, &p);
+	if (p == oldp)
+		return -EINVAL;
+
+	switch (*p) {
+	case '@':
+	case '#':
+	case '$':
+	case '!':
+		*start = memparse(p+1, &p);
+		return 0;
+	}
+
+	return -EINVAL;
+}
+
 /*
  * In theory, KASLR can put the kernel anywhere in the range of [16M, 64T).
  * The mem_avoid array is used to store the ranges that need to be avoided
@@ -158,6 +191,8 @@ static void mem_avoid_init(unsigned long input, unsigned long input_size,
 	u64 initrd_start, initrd_size;
 	u64 cmd_line, cmd_line_size;
 	char *ptr;
+	char arg[38];
+	unsigned long long memmap_start, memmap_size;
 
 	/*
 	 * Avoid the region that is unsafe to overlap during
@@ -195,6 +230,16 @@ static void mem_avoid_init(unsigned long input, unsigned long input_size,
 	add_identity_map(mem_avoid[MEM_AVOID_BOOTPARAMS].start,
 			 mem_avoid[MEM_AVOID_BOOTPARAMS].size);
 
+	/* see if we have any memmap areas */
+	if (cmdline_find_option("memmap", arg, sizeof(arg)) > 0) {
+		int rc = parse_memmap(arg, &memmap_start, &memmap_size);
+
+		if (!rc) {
+			mem_avoid[MEM_AVOID_MEMMAP].start = memmap_start;
+			mem_avoid[MEM_AVOID_MEMMAP].size = memmap_size;
+		}
+	}
+
 	/* We don't need to set a mapping for setup_data. */
 
 #ifdef CONFIG_X86_VERBOSE_BOOTUP
diff --git a/arch/x86/boot/string.c b/arch/x86/boot/string.c
index cc3bd58..7a376c1 100644
--- a/arch/x86/boot/string.c
+++ b/arch/x86/boot/string.c
@@ -122,6 +122,31 @@ unsigned long long simple_strtoull(const char *cp, char **endp, unsigned int bas
 }
 
 /**
+ * simple_strtoul - convert a string to an unsigned long
+ * @cp: The start of the string
+ * @endp: A pointer to the end of the parsed string will be placed here
+ * @base: The number base to use
+ */
+unsigned long simple_strtoul(const char *cp, char **endp, unsigned int base)
+{
+	return simple_strtoull(cp, endp, base);
+}
+
+/**
+ * simple_strtol - convert a string to a signed long
+ * @cp: The start of the string
+ * @endp: A pointer to the end of the parsed string will be placed here
+ * @base: The number base to use
+ */
+long simple_strtol(const char *cp, char **endp, unsigned int base)
+{
+	if (*cp == '-')
+		return -simple_strtoul(cp + 1, endp, base);
+
+	return simple_strtoul(cp, endp, base);
+}
+
+/**
  * strlen - Find the length of a string
  * @s: The string to be sized
  */

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] x86: fix kaslr and memmap collision
  2016-11-22  0:22 [PATCH] x86: fix kaslr and memmap collision Dave Jiang
@ 2016-11-22  8:47 ` Ingo Molnar
  2016-11-22 17:26   ` Dan Williams
  0 siblings, 1 reply; 13+ messages in thread
From: Ingo Molnar @ 2016-11-22  8:47 UTC (permalink / raw)
  To: Dave Jiang
  Cc: tglx, mingo, hpa, dan.j.williams, x86, david, linux-kernel,
	linux-nvdimm, Kees Cook


* Dave Jiang <dave.jiang@intel.com> wrote:

> CONFIG_RANDOMIZE_BASE relocates the kernel to a random base address.
> However it does not take into account the memmap= parameter passed in from
> the kernel commandline.

memmap= parameters are often used as a list.

> [...] This results in the kernel sometimes being put in the middle of the user 
> memmap. [...]

What does this mean? If memmap= is used to re-define the memory map then the 
kernel getting in the middle of a RAM area is what we want, isn't it? What we 
don't want is for the kernel to get into reserved areas, right?

> [...] Check has been added in the kaslr in order to avoid the region marked by 
> memmap.

What does this mean?

> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> ---
>  arch/x86/boot/boot.h             |    2 ++
>  arch/x86/boot/compressed/kaslr.c |   45 ++++++++++++++++++++++++++++++++++++++
>  arch/x86/boot/string.c           |   25 +++++++++++++++++++++
>  3 files changed, 72 insertions(+)
> 
> diff --git a/arch/x86/boot/boot.h b/arch/x86/boot/boot.h
> index e5612f3..0d5fe5b 100644
> --- a/arch/x86/boot/boot.h
> +++ b/arch/x86/boot/boot.h
> @@ -332,6 +332,8 @@ int strncmp(const char *cs, const char *ct, size_t count);
>  size_t strnlen(const char *s, size_t maxlen);
>  unsigned int atou(const char *s);
>  unsigned long long simple_strtoull(const char *cp, char **endp, unsigned int base);
> +unsigned long simple_strtoul(const char *cp, char **endp, unsigned int base);
> +long simple_strtol(const char *cp, char **endp, unsigned int base);
>  size_t strlen(const char *s);
>  
>  /* tty.c */
> diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
> index a66854d..6fb8f1ec 100644
> --- a/arch/x86/boot/compressed/kaslr.c
> +++ b/arch/x86/boot/compressed/kaslr.c
> @@ -11,6 +11,7 @@
>   */
>  #include "misc.h"
>  #include "error.h"
> +#include "../boot.h"
>  
>  #include <generated/compile.h>
>  #include <linux/module.h>
> @@ -61,6 +62,7 @@ enum mem_avoid_index {
>  	MEM_AVOID_INITRD,
>  	MEM_AVOID_CMDLINE,
>  	MEM_AVOID_BOOTPARAMS,
> +	MEM_AVOID_MEMMAP,
>  	MEM_AVOID_MAX,
>  };
>  
> @@ -77,6 +79,37 @@ static bool mem_overlaps(struct mem_vector *one, struct mem_vector *two)
>  	return true;
>  }
>  
> +#include "../../../../lib/cmdline.c"
> +
> +static int
> +parse_memmap(char *p, unsigned long long *start, unsigned long long *size)
> +{
> +	char *oldp;
> +
> +	if (!p)
> +		return -EINVAL;
> +
> +	/* we don't care about this option here */
> +	if (!strncmp(p, "exactmap", 8))
> +		return -EINVAL;
> +
> +	oldp = p;
> +	*size = memparse(p, &p);
> +	if (p == oldp)
> +		return -EINVAL;
> +
> +	switch (*p) {
> +	case '@':
> +	case '#':
> +	case '$':
> +	case '!':
> +		*start = memparse(p+1, &p);
> +		return 0;
> +	}
> +
> +	return -EINVAL;
> +}
> +
>  /*
>   * In theory, KASLR can put the kernel anywhere in the range of [16M, 64T).
>   * The mem_avoid array is used to store the ranges that need to be avoided
> @@ -158,6 +191,8 @@ static void mem_avoid_init(unsigned long input, unsigned long input_size,
>  	u64 initrd_start, initrd_size;
>  	u64 cmd_line, cmd_line_size;
>  	char *ptr;
> +	char arg[38];

Where does the magic '38' come from?

> +	unsigned long long memmap_start, memmap_size;
>  
>  	/*
>  	 * Avoid the region that is unsafe to overlap during
> @@ -195,6 +230,16 @@ static void mem_avoid_init(unsigned long input, unsigned long input_size,
>  	add_identity_map(mem_avoid[MEM_AVOID_BOOTPARAMS].start,
>  			 mem_avoid[MEM_AVOID_BOOTPARAMS].size);
>  
> +	/* see if we have any memmap areas */
> +	if (cmdline_find_option("memmap", arg, sizeof(arg)) > 0) {
> +		int rc = parse_memmap(arg, &memmap_start, &memmap_size);
> +
> +		if (!rc) {
> +			mem_avoid[MEM_AVOID_MEMMAP].start = memmap_start;
> +			mem_avoid[MEM_AVOID_MEMMAP].size = memmap_size;
> +		}
> +	}
> +

This only handles a single (first) memmap argument, is that sufficient?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] x86: fix kaslr and memmap collision
  2016-11-22  8:47 ` Ingo Molnar
@ 2016-11-22 17:26   ` Dan Williams
  2016-11-22 18:54     ` Kees Cook
  2017-01-03  8:31     ` Baoquan He
  0 siblings, 2 replies; 13+ messages in thread
From: Dan Williams @ 2016-11-22 17:26 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Dave Jiang, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, X86 ML,
	david, linux-kernel, linux-nvdimm@lists.01.org, Kees Cook

[ replying for Dave since he's offline today and tomorrow ]

On Tue, Nov 22, 2016 at 12:47 AM, Ingo Molnar <mingo@kernel.org> wrote:
>
> * Dave Jiang <dave.jiang@intel.com> wrote:
>
>> CONFIG_RANDOMIZE_BASE relocates the kernel to a random base address.
>> However it does not take into account the memmap= parameter passed in from
>> the kernel commandline.
>
> memmap= parameters are often used as a list.
>
>> [...] This results in the kernel sometimes being put in the middle of the user
>> memmap. [...]
>
> What does this mean? If memmap= is used to re-define the memory map then the
> kernel getting in the middle of a RAM area is what we want, isn't it? What we
> don't want is for the kernel to get into reserved areas, right?

Right, this is about teaching kaslr to not land the kernel in newly
defined reserved regions that were not marked reserved in the initial
e820 map from platform firmware.

>> [...] Check has been added in the kaslr in order to avoid the region marked by
>> memmap.
>
> What does this mean?

Is this clearer? "Update the set of 'mem_avoid' entries to exclude
'memmap=' defined reserved regions from the set of valid address range
to land the kernel image."

>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>> ---
>>  arch/x86/boot/boot.h             |    2 ++
>>  arch/x86/boot/compressed/kaslr.c |   45 ++++++++++++++++++++++++++++++++++++++
>>  arch/x86/boot/string.c           |   25 +++++++++++++++++++++
>>  3 files changed, 72 insertions(+)
>>
>> diff --git a/arch/x86/boot/boot.h b/arch/x86/boot/boot.h
>> index e5612f3..0d5fe5b 100644
>> --- a/arch/x86/boot/boot.h
>> +++ b/arch/x86/boot/boot.h
>> @@ -332,6 +332,8 @@ int strncmp(const char *cs, const char *ct, size_t count);
>>  size_t strnlen(const char *s, size_t maxlen);
>>  unsigned int atou(const char *s);
>>  unsigned long long simple_strtoull(const char *cp, char **endp, unsigned int base);
>> +unsigned long simple_strtoul(const char *cp, char **endp, unsigned int base);
>> +long simple_strtol(const char *cp, char **endp, unsigned int base);
>>  size_t strlen(const char *s);
>>
>>  /* tty.c */
>> diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
>> index a66854d..6fb8f1ec 100644
>> --- a/arch/x86/boot/compressed/kaslr.c
>> +++ b/arch/x86/boot/compressed/kaslr.c
>> @@ -11,6 +11,7 @@
>>   */
>>  #include "misc.h"
>>  #include "error.h"
>> +#include "../boot.h"
>>
>>  #include <generated/compile.h>
>>  #include <linux/module.h>
>> @@ -61,6 +62,7 @@ enum mem_avoid_index {
>>       MEM_AVOID_INITRD,
>>       MEM_AVOID_CMDLINE,
>>       MEM_AVOID_BOOTPARAMS,
>> +     MEM_AVOID_MEMMAP,
>>       MEM_AVOID_MAX,
>>  };
>>
>> @@ -77,6 +79,37 @@ static bool mem_overlaps(struct mem_vector *one, struct mem_vector *two)
>>       return true;
>>  }
>>
>> +#include "../../../../lib/cmdline.c"
>> +
>> +static int
>> +parse_memmap(char *p, unsigned long long *start, unsigned long long *size)
>> +{
>> +     char *oldp;
>> +
>> +     if (!p)
>> +             return -EINVAL;
>> +
>> +     /* we don't care about this option here */
>> +     if (!strncmp(p, "exactmap", 8))
>> +             return -EINVAL;
>> +
>> +     oldp = p;
>> +     *size = memparse(p, &p);
>> +     if (p == oldp)
>> +             return -EINVAL;
>> +
>> +     switch (*p) {
>> +     case '@':
>> +     case '#':
>> +     case '$':
>> +     case '!':
>> +             *start = memparse(p+1, &p);
>> +             return 0;
>> +     }
>> +
>> +     return -EINVAL;
>> +}
>> +
>>  /*
>>   * In theory, KASLR can put the kernel anywhere in the range of [16M, 64T).
>>   * The mem_avoid array is used to store the ranges that need to be avoided
>> @@ -158,6 +191,8 @@ static void mem_avoid_init(unsigned long input, unsigned long input_size,
>>       u64 initrd_start, initrd_size;
>>       u64 cmd_line, cmd_line_size;
>>       char *ptr;
>> +     char arg[38];
>
> Where does the magic '38' come from?
>
>> +     unsigned long long memmap_start, memmap_size;
>>
>>       /*
>>        * Avoid the region that is unsafe to overlap during
>> @@ -195,6 +230,16 @@ static void mem_avoid_init(unsigned long input, unsigned long input_size,
>>       add_identity_map(mem_avoid[MEM_AVOID_BOOTPARAMS].start,
>>                        mem_avoid[MEM_AVOID_BOOTPARAMS].size);
>>
>> +     /* see if we have any memmap areas */
>> +     if (cmdline_find_option("memmap", arg, sizeof(arg)) > 0) {
>> +             int rc = parse_memmap(arg, &memmap_start, &memmap_size);
>> +
>> +             if (!rc) {
>> +                     mem_avoid[MEM_AVOID_MEMMAP].start = memmap_start;
>> +                     mem_avoid[MEM_AVOID_MEMMAP].size = memmap_size;
>> +             }
>> +     }
>> +
>
> This only handles a single (first) memmap argument, is that sufficient?

No, you're right, we need to handle multiple ranges.  Since the
mem_avoid array is statically allocated perhaps we can handle up to 4
memmap= entries, but past that point disable kaslr for that boot?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] x86: fix kaslr and memmap collision
  2016-11-22 17:26   ` Dan Williams
@ 2016-11-22 18:54     ` Kees Cook
  2016-11-22 19:01       ` Dan Williams
  2017-01-03  8:31     ` Baoquan He
  1 sibling, 1 reply; 13+ messages in thread
From: Kees Cook @ 2016-11-22 18:54 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ingo Molnar, Dave Jiang, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, X86 ML, david, linux-kernel,
	linux-nvdimm@lists.01.org

On Tue, Nov 22, 2016 at 9:26 AM, Dan Williams <dan.j.williams@intel.com> wrote:
> [ replying for Dave since he's offline today and tomorrow ]
>
> On Tue, Nov 22, 2016 at 12:47 AM, Ingo Molnar <mingo@kernel.org> wrote:
>>
>> * Dave Jiang <dave.jiang@intel.com> wrote:
>>
>>> CONFIG_RANDOMIZE_BASE relocates the kernel to a random base address.
>>> However it does not take into account the memmap= parameter passed in from
>>> the kernel commandline.
>>
>> memmap= parameters are often used as a list.
>>
>>> [...] This results in the kernel sometimes being put in the middle of the user
>>> memmap. [...]
>>
>> What does this mean? If memmap= is used to re-define the memory map then the
>> kernel getting in the middle of a RAM area is what we want, isn't it? What we
>> don't want is for the kernel to get into reserved areas, right?
>
> Right, this is about teaching kaslr to not land the kernel in newly
> defined reserved regions that were not marked reserved in the initial
> e820 map from platform firmware.
>
>>> [...] Check has been added in the kaslr in order to avoid the region marked by
>>> memmap.
>>
>> What does this mean?
>
> Is this clearer? "Update the set of 'mem_avoid' entries to exclude
> 'memmap=' defined reserved regions from the set of valid address range
> to land the kernel image."
>
>>
>>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>>> ---
>>>  arch/x86/boot/boot.h             |    2 ++
>>>  arch/x86/boot/compressed/kaslr.c |   45 ++++++++++++++++++++++++++++++++++++++
>>>  arch/x86/boot/string.c           |   25 +++++++++++++++++++++
>>>  3 files changed, 72 insertions(+)
>>>
>>> diff --git a/arch/x86/boot/boot.h b/arch/x86/boot/boot.h
>>> index e5612f3..0d5fe5b 100644
>>> --- a/arch/x86/boot/boot.h
>>> +++ b/arch/x86/boot/boot.h
>>> @@ -332,6 +332,8 @@ int strncmp(const char *cs, const char *ct, size_t count);
>>>  size_t strnlen(const char *s, size_t maxlen);
>>>  unsigned int atou(const char *s);
>>>  unsigned long long simple_strtoull(const char *cp, char **endp, unsigned int base);
>>> +unsigned long simple_strtoul(const char *cp, char **endp, unsigned int base);
>>> +long simple_strtol(const char *cp, char **endp, unsigned int base);
>>>  size_t strlen(const char *s);
>>>
>>>  /* tty.c */
>>> diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
>>> index a66854d..6fb8f1ec 100644
>>> --- a/arch/x86/boot/compressed/kaslr.c
>>> +++ b/arch/x86/boot/compressed/kaslr.c
>>> @@ -11,6 +11,7 @@
>>>   */
>>>  #include "misc.h"
>>>  #include "error.h"
>>> +#include "../boot.h"
>>>
>>>  #include <generated/compile.h>
>>>  #include <linux/module.h>
>>> @@ -61,6 +62,7 @@ enum mem_avoid_index {
>>>       MEM_AVOID_INITRD,
>>>       MEM_AVOID_CMDLINE,
>>>       MEM_AVOID_BOOTPARAMS,
>>> +     MEM_AVOID_MEMMAP,
>>>       MEM_AVOID_MAX,
>>>  };
>>>
>>> @@ -77,6 +79,37 @@ static bool mem_overlaps(struct mem_vector *one, struct mem_vector *two)
>>>       return true;
>>>  }
>>>
>>> +#include "../../../../lib/cmdline.c"
>>> +
>>> +static int
>>> +parse_memmap(char *p, unsigned long long *start, unsigned long long *size)
>>> +{
>>> +     char *oldp;
>>> +
>>> +     if (!p)
>>> +             return -EINVAL;
>>> +
>>> +     /* we don't care about this option here */
>>> +     if (!strncmp(p, "exactmap", 8))
>>> +             return -EINVAL;
>>> +
>>> +     oldp = p;
>>> +     *size = memparse(p, &p);
>>> +     if (p == oldp)
>>> +             return -EINVAL;
>>> +
>>> +     switch (*p) {
>>> +     case '@':
>>> +     case '#':
>>> +     case '$':
>>> +     case '!':
>>> +             *start = memparse(p+1, &p);
>>> +             return 0;
>>> +     }
>>> +
>>> +     return -EINVAL;
>>> +}
>>> +
>>>  /*
>>>   * In theory, KASLR can put the kernel anywhere in the range of [16M, 64T).
>>>   * The mem_avoid array is used to store the ranges that need to be avoided
>>> @@ -158,6 +191,8 @@ static void mem_avoid_init(unsigned long input, unsigned long input_size,
>>>       u64 initrd_start, initrd_size;
>>>       u64 cmd_line, cmd_line_size;
>>>       char *ptr;
>>> +     char arg[38];
>>
>> Where does the magic '38' come from?
>>
>>> +     unsigned long long memmap_start, memmap_size;
>>>
>>>       /*
>>>        * Avoid the region that is unsafe to overlap during
>>> @@ -195,6 +230,16 @@ static void mem_avoid_init(unsigned long input, unsigned long input_size,
>>>       add_identity_map(mem_avoid[MEM_AVOID_BOOTPARAMS].start,
>>>                        mem_avoid[MEM_AVOID_BOOTPARAMS].size);
>>>
>>> +     /* see if we have any memmap areas */
>>> +     if (cmdline_find_option("memmap", arg, sizeof(arg)) > 0) {
>>> +             int rc = parse_memmap(arg, &memmap_start, &memmap_size);
>>> +
>>> +             if (!rc) {
>>> +                     mem_avoid[MEM_AVOID_MEMMAP].start = memmap_start;
>>> +                     mem_avoid[MEM_AVOID_MEMMAP].size = memmap_size;
>>> +             }
>>> +     }
>>> +
>>
>> This only handles a single (first) memmap argument, is that sufficient?
>
> No, you're right, we need to handle multiple ranges.  Since the
> mem_avoid array is statically allocated perhaps we can handle up to 4
> memmap= entries, but past that point disable kaslr for that boot?

Yeah, that seems fine to me. I assume it's rare to have 4?

-Kees

-- 
Kees Cook
Nexus Security

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] x86: fix kaslr and memmap collision
  2016-11-22 18:54     ` Kees Cook
@ 2016-11-22 19:01       ` Dan Williams
  2016-11-22 22:37         ` Kees Cook
  2016-11-24  0:04         ` Dave Chinner
  0 siblings, 2 replies; 13+ messages in thread
From: Dan Williams @ 2016-11-22 19:01 UTC (permalink / raw)
  To: Kees Cook
  Cc: Ingo Molnar, Dave Jiang, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, X86 ML, david, linux-kernel,
	linux-nvdimm@lists.01.org

On Tue, Nov 22, 2016 at 10:54 AM, Kees Cook <keescook@chromium.org> wrote:
> On Tue, Nov 22, 2016 at 9:26 AM, Dan Williams <dan.j.williams@intel.com> wrote:
>> [ replying for Dave since he's offline today and tomorrow ]
>>
>> On Tue, Nov 22, 2016 at 12:47 AM, Ingo Molnar <mingo@kernel.org> wrote:
>>>
>>> * Dave Jiang <dave.jiang@intel.com> wrote:
>>>
>>>> CONFIG_RANDOMIZE_BASE relocates the kernel to a random base address.
>>>> However it does not take into account the memmap= parameter passed in from
>>>> the kernel commandline.
>>>
>>> memmap= parameters are often used as a list.
>>>
>>>> [...] This results in the kernel sometimes being put in the middle of the user
>>>> memmap. [...]
>>>
>>> What does this mean? If memmap= is used to re-define the memory map then the
>>> kernel getting in the middle of a RAM area is what we want, isn't it? What we
>>> don't want is for the kernel to get into reserved areas, right?
>>
>> Right, this is about teaching kaslr to not land the kernel in newly
>> defined reserved regions that were not marked reserved in the initial
>> e820 map from platform firmware.
>>
>>>> [...] Check has been added in the kaslr in order to avoid the region marked by
>>>> memmap.
>>>
>>> What does this mean?
>>
>> Is this clearer? "Update the set of 'mem_avoid' entries to exclude
>> 'memmap=' defined reserved regions from the set of valid address range
>> to land the kernel image."
>>
>>>
>>>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>>>> ---
>>>>  arch/x86/boot/boot.h             |    2 ++
>>>>  arch/x86/boot/compressed/kaslr.c |   45 ++++++++++++++++++++++++++++++++++++++
>>>>  arch/x86/boot/string.c           |   25 +++++++++++++++++++++
>>>>  3 files changed, 72 insertions(+)
>>>>
>>>> diff --git a/arch/x86/boot/boot.h b/arch/x86/boot/boot.h
>>>> index e5612f3..0d5fe5b 100644
>>>> --- a/arch/x86/boot/boot.h
>>>> +++ b/arch/x86/boot/boot.h
>>>> @@ -332,6 +332,8 @@ int strncmp(const char *cs, const char *ct, size_t count);
>>>>  size_t strnlen(const char *s, size_t maxlen);
>>>>  unsigned int atou(const char *s);
>>>>  unsigned long long simple_strtoull(const char *cp, char **endp, unsigned int base);
>>>> +unsigned long simple_strtoul(const char *cp, char **endp, unsigned int base);
>>>> +long simple_strtol(const char *cp, char **endp, unsigned int base);
>>>>  size_t strlen(const char *s);
>>>>
>>>>  /* tty.c */
>>>> diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
>>>> index a66854d..6fb8f1ec 100644
>>>> --- a/arch/x86/boot/compressed/kaslr.c
>>>> +++ b/arch/x86/boot/compressed/kaslr.c
>>>> @@ -11,6 +11,7 @@
>>>>   */
>>>>  #include "misc.h"
>>>>  #include "error.h"
>>>> +#include "../boot.h"
>>>>
>>>>  #include <generated/compile.h>
>>>>  #include <linux/module.h>
>>>> @@ -61,6 +62,7 @@ enum mem_avoid_index {
>>>>       MEM_AVOID_INITRD,
>>>>       MEM_AVOID_CMDLINE,
>>>>       MEM_AVOID_BOOTPARAMS,
>>>> +     MEM_AVOID_MEMMAP,
>>>>       MEM_AVOID_MAX,
>>>>  };
>>>>
>>>> @@ -77,6 +79,37 @@ static bool mem_overlaps(struct mem_vector *one, struct mem_vector *two)
>>>>       return true;
>>>>  }
>>>>
>>>> +#include "../../../../lib/cmdline.c"
>>>> +
>>>> +static int
>>>> +parse_memmap(char *p, unsigned long long *start, unsigned long long *size)
>>>> +{
>>>> +     char *oldp;
>>>> +
>>>> +     if (!p)
>>>> +             return -EINVAL;
>>>> +
>>>> +     /* we don't care about this option here */
>>>> +     if (!strncmp(p, "exactmap", 8))
>>>> +             return -EINVAL;
>>>> +
>>>> +     oldp = p;
>>>> +     *size = memparse(p, &p);
>>>> +     if (p == oldp)
>>>> +             return -EINVAL;
>>>> +
>>>> +     switch (*p) {
>>>> +     case '@':
>>>> +     case '#':
>>>> +     case '$':
>>>> +     case '!':
>>>> +             *start = memparse(p+1, &p);
>>>> +             return 0;
>>>> +     }
>>>> +
>>>> +     return -EINVAL;
>>>> +}
>>>> +
>>>>  /*
>>>>   * In theory, KASLR can put the kernel anywhere in the range of [16M, 64T).
>>>>   * The mem_avoid array is used to store the ranges that need to be avoided
>>>> @@ -158,6 +191,8 @@ static void mem_avoid_init(unsigned long input, unsigned long input_size,
>>>>       u64 initrd_start, initrd_size;
>>>>       u64 cmd_line, cmd_line_size;
>>>>       char *ptr;
>>>> +     char arg[38];
>>>
>>> Where does the magic '38' come from?
>>>
>>>> +     unsigned long long memmap_start, memmap_size;
>>>>
>>>>       /*
>>>>        * Avoid the region that is unsafe to overlap during
>>>> @@ -195,6 +230,16 @@ static void mem_avoid_init(unsigned long input, unsigned long input_size,
>>>>       add_identity_map(mem_avoid[MEM_AVOID_BOOTPARAMS].start,
>>>>                        mem_avoid[MEM_AVOID_BOOTPARAMS].size);
>>>>
>>>> +     /* see if we have any memmap areas */
>>>> +     if (cmdline_find_option("memmap", arg, sizeof(arg)) > 0) {
>>>> +             int rc = parse_memmap(arg, &memmap_start, &memmap_size);
>>>> +
>>>> +             if (!rc) {
>>>> +                     mem_avoid[MEM_AVOID_MEMMAP].start = memmap_start;
>>>> +                     mem_avoid[MEM_AVOID_MEMMAP].size = memmap_size;
>>>> +             }
>>>> +     }
>>>> +
>>>
>>> This only handles a single (first) memmap argument, is that sufficient?
>>
>> No, you're right, we need to handle multiple ranges.  Since the
>> mem_avoid array is statically allocated perhaps we can handle up to 4
>> memmap= entries, but past that point disable kaslr for that boot?
>
> Yeah, that seems fine to me. I assume it's rare to have 4?
>

It should be rare to have *one* since ACPI 6.0 added support for
communicating persistent memory ranges.  However there are legacy
nvdimm users that I know are doing at least 2, but I have hard time
imagining they would ever do more than 4.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] x86: fix kaslr and memmap collision
  2016-11-22 19:01       ` Dan Williams
@ 2016-11-22 22:37         ` Kees Cook
  2016-11-24  0:04         ` Dave Chinner
  1 sibling, 0 replies; 13+ messages in thread
From: Kees Cook @ 2016-11-22 22:37 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ingo Molnar, Dave Jiang, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, X86 ML, david, linux-kernel,
	linux-nvdimm@lists.01.org

On Tue, Nov 22, 2016 at 11:01 AM, Dan Williams <dan.j.williams@intel.com> wrote:
> On Tue, Nov 22, 2016 at 10:54 AM, Kees Cook <keescook@chromium.org> wrote:
>> On Tue, Nov 22, 2016 at 9:26 AM, Dan Williams <dan.j.williams@intel.com> wrote:
>>> [ replying for Dave since he's offline today and tomorrow ]
>>>
>>> On Tue, Nov 22, 2016 at 12:47 AM, Ingo Molnar <mingo@kernel.org> wrote:
>>>>
>>>> * Dave Jiang <dave.jiang@intel.com> wrote:
>>>>
>>>>> CONFIG_RANDOMIZE_BASE relocates the kernel to a random base address.
>>>>> However it does not take into account the memmap= parameter passed in from
>>>>> the kernel commandline.
>>>>
>>>> memmap= parameters are often used as a list.
>>>>
>>>>> [...] This results in the kernel sometimes being put in the middle of the user
>>>>> memmap. [...]
>>>>
>>>> What does this mean? If memmap= is used to re-define the memory map then the
>>>> kernel getting in the middle of a RAM area is what we want, isn't it? What we
>>>> don't want is for the kernel to get into reserved areas, right?
>>>
>>> Right, this is about teaching kaslr to not land the kernel in newly
>>> defined reserved regions that were not marked reserved in the initial
>>> e820 map from platform firmware.
>>>
>>>>> [...] Check has been added in the kaslr in order to avoid the region marked by
>>>>> memmap.
>>>>
>>>> What does this mean?
>>>
>>> Is this clearer? "Update the set of 'mem_avoid' entries to exclude
>>> 'memmap=' defined reserved regions from the set of valid address range
>>> to land the kernel image."
>>>
>>>>
>>>>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>>>>> ---
>>>>>  arch/x86/boot/boot.h             |    2 ++
>>>>>  arch/x86/boot/compressed/kaslr.c |   45 ++++++++++++++++++++++++++++++++++++++
>>>>>  arch/x86/boot/string.c           |   25 +++++++++++++++++++++
>>>>>  3 files changed, 72 insertions(+)
>>>>>
>>>>> diff --git a/arch/x86/boot/boot.h b/arch/x86/boot/boot.h
>>>>> index e5612f3..0d5fe5b 100644
>>>>> --- a/arch/x86/boot/boot.h
>>>>> +++ b/arch/x86/boot/boot.h
>>>>> @@ -332,6 +332,8 @@ int strncmp(const char *cs, const char *ct, size_t count);
>>>>>  size_t strnlen(const char *s, size_t maxlen);
>>>>>  unsigned int atou(const char *s);
>>>>>  unsigned long long simple_strtoull(const char *cp, char **endp, unsigned int base);
>>>>> +unsigned long simple_strtoul(const char *cp, char **endp, unsigned int base);
>>>>> +long simple_strtol(const char *cp, char **endp, unsigned int base);
>>>>>  size_t strlen(const char *s);
>>>>>
>>>>>  /* tty.c */
>>>>> diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
>>>>> index a66854d..6fb8f1ec 100644
>>>>> --- a/arch/x86/boot/compressed/kaslr.c
>>>>> +++ b/arch/x86/boot/compressed/kaslr.c
>>>>> @@ -11,6 +11,7 @@
>>>>>   */
>>>>>  #include "misc.h"
>>>>>  #include "error.h"
>>>>> +#include "../boot.h"
>>>>>
>>>>>  #include <generated/compile.h>
>>>>>  #include <linux/module.h>
>>>>> @@ -61,6 +62,7 @@ enum mem_avoid_index {
>>>>>       MEM_AVOID_INITRD,
>>>>>       MEM_AVOID_CMDLINE,
>>>>>       MEM_AVOID_BOOTPARAMS,
>>>>> +     MEM_AVOID_MEMMAP,
>>>>>       MEM_AVOID_MAX,
>>>>>  };
>>>>>
>>>>> @@ -77,6 +79,37 @@ static bool mem_overlaps(struct mem_vector *one, struct mem_vector *two)
>>>>>       return true;
>>>>>  }
>>>>>
>>>>> +#include "../../../../lib/cmdline.c"
>>>>> +
>>>>> +static int
>>>>> +parse_memmap(char *p, unsigned long long *start, unsigned long long *size)
>>>>> +{
>>>>> +     char *oldp;
>>>>> +
>>>>> +     if (!p)
>>>>> +             return -EINVAL;
>>>>> +
>>>>> +     /* we don't care about this option here */
>>>>> +     if (!strncmp(p, "exactmap", 8))
>>>>> +             return -EINVAL;
>>>>> +
>>>>> +     oldp = p;
>>>>> +     *size = memparse(p, &p);
>>>>> +     if (p == oldp)
>>>>> +             return -EINVAL;
>>>>> +
>>>>> +     switch (*p) {
>>>>> +     case '@':
>>>>> +     case '#':
>>>>> +     case '$':
>>>>> +     case '!':
>>>>> +             *start = memparse(p+1, &p);
>>>>> +             return 0;
>>>>> +     }
>>>>> +
>>>>> +     return -EINVAL;
>>>>> +}
>>>>> +
>>>>>  /*
>>>>>   * In theory, KASLR can put the kernel anywhere in the range of [16M, 64T).
>>>>>   * The mem_avoid array is used to store the ranges that need to be avoided
>>>>> @@ -158,6 +191,8 @@ static void mem_avoid_init(unsigned long input, unsigned long input_size,
>>>>>       u64 initrd_start, initrd_size;
>>>>>       u64 cmd_line, cmd_line_size;
>>>>>       char *ptr;
>>>>> +     char arg[38];
>>>>
>>>> Where does the magic '38' come from?
>>>>
>>>>> +     unsigned long long memmap_start, memmap_size;
>>>>>
>>>>>       /*
>>>>>        * Avoid the region that is unsafe to overlap during
>>>>> @@ -195,6 +230,16 @@ static void mem_avoid_init(unsigned long input, unsigned long input_size,
>>>>>       add_identity_map(mem_avoid[MEM_AVOID_BOOTPARAMS].start,
>>>>>                        mem_avoid[MEM_AVOID_BOOTPARAMS].size);
>>>>>
>>>>> +     /* see if we have any memmap areas */
>>>>> +     if (cmdline_find_option("memmap", arg, sizeof(arg)) > 0) {
>>>>> +             int rc = parse_memmap(arg, &memmap_start, &memmap_size);
>>>>> +
>>>>> +             if (!rc) {
>>>>> +                     mem_avoid[MEM_AVOID_MEMMAP].start = memmap_start;
>>>>> +                     mem_avoid[MEM_AVOID_MEMMAP].size = memmap_size;
>>>>> +             }
>>>>> +     }
>>>>> +
>>>>
>>>> This only handles a single (first) memmap argument, is that sufficient?
>>>
>>> No, you're right, we need to handle multiple ranges.  Since the
>>> mem_avoid array is statically allocated perhaps we can handle up to 4
>>> memmap= entries, but past that point disable kaslr for that boot?
>>
>> Yeah, that seems fine to me. I assume it's rare to have 4?
>>
>
> It should be rare to have *one* since ACPI 6.0 added support for
> communicating persistent memory ranges.  However there are legacy
> nvdimm users that I know are doing at least 2, but I have hard time
> imagining they would ever do more than 4.

Cool. As long as it announces KASLR being disabled (as in some of the
other conditions) that should be fine.

-Kees

-- 
Kees Cook
Nexus Security

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] x86: fix kaslr and memmap collision
  2016-11-22 19:01       ` Dan Williams
  2016-11-22 22:37         ` Kees Cook
@ 2016-11-24  0:04         ` Dave Chinner
  2016-11-24 19:30           ` Dan Williams
  1 sibling, 1 reply; 13+ messages in thread
From: Dave Chinner @ 2016-11-24  0:04 UTC (permalink / raw)
  To: Dan Williams
  Cc: Kees Cook, Ingo Molnar, Dave Jiang, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, X86 ML, linux-kernel, linux-nvdimm@lists.01.org

On Tue, Nov 22, 2016 at 11:01:32AM -0800, Dan Williams wrote:
> On Tue, Nov 22, 2016 at 10:54 AM, Kees Cook <keescook@chromium.org> wrote:
> > On Tue, Nov 22, 2016 at 9:26 AM, Dan Williams <dan.j.williams@intel.com> wrote:
> >> No, you're right, we need to handle multiple ranges.  Since the
> >> mem_avoid array is statically allocated perhaps we can handle up to 4
> >> memmap= entries, but past that point disable kaslr for that boot?
> >
> > Yeah, that seems fine to me. I assume it's rare to have 4?
> >
> 
> It should be rare to have *one* since ACPI 6.0 added support for
> communicating persistent memory ranges.  However there are legacy
> nvdimm users that I know are doing at least 2, but I have hard time
> imagining they would ever do more than 4.

I doubt it's rare amongst the people using RAM to emulate pmem for
filesystem testing purposes. My "pmem" test VM always has at least 2
ranges set to give me two discrete pmem devices, and I have used 4
from time to time to do things like test multi-volume scratch XFS
filesystems in xfstests (i.e. data, log and realtime volumes) so I
didn't need to play games with partitioning or DM...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] x86: fix kaslr and memmap collision
  2016-11-24  0:04         ` Dave Chinner
@ 2016-11-24 19:30           ` Dan Williams
  0 siblings, 0 replies; 13+ messages in thread
From: Dan Williams @ 2016-11-24 19:30 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Kees Cook, Ingo Molnar, Dave Jiang, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, X86 ML, linux-kernel, linux-nvdimm@lists.01.org

On Wed, Nov 23, 2016 at 4:04 PM, Dave Chinner <david@fromorbit.com> wrote:
> On Tue, Nov 22, 2016 at 11:01:32AM -0800, Dan Williams wrote:
>> On Tue, Nov 22, 2016 at 10:54 AM, Kees Cook <keescook@chromium.org> wrote:
>> > On Tue, Nov 22, 2016 at 9:26 AM, Dan Williams <dan.j.williams@intel.com> wrote:
>> >> No, you're right, we need to handle multiple ranges.  Since the
>> >> mem_avoid array is statically allocated perhaps we can handle up to 4
>> >> memmap= entries, but past that point disable kaslr for that boot?
>> >
>> > Yeah, that seems fine to me. I assume it's rare to have 4?
>> >
>>
>> It should be rare to have *one* since ACPI 6.0 added support for
>> communicating persistent memory ranges.  However there are legacy
>> nvdimm users that I know are doing at least 2, but I have hard time
>> imagining they would ever do more than 4.
>
> I doubt it's rare amongst the people using RAM to emulate pmem for
> filesystem testing purposes. My "pmem" test VM always has at least 2
> ranges set to give me two discrete pmem devices, and I have used 4
> from time to time to do things like test multi-volume scratch XFS
> filesystems in xfstests (i.e. data, log and realtime volumes) so I
> didn't need to play games with partitioning or DM...

Right, but for testing do you need kaslr to be active? You can have as
many memmap regions as you want, we'll just stop trying to find a
random kernel base address after you've defined 4.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] x86: fix kaslr and memmap collision
  2016-11-22 17:26   ` Dan Williams
  2016-11-22 18:54     ` Kees Cook
@ 2017-01-03  8:31     ` Baoquan He
  2017-01-03 16:27       ` Ross Zwisler
  2017-01-03 18:24       ` Dan Williams
  1 sibling, 2 replies; 13+ messages in thread
From: Baoquan He @ 2017-01-03  8:31 UTC (permalink / raw)
  To: Dan Williams
  Cc: Ingo Molnar, Dave Jiang, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, X86 ML, david, linux-kernel,
	linux-nvdimm@lists.01.org, Kees Cook, dyoung

Hi Dan,

On 11/22/16 at 09:26am, Dan Williams wrote:
> [ replying for Dave since he's offline today and tomorrow ]
> 
> On Tue, Nov 22, 2016 at 12:47 AM, Ingo Molnar <mingo@kernel.org> wrote:
> >
> > * Dave Jiang <dave.jiang@intel.com> wrote:
> >
> >> CONFIG_RANDOMIZE_BASE relocates the kernel to a random base address.
> >> However it does not take into account the memmap= parameter passed in from
> >> the kernel commandline.
> >
> > memmap= parameters are often used as a list.
> >
> >> [...] This results in the kernel sometimes being put in the middle of the user
> >> memmap. [...]
> >
> > What does this mean? If memmap= is used to re-define the memory map then the
> > kernel getting in the middle of a RAM area is what we want, isn't it? What we
> > don't want is for the kernel to get into reserved areas, right?
> 
> Right, this is about teaching kaslr to not land the kernel in newly
> defined reserved regions that were not marked reserved in the initial
> e820 map from platform firmware.

If only tell kaslr to not land kernel in newly defined reserved regions,
memory added by "memmap=nn[KMG]@ss[KMG]" should not be avoided since
it's usable memory. Kernel randomized into this region is also what we
want. Not sure if I understand it right.

Thanks
Baoquan

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] x86: fix kaslr and memmap collision
  2017-01-03  8:31     ` Baoquan He
@ 2017-01-03 16:27       ` Ross Zwisler
  2017-01-03 18:24       ` Dan Williams
  1 sibling, 0 replies; 13+ messages in thread
From: Ross Zwisler @ 2017-01-03 16:27 UTC (permalink / raw)
  To: Baoquan He
  Cc: Dan Williams, Kees Cook, linux-nvdimm@lists.01.org, X86 ML,
	david, linux-kernel, Ingo Molnar, H. Peter Anvin,
	Thomas Gleixner, dyoung, Ingo Molnar

On Tue, Jan 03, 2017 at 04:31:37PM +0800, Baoquan He wrote:
> Hi Dan,
> 
> On 11/22/16 at 09:26am, Dan Williams wrote:
> > [ replying for Dave since he's offline today and tomorrow ]
> > 
> > On Tue, Nov 22, 2016 at 12:47 AM, Ingo Molnar <mingo@kernel.org> wrote:
> > >
> > > * Dave Jiang <dave.jiang@intel.com> wrote:
> > >
> > >> CONFIG_RANDOMIZE_BASE relocates the kernel to a random base address.
> > >> However it does not take into account the memmap= parameter passed in from
> > >> the kernel commandline.
> > >
> > > memmap= parameters are often used as a list.
> > >
> > >> [...] This results in the kernel sometimes being put in the middle of the user
> > >> memmap. [...]
> > >
> > > What does this mean? If memmap= is used to re-define the memory map then the
> > > kernel getting in the middle of a RAM area is what we want, isn't it? What we
> > > don't want is for the kernel to get into reserved areas, right?
> > 
> > Right, this is about teaching kaslr to not land the kernel in newly
> > defined reserved regions that were not marked reserved in the initial
> > e820 map from platform firmware.
> 
> If only tell kaslr to not land kernel in newly defined reserved regions,
> memory added by "memmap=nn[KMG]@ss[KMG]" should not be avoided since
> it's usable memory. Kernel randomized into this region is also what we
> want. Not sure if I understand it right.

The following text is from:
https://nvdimm.wiki.kernel.org/how_to_choose_the_correct_memmap_kernel_parameter_for_pmem_on_your_system

Hopefully this will make it clearer.

---

Another thing that you may need to be aware of is the CONFIG_RANDOMIZE_BASE
kernel config option. When enabled, this randomizes the physical address at
which the kernel image is decompressed and the virtual address where the kernel
image is mapped. Currently this random address is chosen without regard to the
memmap kernel command line parameter.

This means that the kernel can choose to put itself in the middle of your
reserved memmap area. You can observe this behavior via /proc/iomem.

Here is /proc/iomem from a system with CONFIG_RANDOMIZE_BASE turned off:

# cat /proc/iomem
00000000-00000fff : reserved
00001000-0009fbff : System RAM
0009fc00-0009ffff : reserved
000a0000-000bffff : PCI Bus 0000:00
000c0000-000c97ff : Video ROM
000c9800-000ca5ff : Adapter ROM
000ca800-000ccbff : Adapter ROM
000f0000-000fffff : reserved
  000f0000-000fffff : System ROM
00100000-bffd8fff : System RAM
  01000000-01b18598 : Kernel code
  01b18599-023f53ff : Kernel data
  0276d000-0365efff : Kernel bss
bffd9000-bfffffff : reserved
c0000000-febfffff : PCI Bus 0000:00
  f4000000-f7ffffff : 0000:00:02.0
  f8000000-fbffffff : 0000:00:02.0
  fc000000-fc03ffff : 0000:00:03.0
  fc050000-fc051fff : 0000:00:02.0
  fc052000-fc052fff : 0000:00:03.0
  fc053000-fc053fff : 0000:00:04.0
  fc054000-fc054fff : 0000:00:05.7
    fc054000-fc054fff : ehci_hcd
  fc055000-fc055fff : 0000:00:06.0
fec00000-fec003ff : IOAPIC 0
fee00000-fee00fff : Local APIC
feffc000-feffffff : reserved
fffc0000-ffffffff : reserved
100000000-4ffffffff : Persistent Memory (legacy)
  100000000-4ffffffff : namespace0.0
500000000-53fffffff : System RAM

The interesting bits for us are the “System RAM” region from 00100000-bffd8fff,
and the “Persistent Memory (legacy)” region from 100000000-4ffffffff.

If I turn on CONFIG_RANDOMIZE_BASE on this same system, I get the following:

# cat /proc/iomem
00000000-00000fff : reserved
00001000-0009fbff : System RAM
0009fc00-0009ffff : reserved
000a0000-000bffff : PCI Bus 0000:00
000c0000-000c97ff : Video ROM
000c9800-000ca5ff : Adapter ROM
000ca800-000ccbff : Adapter ROM
000f0000-000fffff : reserved
  000f0000-000fffff : System ROM
00100000-bffd8fff : System RAM
bffd9000-bfffffff : reserved
c0000000-febfffff : PCI Bus 0000:00
  f4000000-f7ffffff : 0000:00:02.0
  f8000000-fbffffff : 0000:00:02.0
  fc000000-fc03ffff : 0000:00:03.0
  fc050000-fc051fff : 0000:00:02.0
  fc052000-fc052fff : 0000:00:03.0
  fc053000-fc053fff : 0000:00:04.0
  fc054000-fc054fff : 0000:00:05.7
    fc054000-fc054fff : ehci_hcd
  fc055000-fc055fff : 0000:00:06.0
fec00000-fec003ff : IOAPIC 0
fee00000-fee00fff : Local APIC
feffc000-feffffff : reserved
fffc0000-ffffffff : reserved
100000000-4e6ffffff : Persistent Memory (legacy)
4e7000000-4e968bfff : System RAM
  4e7000000-4e7b185d8 : Kernel code
  4e7b185d9-4e83f54bf : Kernel data
  4e876d000-4e965efff : Kernel bss
4e968c000-4ffffffff : Persistent Memory (legacy)
500000000-53fffffff : System RAM

The “System RAM” region now sits in the middle of my “Persistent Memory
(legacy)” region, splitting it in half. This results in the following kernel
WARNING:

[    6.356180] WARNING: CPU: 4 PID: 689 at kernel/memremap.c:300 devm_memremap_pages+0x3b2/0x4c0
[    6.357757] devm_memremap_pages attempted on mixed region [mem 0x4e968c000-0x4ffffffff flags 0x200]

and no /dev/pmem* devices being created.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] x86: fix kaslr and memmap collision
  2017-01-03  8:31     ` Baoquan He
  2017-01-03 16:27       ` Ross Zwisler
@ 2017-01-03 18:24       ` Dan Williams
  2017-01-03 20:15         ` Dave Jiang
  1 sibling, 1 reply; 13+ messages in thread
From: Dan Williams @ 2017-01-03 18:24 UTC (permalink / raw)
  To: Baoquan He
  Cc: Ingo Molnar, Dave Jiang, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, X86 ML, david, linux-kernel,
	linux-nvdimm@lists.01.org, Kees Cook, dyoung

On Tue, Jan 3, 2017 at 12:31 AM, Baoquan He <bhe@redhat.com> wrote:
> Hi Dan,
>
> On 11/22/16 at 09:26am, Dan Williams wrote:
>> [ replying for Dave since he's offline today and tomorrow ]
>>
>> On Tue, Nov 22, 2016 at 12:47 AM, Ingo Molnar <mingo@kernel.org> wrote:
>> >
>> > * Dave Jiang <dave.jiang@intel.com> wrote:
>> >
>> >> CONFIG_RANDOMIZE_BASE relocates the kernel to a random base address.
>> >> However it does not take into account the memmap= parameter passed in from
>> >> the kernel commandline.
>> >
>> > memmap= parameters are often used as a list.
>> >
>> >> [...] This results in the kernel sometimes being put in the middle of the user
>> >> memmap. [...]
>> >
>> > What does this mean? If memmap= is used to re-define the memory map then the
>> > kernel getting in the middle of a RAM area is what we want, isn't it? What we
>> > don't want is for the kernel to get into reserved areas, right?
>>
>> Right, this is about teaching kaslr to not land the kernel in newly
>> defined reserved regions that were not marked reserved in the initial
>> e820 map from platform firmware.
>
> If only tell kaslr to not land kernel in newly defined reserved regions,
> memory added by "memmap=nn[KMG]@ss[KMG]" should not be avoided since
> it's usable memory. Kernel randomized into this region is also what we
> want. Not sure if I understand it right.

You're right, this is supposed to be for memmap=nn!ss cases which
defines reserved persistent memory ranges, not memmap=nn@ss which
defines usable memory.

We need to fix mem_avoid_memmap() to only skip memmap= statements that
specify reserved memory.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] x86: fix kaslr and memmap collision
  2017-01-03 18:24       ` Dan Williams
@ 2017-01-03 20:15         ` Dave Jiang
  2017-01-04  1:57           ` Baoquan He
  0 siblings, 1 reply; 13+ messages in thread
From: Dave Jiang @ 2017-01-03 20:15 UTC (permalink / raw)
  To: Dan Williams, Baoquan He
  Cc: Ingo Molnar, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	X86 ML, david, linux-kernel, linux-nvdimm@lists.01.org,
	Kees Cook, dyoung



On 01/03/2017 11:24 AM, Dan Williams wrote:
> On Tue, Jan 3, 2017 at 12:31 AM, Baoquan He <bhe@redhat.com> wrote:
>> Hi Dan,
>>
>> On 11/22/16 at 09:26am, Dan Williams wrote:
>>> [ replying for Dave since he's offline today and tomorrow ]
>>>
>>> On Tue, Nov 22, 2016 at 12:47 AM, Ingo Molnar <mingo@kernel.org> wrote:
>>>>
>>>> * Dave Jiang <dave.jiang@intel.com> wrote:
>>>>
>>>>> CONFIG_RANDOMIZE_BASE relocates the kernel to a random base address.
>>>>> However it does not take into account the memmap= parameter passed in from
>>>>> the kernel commandline.
>>>>
>>>> memmap= parameters are often used as a list.
>>>>
>>>>> [...] This results in the kernel sometimes being put in the middle of the user
>>>>> memmap. [...]
>>>>
>>>> What does this mean? If memmap= is used to re-define the memory map then the
>>>> kernel getting in the middle of a RAM area is what we want, isn't it? What we
>>>> don't want is for the kernel to get into reserved areas, right?
>>>
>>> Right, this is about teaching kaslr to not land the kernel in newly
>>> defined reserved regions that were not marked reserved in the initial
>>> e820 map from platform firmware.
>>
>> If only tell kaslr to not land kernel in newly defined reserved regions,
>> memory added by "memmap=nn[KMG]@ss[KMG]" should not be avoided since
>> it's usable memory. Kernel randomized into this region is also what we
>> want. Not sure if I understand it right.
> 
> You're right, this is supposed to be for memmap=nn!ss cases which
> defines reserved persistent memory ranges, not memmap=nn@ss which
> defines usable memory.
> 
> We need to fix mem_avoid_memmap() to only skip memmap= statements that
> specify reserved memory.
> 

I think nn@ss is the only one that we should skip over, otherwise
everything else looks like should be avoided. I'll update.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] x86: fix kaslr and memmap collision
  2017-01-03 20:15         ` Dave Jiang
@ 2017-01-04  1:57           ` Baoquan He
  0 siblings, 0 replies; 13+ messages in thread
From: Baoquan He @ 2017-01-04  1:57 UTC (permalink / raw)
  To: Dan Williams, Dave Jiang
  Cc: Ingo Molnar, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	X86 ML, david, linux-kernel, linux-nvdimm@lists.01.org,
	Kees Cook, dyoung

On 01/03/17 at 01:15pm, Dave Jiang wrote:
> 
> 
> On 01/03/2017 11:24 AM, Dan Williams wrote:
> > On Tue, Jan 3, 2017 at 12:31 AM, Baoquan He <bhe@redhat.com> wrote:
> >> Hi Dan,
> >>
> >> On 11/22/16 at 09:26am, Dan Williams wrote:
> >>> [ replying for Dave since he's offline today and tomorrow ]
> >>>
> >>> On Tue, Nov 22, 2016 at 12:47 AM, Ingo Molnar <mingo@kernel.org> wrote:
> >>>>
> >>>> * Dave Jiang <dave.jiang@intel.com> wrote:
> >>>>
> >>>>> CONFIG_RANDOMIZE_BASE relocates the kernel to a random base address.
> >>>>> However it does not take into account the memmap= parameter passed in from
> >>>>> the kernel commandline.
> >>>>
> >>>> memmap= parameters are often used as a list.
> >>>>
> >>>>> [...] This results in the kernel sometimes being put in the middle of the user
> >>>>> memmap. [...]
> >>>>
> >>>> What does this mean? If memmap= is used to re-define the memory map then the
> >>>> kernel getting in the middle of a RAM area is what we want, isn't it? What we
> >>>> don't want is for the kernel to get into reserved areas, right?
> >>>
> >>> Right, this is about teaching kaslr to not land the kernel in newly
> >>> defined reserved regions that were not marked reserved in the initial
> >>> e820 map from platform firmware.
> >>
> >> If only tell kaslr to not land kernel in newly defined reserved regions,
> >> memory added by "memmap=nn[KMG]@ss[KMG]" should not be avoided since
> >> it's usable memory. Kernel randomized into this region is also what we
> >> want. Not sure if I understand it right.
> > 
> > You're right, this is supposed to be for memmap=nn!ss cases which
> > defines reserved persistent memory ranges, not memmap=nn@ss which
> > defines usable memory.
> > 
> > We need to fix mem_avoid_memmap() to only skip memmap= statements that
> > specify reserved memory.

Thanks for confirmation, Dan!

> > 
> 
> I think nn@ss is the only one that we should skip over, otherwise
> everything else looks like should be avoided. I'll update.
Hi Dave,

I guess your purpose is to avoid the user defined reserved memory and
pmem which I am not very sure about since kaslr won't stamp on ACPI
region reported by BIOS. Seems OK to avoid them all except of nn@ss.

I have other concerns, will directly comment in your v4 post.

Thanks
Baoquan

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2017-01-04  1:58 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-22  0:22 [PATCH] x86: fix kaslr and memmap collision Dave Jiang
2016-11-22  8:47 ` Ingo Molnar
2016-11-22 17:26   ` Dan Williams
2016-11-22 18:54     ` Kees Cook
2016-11-22 19:01       ` Dan Williams
2016-11-22 22:37         ` Kees Cook
2016-11-24  0:04         ` Dave Chinner
2016-11-24 19:30           ` Dan Williams
2017-01-03  8:31     ` Baoquan He
2017-01-03 16:27       ` Ross Zwisler
2017-01-03 18:24       ` Dan Williams
2017-01-03 20:15         ` Dave Jiang
2017-01-04  1:57           ` Baoquan He

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).