* [PATCH v7 0/9] x86/mm: memory area address KASLR
From: Kees Cook @ 2016-06-22  0:46 UTC
  To: Ingo Molnar
  Cc: Kees Cook, Thomas Garnier, Andy Lutomirski, x86, Borislav Petkov,
	Baoquan He, Yinghai Lu, Juergen Gross, Matt Fleming, Toshi Kani,
	Andrew Morton, Dan Williams, Kirill A. Shutemov, Dave Hansen,
	Xiao Guangrong, Martin Schwidefsky, Aneesh Kumar K.V,
	Alexander Kuleshov, Alexander Popov, Dave Young, Joerg Roedel,
	Lv Zheng, Mark Salter, Dmitry Vyukov, Stephen Smalley,
	Boris Ostrovsky, Christian Borntraeger, Jan Beulich,
	linux-kernel, Jonathan Corbet, linux-doc, kernel-hardening

This is v7 of Thomas Garnier's KASLR for memory areas (the physical
memory mapping, vmalloc, and vmemmap). It is meant to be applied on top
of the tip tree's x86/boot branch.

The current implementation of KASLR randomizes only the base address of
the kernel and its modules. Published research has shown that static
memory addresses can be found and used in exploits, effectively
bypassing base-address KASLR:

   The physical memory mapping holds most allocations from the boot and
   heap allocators. Knowing the base address and the physical memory
   size, an attacker can deduce the PDE virtual address for the vDSO
   memory page. This attack was demonstrated at CanSecWest 2016 in the
   presentation "Getting Physical: Extreme Abuse of Intel Based Paged
   Systems", https://goo.gl/ANpWdV (see the second part). The exploits
   used against Linux worked successfully against 4.6+ but fail with
   KASLR memory randomization enabled (https://goo.gl/iTtXMJ). Similar
   research was done at Google, leading to this patch proposal. Variants
   exist that overwrite the ACLs of /proc or /sys objects, leading to
   elevation of privileges. These variants were also tested against
   4.6+.

This set of patches randomizes the base address and padding of three
major memory regions: the physical memory mapping, vmalloc, and
vmemmap. It mitigates exploits that rely on predictable kernel
addresses in these areas. The feature can be enabled with the
CONFIG_RANDOMIZE_MEMORY option. (This CONFIG, along with
CONFIG_RANDOMIZE_BASE, may be renamed in the future, but stands for now
as other architectures continue to implement KASLR.)
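
For reference, these are the fixed bases of the three regions in the
unrandomized x86_64 layout (per the 4.x-era
Documentation/x86/x86_64/mm.txt) that this series randomizes:

    ffff880000000000 : direct mapping of all physical memory
    ffffc90000000000 : vmalloc/ioremap space
    ffffea0000000000 : vmemmap (the struct page array)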

Padding for memory hotplug support is controlled by
CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING. The default value is 10
terabytes.
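
As a hedged sketch of how this padding could factor into sizing the
randomized physical mapping region (illustrative only: every name here
except the CONFIG symbol is made up, not code from the series):

    #define TB_SHIFT 40

    /* Size the region to cover installed RAM plus hotplug headroom. */
    static unsigned long phys_region_size_tb(unsigned long installed_bytes)
    {
    	/* Round installed memory up to a whole number of terabytes... */
    	unsigned long size_tb = (installed_bytes + (1UL << TB_SHIFT) - 1)
    				>> TB_SHIFT;

    	/* ...then add headroom so memory hot-added later still lands
    	 * inside the mapped (and randomized) region. */
    	return size_tb + CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
    }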

The patches were tested on QEMU and on physical machines. Xen
compatibility was also verified. Multiple reboots were used to verify
the entropy of each memory region.

Notable problems that needed solving:
 - The three target memory regions must not end up at the same place
   across reboots.
 - The physical memory mapping can start at a virtual address that is
   not aligned on a PGD entry.
 - Reasonable entropy is needed early at boot, before get_random_bytes()
   is available.
 - Memory hotplug needs KASLR padding.

Patches:
 - 1: refactor the KASLR entropy functions (move them from
      boot/compressed/ into lib/)
 - 2: clarify the variables used for the physical mapping
 - 3: add PUD virtual address support for the physical mapping
 - 4: split out the trampoline PGD
 - 5: add the KASLR memory infrastructure code
 - 6: randomize the base of the physical mapping region
 - 7: randomize the base of the vmalloc region
 - 8: randomize the base of the vmemmap region
 - 9: provide memory hotplug padding support

There is no measurable performance impact:

 - Kernbench shows almost no difference (+/- less than 1%).
 - Hackbench shows 0% difference on average (hackbench 90, repeated
   10 times).

* [PATCH v7 1/9] x86/mm: Refactor KASLR entropy functions
From: Kees Cook @ 2016-06-22  0:46 UTC
  To: Ingo Molnar
  Cc: Kees Cook, Thomas Garnier, Andy Lutomirski, x86, Borislav Petkov,
	Baoquan He, Yinghai Lu, Juergen Gross, Matt Fleming, Toshi Kani,
	Andrew Morton, Dan Williams, Kirill A. Shutemov, Dave Hansen,
	Xiao Guangrong, Martin Schwidefsky, Aneesh Kumar K.V,
	Alexander Kuleshov, Alexander Popov, Dave Young, Joerg Roedel,
	Lv Zheng, Mark Salter, Dmitry Vyukov, Stephen Smalley,
	Boris Ostrovsky, Christian Borntraeger, Jan Beulich,
	linux-kernel, Jonathan Corbet, linux-doc, kernel-hardening

From: Thomas Garnier <thgarnie@google.com>

Move the KASLR entropy functions into arch/x86/lib so they can be used
during early kernel boot for KASLR memory randomization.

Signed-off-by: Thomas Garnier <thgarnie@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 arch/x86/boot/compressed/kaslr.c | 76 +++------------------------------
 arch/x86/include/asm/kaslr.h     |  6 +++
 arch/x86/lib/Makefile            |  1 +
 arch/x86/lib/kaslr.c             | 90 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 102 insertions(+), 71 deletions(-)
 create mode 100644 arch/x86/include/asm/kaslr.h
 create mode 100644 arch/x86/lib/kaslr.c

diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index 749c9e00c674..1781a8dbae46 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -12,10 +12,6 @@
 #include "misc.h"
 #include "error.h"
 
-#include <asm/msr.h>
-#include <asm/archrandom.h>
-#include <asm/e820.h>
-
 #include <generated/compile.h>
 #include <linux/module.h>
 #include <linux/uts.h>
@@ -26,26 +22,6 @@
 static const char build_str[] = UTS_RELEASE " (" LINUX_COMPILE_BY "@"
 		LINUX_COMPILE_HOST ") (" LINUX_COMPILER ") " UTS_VERSION;
 
-#define I8254_PORT_CONTROL	0x43
-#define I8254_PORT_COUNTER0	0x40
-#define I8254_CMD_READBACK	0xC0
-#define I8254_SELECT_COUNTER0	0x02
-#define I8254_STATUS_NOTREADY	0x40
-static inline u16 i8254(void)
-{
-	u16 status, timer;
-
-	do {
-		outb(I8254_PORT_CONTROL,
-		     I8254_CMD_READBACK | I8254_SELECT_COUNTER0);
-		status = inb(I8254_PORT_COUNTER0);
-		timer  = inb(I8254_PORT_COUNTER0);
-		timer |= inb(I8254_PORT_COUNTER0) << 8;
-	} while (status & I8254_STATUS_NOTREADY);
-
-	return timer;
-}
-
 static unsigned long rotate_xor(unsigned long hash, const void *area,
 				size_t size)
 {
@@ -62,7 +38,7 @@ static unsigned long rotate_xor(unsigned long hash, const void *area,
 }
 
 /* Attempt to create a simple but unpredictable starting entropy. */
-static unsigned long get_random_boot(void)
+static unsigned long get_boot_seed(void)
 {
 	unsigned long hash = 0;
 
@@ -72,50 +48,8 @@ static unsigned long get_random_boot(void)
 	return hash;
 }
 
-static unsigned long get_random_long(const char *purpose)
-{
-#ifdef CONFIG_X86_64
-	const unsigned long mix_const = 0x5d6008cbf3848dd3UL;
-#else
-	const unsigned long mix_const = 0x3f39e593UL;
-#endif
-	unsigned long raw, random = get_random_boot();
-	bool use_i8254 = true;
-
-	debug_putstr(purpose);
-	debug_putstr(" KASLR using");
-
-	if (has_cpuflag(X86_FEATURE_RDRAND)) {
-		debug_putstr(" RDRAND");
-		if (rdrand_long(&raw)) {
-			random ^= raw;
-			use_i8254 = false;
-		}
-	}
-
-	if (has_cpuflag(X86_FEATURE_TSC)) {
-		debug_putstr(" RDTSC");
-		raw = rdtsc();
-
-		random ^= raw;
-		use_i8254 = false;
-	}
-
-	if (use_i8254) {
-		debug_putstr(" i8254");
-		random ^= i8254();
-	}
-
-	/* Circular multiply for better bit diffusion */
-	asm("mul %3"
-	    : "=a" (random), "=d" (raw)
-	    : "a" (random), "rm" (mix_const));
-	random += raw;
-
-	debug_putstr("...\n");
-
-	return random;
-}
+#define KASLR_COMPRESSED_BOOT
+#include "../../lib/kaslr.c"
 
 struct mem_vector {
 	unsigned long start;
@@ -347,7 +281,7 @@ static unsigned long slots_fetch_random(void)
 	if (slot_max == 0)
 		return 0;
 
-	slot = get_random_long("Physical") % slot_max;
+	slot = kaslr_get_random_long("Physical") % slot_max;
 
 	for (i = 0; i < slot_area_index; i++) {
 		if (slot >= slot_areas[i].num) {
@@ -477,7 +411,7 @@ static unsigned long find_random_virt_addr(unsigned long minimum,
 	slots = (KERNEL_IMAGE_SIZE - minimum - image_size) /
 		 CONFIG_PHYSICAL_ALIGN + 1;
 
-	random_addr = get_random_long("Virtual") % slots;
+	random_addr = kaslr_get_random_long("Virtual") % slots;
 
 	return random_addr * CONFIG_PHYSICAL_ALIGN + minimum;
 }
diff --git a/arch/x86/include/asm/kaslr.h b/arch/x86/include/asm/kaslr.h
new file mode 100644
index 000000000000..5547438db5ea
--- /dev/null
+++ b/arch/x86/include/asm/kaslr.h
@@ -0,0 +1,6 @@
+#ifndef _ASM_KASLR_H_
+#define _ASM_KASLR_H_
+
+unsigned long kaslr_get_random_long(const char *purpose);
+
+#endif
diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile
index 72a576752a7e..cfa6d076f4f2 100644
--- a/arch/x86/lib/Makefile
+++ b/arch/x86/lib/Makefile
@@ -24,6 +24,7 @@ lib-y += usercopy_$(BITS).o usercopy.o getuser.o putuser.o
 lib-y += memcpy_$(BITS).o
 lib-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) += rwsem.o
 lib-$(CONFIG_INSTRUCTION_DECODER) += insn.o inat.o
+lib-$(CONFIG_RANDOMIZE_BASE) += kaslr.o
 
 obj-y += msr.o msr-reg.o msr-reg-export.o
 
diff --git a/arch/x86/lib/kaslr.c b/arch/x86/lib/kaslr.c
new file mode 100644
index 000000000000..f7dfeda83e5c
--- /dev/null
+++ b/arch/x86/lib/kaslr.c
@@ -0,0 +1,90 @@
+/*
+ * Entropy functions used on early boot for KASLR base and memory
+ * randomization. The base randomization is done in the compressed
+ * kernel and memory randomization is done early when the regular
+ * kernel starts. This file is included in the compressed kernel and
+ * normally linked in the regular kernel.
+ */
+#include <asm/kaslr.h>
+#include <asm/msr.h>
+#include <asm/archrandom.h>
+#include <asm/e820.h>
+#include <asm/io.h>
+
+/*
+ * When built for the regular kernel, several functions need to be stubbed out
+ * or changed to their regular kernel equivalent.
+ */
+#ifndef KASLR_COMPRESSED_BOOT
+#include <asm/cpufeature.h>
+#include <asm/setup.h>
+
+#define debug_putstr(v) early_printk(v)
+#define has_cpuflag(f) boot_cpu_has(f)
+#define get_boot_seed() kaslr_offset()
+#endif
+
+#define I8254_PORT_CONTROL	0x43
+#define I8254_PORT_COUNTER0	0x40
+#define I8254_CMD_READBACK	0xC0
+#define I8254_SELECT_COUNTER0	0x02
+#define I8254_STATUS_NOTREADY	0x40
+static inline u16 i8254(void)
+{
+	u16 status, timer;
+
+	do {
+		outb(I8254_PORT_CONTROL,
+		     I8254_CMD_READBACK | I8254_SELECT_COUNTER0);
+		status = inb(I8254_PORT_COUNTER0);
+		timer  = inb(I8254_PORT_COUNTER0);
+		timer |= inb(I8254_PORT_COUNTER0) << 8;
+	} while (status & I8254_STATUS_NOTREADY);
+
+	return timer;
+}
+
+unsigned long kaslr_get_random_long(const char *purpose)
+{
+#ifdef CONFIG_X86_64
+	const unsigned long mix_const = 0x5d6008cbf3848dd3UL;
+#else
+	const unsigned long mix_const = 0x3f39e593UL;
+#endif
+	unsigned long raw, random = get_boot_seed();
+	bool use_i8254 = true;
+
+	debug_putstr(purpose);
+	debug_putstr(" KASLR using");
+
+	if (has_cpuflag(X86_FEATURE_RDRAND)) {
+		debug_putstr(" RDRAND");
+		if (rdrand_long(&raw)) {
+			random ^= raw;
+			use_i8254 = false;
+		}
+	}
+
+	if (has_cpuflag(X86_FEATURE_TSC)) {
+		debug_putstr(" RDTSC");
+		raw = rdtsc();
+
+		random ^= raw;
+		use_i8254 = false;
+	}
+
+	if (use_i8254) {
+		debug_putstr(" i8254");
+		random ^= i8254();
+	}
+
+	/* Circular multiply for better bit diffusion */
+	asm("mul %3"
+	    : "=a" (random), "=d" (raw)
+	    : "a" (random), "rm" (mix_const));
+	random += raw;
+
+	debug_putstr("...\n");
+
+	return random;
+}
-- 
2.7.4
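
An aside on consuming the refactored helper: a hedged sketch of how the
memory randomization added later in the series can turn the raw 64-bit
value into an aligned offset (the function and names below are
illustrative, not part of this patch):

    /* Pick a random offset inside "remain" bytes of slack, placing
     * the region with PUD (1GB on x86_64) granularity. */
    static unsigned long pick_region_offset(unsigned long remain)
    {
    	unsigned long slots = remain >> PUD_SHIFT; /* 1GB-sized slots */
    	unsigned long rand;

    	if (!slots)
    		return 0;

    	rand = kaslr_get_random_long("Memory");
    	/* Reduce the raw value to a slot index, then back to bytes. */
    	return (rand % slots) << PUD_SHIFT;
    }

Note also the single-source trick in the diff: the compressed loader
textually includes lib/kaslr.c after defining KASLR_COMPRESSED_BOOT,
while the regular kernel links the same file and maps debug_putstr(),
has_cpuflag() and get_boot_seed() onto its own equivalents.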

* [PATCH v7 2/9] x86/mm: Update physical mapping variable names (x86_64)
From: Kees Cook @ 2016-06-22  0:46 UTC
  To: Ingo Molnar
  Cc: Kees Cook, Thomas Garnier, Andy Lutomirski, x86, Borislav Petkov,
	Baoquan He, Yinghai Lu, Juergen Gross, Matt Fleming, Toshi Kani,
	Andrew Morton, Dan Williams, Kirill A. Shutemov, Dave Hansen,
	Xiao Guangrong, Martin Schwidefsky, Aneesh Kumar K.V,
	Alexander Kuleshov, Alexander Popov, Dave Young, Joerg Roedel,
	Lv Zheng, Mark Salter, Dmitry Vyukov, Stephen Smalley,
	Boris Ostrovsky, Christian Borntraeger, Jan Beulich,
	linux-kernel, Jonathan Corbet, linux-doc, kernel-hardening

From: Thomas Garnier <thgarnie@google.com>

Change the variable names in kernel_physical_mapping_init() and related
functions to correctly reflect physical and virtual memory addresses.
Also add a comment to each function describing its usage and alignment
constraints.

Signed-off-by: Thomas Garnier <thgarnie@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 arch/x86/mm/init_64.c | 162 ++++++++++++++++++++++++++++++--------------------
 1 file changed, 96 insertions(+), 66 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index bce2e5d9edd4..6714712bd5da 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -328,22 +328,30 @@ void __init cleanup_highmap(void)
 	}
 }
 
+/*
+ * Create PTE level page table mapping for physical addresses.
+ * It returns the last physical address mapped.
+ */
 static unsigned long __meminit
-phys_pte_init(pte_t *pte_page, unsigned long addr, unsigned long end,
+phys_pte_init(pte_t *pte_page, unsigned long paddr, unsigned long paddr_end,
 	      pgprot_t prot)
 {
-	unsigned long pages = 0, next;
-	unsigned long last_map_addr = end;
+	unsigned long pages = 0, paddr_next;
+	unsigned long paddr_last = paddr_end;
+	pte_t *pte;
 	int i;
 
-	pte_t *pte = pte_page + pte_index(addr);
+	pte = pte_page + pte_index(paddr);
+	i = pte_index(paddr);
 
-	for (i = pte_index(addr); i < PTRS_PER_PTE; i++, addr = next, pte++) {
-		next = (addr & PAGE_MASK) + PAGE_SIZE;
-		if (addr >= end) {
+	for (; i < PTRS_PER_PTE; i++, paddr = paddr_next, pte++) {
+		paddr_next = (paddr & PAGE_MASK) + PAGE_SIZE;
+		if (paddr >= paddr_end) {
 			if (!after_bootmem &&
-			    !e820_any_mapped(addr & PAGE_MASK, next, E820_RAM) &&
-			    !e820_any_mapped(addr & PAGE_MASK, next, E820_RESERVED_KERN))
+			    !e820_any_mapped(paddr & PAGE_MASK, paddr_next,
+					     E820_RAM) &&
+			    !e820_any_mapped(paddr & PAGE_MASK, paddr_next,
+					     E820_RESERVED_KERN))
 				set_pte(pte, __pte(0));
 			continue;
 		}
@@ -361,37 +369,44 @@ phys_pte_init(pte_t *pte_page, unsigned long addr, unsigned long end,
 		}
 
 		if (0)
-			printk("   pte=%p addr=%lx pte=%016lx\n",
-			       pte, addr, pfn_pte(addr >> PAGE_SHIFT, PAGE_KERNEL).pte);
+			pr_info("   pte=%p addr=%lx pte=%016lx\n", pte, paddr,
+				pfn_pte(paddr >> PAGE_SHIFT, PAGE_KERNEL).pte);
 		pages++;
-		set_pte(pte, pfn_pte(addr >> PAGE_SHIFT, prot));
-		last_map_addr = (addr & PAGE_MASK) + PAGE_SIZE;
+		set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, prot));
+		paddr_last = (paddr & PAGE_MASK) + PAGE_SIZE;
 	}
 
 	update_page_count(PG_LEVEL_4K, pages);
 
-	return last_map_addr;
+	return paddr_last;
 }
 
+/*
+ * Create PMD level page table mapping for physical addresses. The virtual
+ * and physical address have to be aligned at this level.
+ * It returns the last physical address mapped.
+ */
 static unsigned long __meminit
-phys_pmd_init(pmd_t *pmd_page, unsigned long address, unsigned long end,
+phys_pmd_init(pmd_t *pmd_page, unsigned long paddr, unsigned long paddr_end,
 	      unsigned long page_size_mask, pgprot_t prot)
 {
-	unsigned long pages = 0, next;
-	unsigned long last_map_addr = end;
+	unsigned long pages = 0, paddr_next;
+	unsigned long paddr_last = paddr_end;
 
-	int i = pmd_index(address);
+	int i = pmd_index(paddr);
 
-	for (; i < PTRS_PER_PMD; i++, address = next) {
-		pmd_t *pmd = pmd_page + pmd_index(address);
+	for (; i < PTRS_PER_PMD; i++, paddr = paddr_next) {
+		pmd_t *pmd = pmd_page + pmd_index(paddr);
 		pte_t *pte;
 		pgprot_t new_prot = prot;
 
-		next = (address & PMD_MASK) + PMD_SIZE;
-		if (address >= end) {
+		paddr_next = (paddr & PMD_MASK) + PMD_SIZE;
+		if (paddr >= paddr_end) {
 			if (!after_bootmem &&
-			    !e820_any_mapped(address & PMD_MASK, next, E820_RAM) &&
-			    !e820_any_mapped(address & PMD_MASK, next, E820_RESERVED_KERN))
+			    !e820_any_mapped(paddr & PMD_MASK, paddr_next,
+					     E820_RAM) &&
+			    !e820_any_mapped(paddr & PMD_MASK, paddr_next,
+					     E820_RESERVED_KERN))
 				set_pmd(pmd, __pmd(0));
 			continue;
 		}
@@ -400,8 +415,8 @@ phys_pmd_init(pmd_t *pmd_page, unsigned long address, unsigned long end,
 			if (!pmd_large(*pmd)) {
 				spin_lock(&init_mm.page_table_lock);
 				pte = (pte_t *)pmd_page_vaddr(*pmd);
-				last_map_addr = phys_pte_init(pte, address,
-								end, prot);
+				paddr_last = phys_pte_init(pte, paddr,
+							   paddr_end, prot);
 				spin_unlock(&init_mm.page_table_lock);
 				continue;
 			}
@@ -420,7 +435,7 @@ phys_pmd_init(pmd_t *pmd_page, unsigned long address, unsigned long end,
 			if (page_size_mask & (1 << PG_LEVEL_2M)) {
 				if (!after_bootmem)
 					pages++;
-				last_map_addr = next;
+				paddr_last = paddr_next;
 				continue;
 			}
 			new_prot = pte_pgprot(pte_clrhuge(*(pte_t *)pmd));
@@ -430,42 +445,49 @@ phys_pmd_init(pmd_t *pmd_page, unsigned long address, unsigned long end,
 			pages++;
 			spin_lock(&init_mm.page_table_lock);
 			set_pte((pte_t *)pmd,
-				pfn_pte((address & PMD_MASK) >> PAGE_SHIFT,
+				pfn_pte((paddr & PMD_MASK) >> PAGE_SHIFT,
 					__pgprot(pgprot_val(prot) | _PAGE_PSE)));
 			spin_unlock(&init_mm.page_table_lock);
-			last_map_addr = next;
+			paddr_last = paddr_next;
 			continue;
 		}
 
 		pte = alloc_low_page();
-		last_map_addr = phys_pte_init(pte, address, end, new_prot);
+		paddr_last = phys_pte_init(pte, paddr, paddr_end, new_prot);
 
 		spin_lock(&init_mm.page_table_lock);
 		pmd_populate_kernel(&init_mm, pmd, pte);
 		spin_unlock(&init_mm.page_table_lock);
 	}
 	update_page_count(PG_LEVEL_2M, pages);
-	return last_map_addr;
+	return paddr_last;
 }
 
+/*
+ * Create PUD level page table mapping for physical addresses. The virtual
+ * and physical address have to be aligned at this level.
+ * It returns the last physical address mapped.
+ */
 static unsigned long __meminit
-phys_pud_init(pud_t *pud_page, unsigned long addr, unsigned long end,
-			 unsigned long page_size_mask)
+phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,
+	      unsigned long page_size_mask)
 {
-	unsigned long pages = 0, next;
-	unsigned long last_map_addr = end;
-	int i = pud_index(addr);
+	unsigned long pages = 0, paddr_next;
+	unsigned long paddr_last = paddr_end;
+	int i = pud_index(paddr);
 
-	for (; i < PTRS_PER_PUD; i++, addr = next) {
-		pud_t *pud = pud_page + pud_index(addr);
+	for (; i < PTRS_PER_PUD; i++, paddr = paddr_next) {
+		pud_t *pud = pud_page + pud_index(paddr);
 		pmd_t *pmd;
 		pgprot_t prot = PAGE_KERNEL;
 
-		next = (addr & PUD_MASK) + PUD_SIZE;
-		if (addr >= end) {
+		paddr_next = (paddr & PUD_MASK) + PUD_SIZE;
+		if (paddr >= paddr_end) {
 			if (!after_bootmem &&
-			    !e820_any_mapped(addr & PUD_MASK, next, E820_RAM) &&
-			    !e820_any_mapped(addr & PUD_MASK, next, E820_RESERVED_KERN))
+			    !e820_any_mapped(paddr & PUD_MASK, paddr_next,
+					     E820_RAM) &&
+			    !e820_any_mapped(paddr & PUD_MASK, paddr_next,
+					     E820_RESERVED_KERN))
 				set_pud(pud, __pud(0));
 			continue;
 		}
@@ -473,8 +495,10 @@ phys_pud_init(pud_t *pud_page, unsigned long addr, unsigned long end,
 		if (pud_val(*pud)) {
 			if (!pud_large(*pud)) {
 				pmd = pmd_offset(pud, 0);
-				last_map_addr = phys_pmd_init(pmd, addr, end,
-							 page_size_mask, prot);
+				paddr_last = phys_pmd_init(pmd, paddr,
+							   paddr_end,
+							   page_size_mask,
+							   prot);
 				__flush_tlb_all();
 				continue;
 			}
@@ -493,7 +517,7 @@ phys_pud_init(pud_t *pud_page, unsigned long addr, unsigned long end,
 			if (page_size_mask & (1 << PG_LEVEL_1G)) {
 				if (!after_bootmem)
 					pages++;
-				last_map_addr = next;
+				paddr_last = paddr_next;
 				continue;
 			}
 			prot = pte_pgprot(pte_clrhuge(*(pte_t *)pud));
@@ -503,16 +527,16 @@ phys_pud_init(pud_t *pud_page, unsigned long addr, unsigned long end,
 			pages++;
 			spin_lock(&init_mm.page_table_lock);
 			set_pte((pte_t *)pud,
-				pfn_pte((addr & PUD_MASK) >> PAGE_SHIFT,
+				pfn_pte((paddr & PUD_MASK) >> PAGE_SHIFT,
 					PAGE_KERNEL_LARGE));
 			spin_unlock(&init_mm.page_table_lock);
-			last_map_addr = next;
+			paddr_last = paddr_next;
 			continue;
 		}
 
 		pmd = alloc_low_page();
-		last_map_addr = phys_pmd_init(pmd, addr, end, page_size_mask,
-					      prot);
+		paddr_last = phys_pmd_init(pmd, paddr, paddr_end,
+					   page_size_mask, prot);
 
 		spin_lock(&init_mm.page_table_lock);
 		pud_populate(&init_mm, pud, pmd);
@@ -522,38 +546,44 @@ phys_pud_init(pud_t *pud_page, unsigned long addr, unsigned long end,
 
 	update_page_count(PG_LEVEL_1G, pages);
 
-	return last_map_addr;
+	return paddr_last;
 }
 
+/*
+ * Create page table mapping for the physical memory for specific physical
+ * addresses. The virtual and physical addresses have to be aligned on PUD level
+ * down. It returns the last physical address mapped.
+ */
 unsigned long __meminit
-kernel_physical_mapping_init(unsigned long start,
-			     unsigned long end,
+kernel_physical_mapping_init(unsigned long paddr_start,
+			     unsigned long paddr_end,
 			     unsigned long page_size_mask)
 {
 	bool pgd_changed = false;
-	unsigned long next, last_map_addr = end;
-	unsigned long addr;
+	unsigned long vaddr, vaddr_start, vaddr_end, vaddr_next, paddr_last;
 
-	start = (unsigned long)__va(start);
-	end = (unsigned long)__va(end);
-	addr = start;
+	paddr_last = paddr_end;
+	vaddr = (unsigned long)__va(paddr_start);
+	vaddr_end = (unsigned long)__va(paddr_end);
+	vaddr_start = vaddr;
 
-	for (; start < end; start = next) {
-		pgd_t *pgd = pgd_offset_k(start);
+	for (; vaddr < vaddr_end; vaddr = vaddr_next) {
+		pgd_t *pgd = pgd_offset_k(vaddr);
 		pud_t *pud;
 
-		next = (start & PGDIR_MASK) + PGDIR_SIZE;
+		vaddr_next = (vaddr & PGDIR_MASK) + PGDIR_SIZE;
 
 		if (pgd_val(*pgd)) {
 			pud = (pud_t *)pgd_page_vaddr(*pgd);
-			last_map_addr = phys_pud_init(pud, __pa(start),
-						 __pa(end), page_size_mask);
+			paddr_last = phys_pud_init(pud, __pa(vaddr),
+						   __pa(vaddr_end),
+						   page_size_mask);
 			continue;
 		}
 
 		pud = alloc_low_page();
-		last_map_addr = phys_pud_init(pud, __pa(start), __pa(end),
-						 page_size_mask);
+		paddr_last = phys_pud_init(pud, __pa(vaddr), __pa(vaddr_end),
+					   page_size_mask);
 
 		spin_lock(&init_mm.page_table_lock);
 		pgd_populate(&init_mm, pgd, pud);
@@ -562,11 +592,11 @@ kernel_physical_mapping_init(unsigned long start,
 	}
 
 	if (pgd_changed)
-		sync_global_pgds(addr, end - 1, 0);
+		sync_global_pgds(vaddr_start, vaddr_end - 1, 0);
 
 	__flush_tlb_all();
 
-	return last_map_addr;
+	return paddr_last;
 }
 
 #ifndef CONFIG_NUMA
-- 
2.7.4
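
A brief illustration of why the paddr/vaddr split matters (a hedged
sketch; the constant is the classic unrandomized x86_64 __PAGE_OFFSET,
and the helper names are made up):

    #define EXAMPLE_PAGE_OFFSET	0xffff880000000000UL

    /* In the direct mapping, __va()/__pa() are a fixed translation,
     * which is why the old code could mix the two address spaces. */
    static inline unsigned long example_va(unsigned long paddr)
    {
    	return paddr + EXAMPLE_PAGE_OFFSET;
    }

    static inline unsigned long example_pa(unsigned long vaddr)
    {
    	return vaddr - EXAMPLE_PAGE_OFFSET;
    }

Once the base is randomized, the offset is still constant within one
boot but no longer a build-time constant, so keeping paddr_* and
vaddr_* names distinct guards against silent mix-ups.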

* [PATCH v7 3/9] x86/mm: PUD VA support for physical mapping (x86_64)
From: Kees Cook @ 2016-06-22  0:47 UTC
  To: Ingo Molnar
  Cc: Kees Cook, Thomas Garnier, Andy Lutomirski, x86, Borislav Petkov,
	Baoquan He, Yinghai Lu, Juergen Gross, Matt Fleming, Toshi Kani,
	Andrew Morton, Dan Williams, Kirill A. Shutemov, Dave Hansen,
	Xiao Guangrong, Martin Schwidefsky, Aneesh Kumar K.V,
	Alexander Kuleshov, Alexander Popov, Dave Young, Joerg Roedel,
	Lv Zheng, Mark Salter, Dmitry Vyukov, Stephen Smalley,
	Boris Ostrovsky, Christian Borntraeger, Jan Beulich,
	linux-kernel, Jonathan Corbet, linux-doc, kernel-hardening

From: Thomas Garnier <thgarnie@google.com>

Minor change that allows early boot physical mapping of PUD level virtual
addresses. The current implementation expects the virtual address to be
PUD aligned. For KASLR memory randomization, we need to be able to
randomize the offset used on the PUD table.

It has no impact on current usage.

Signed-off-by: Thomas Garnier <thgarnie@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 arch/x86/mm/init_64.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 6714712bd5da..7bf1ddb54537 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -465,7 +465,8 @@ phys_pmd_init(pmd_t *pmd_page, unsigned long paddr, unsigned long paddr_end,
 
 /*
  * Create PUD level page table mapping for physical addresses. The virtual
- * and physical address have to be aligned at this level.
+ * and physical address do not have to be aligned at this level. KASLR can
+ * randomize virtual addresses up to this level.
  * It returns the last physical address mapped.
  */
 static unsigned long __meminit
@@ -474,14 +475,18 @@ phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,
 {
 	unsigned long pages = 0, paddr_next;
 	unsigned long paddr_last = paddr_end;
-	int i = pud_index(paddr);
+	unsigned long vaddr = (unsigned long)__va(paddr);
+	int i = pud_index(vaddr);
 
 	for (; i < PTRS_PER_PUD; i++, paddr = paddr_next) {
-		pud_t *pud = pud_page + pud_index(paddr);
+		pud_t *pud;
 		pmd_t *pmd;
 		pgprot_t prot = PAGE_KERNEL;
 
+		vaddr = (unsigned long)__va(paddr);
+		pud = pud_page + pud_index(vaddr);
 		paddr_next = (paddr & PUD_MASK) + PUD_SIZE;
+
 		if (paddr >= paddr_end) {
 			if (!after_bootmem &&
 			    !e820_any_mapped(paddr & PUD_MASK, paddr_next,
@@ -551,7 +556,7 @@ phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,
 
 /*
  * Create page table mapping for the physical memory for specific physical
- * addresses. The virtual and physical addresses have to be aligned on PUD level
+ * addresses. The virtual and physical addresses have to be aligned on PMD level
  * down. It returns the last physical address mapped.
  */
 unsigned long __meminit
-- 
2.7.4
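
To see why the PUD index must now come from the virtual address (a
hedged sketch using the standard x86_64 constants):

    #define PUD_SHIFT	30	/* each PUD entry covers 1GB */
    #define PTRS_PER_PUD	512

    static inline unsigned long pud_index(unsigned long vaddr)
    {
    	return (vaddr >> PUD_SHIFT) & (PTRS_PER_PUD - 1);
    }

With a PGD-aligned mapping base, bits 38:30 of the virtual and physical
addresses match, so indexing the PUD page by paddr happened to pick the
right slot. Once the base is randomized at PUD granularity those bits
can differ, and only pud_index(__va(paddr)) is correct.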

* [PATCH v7 4/9] x86/mm: Separate variable for trampoline PGD (x86_64)
From: Kees Cook @ 2016-06-22  0:47 UTC
  To: Ingo Molnar
  Cc: Kees Cook, Thomas Garnier, Andy Lutomirski, x86, Borislav Petkov,
	Baoquan He, Yinghai Lu, Juergen Gross, Matt Fleming, Toshi Kani,
	Andrew Morton, Dan Williams, Kirill A. Shutemov, Dave Hansen,
	Xiao Guangrong, Martin Schwidefsky, Aneesh Kumar K.V,
	Alexander Kuleshov, Alexander Popov, Dave Young, Joerg Roedel,
	Lv Zheng, Mark Salter, Dmitry Vyukov, Stephen Smalley,
	Boris Ostrovsky, Christian Borntraeger, Jan Beulich,
	linux-kernel, Jonathan Corbet, linux-doc, kernel-hardening

From: Thomas Garnier <thgarnie@google.com>

Use a separate global variable to define the trampoline PGD used to
start other processors. This change will allow KASLR memory
randomization to modify the trampoline PGD so that it is correctly
aligned with physical memory.

Signed-off-by: Thomas Garnier <thgarnie@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 arch/x86/include/asm/pgtable.h | 12 ++++++++++++
 arch/x86/mm/init.c             |  3 +++
 arch/x86/realmode/init.c       |  5 ++++-
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 1a27396b6ea0..d455bef39e9c 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -729,6 +729,18 @@ extern int direct_gbpages;
 void init_mem_mapping(void);
 void early_alloc_pgt_buf(void);
 
+#ifdef CONFIG_X86_64
+/* Realmode trampoline initialization. */
+extern pgd_t trampoline_pgd_entry;
+static inline void __meminit init_trampoline(void)
+{
+	/* Default trampoline pgd value */
+	trampoline_pgd_entry = init_level4_pgt[pgd_index(__PAGE_OFFSET)];
+}
+#else
+static inline void init_trampoline(void) { }
+#endif
+
 /* local pte updates need not use xchg for locking */
 static inline pte_t native_local_ptep_get_and_clear(pte_t *ptep)
 {
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 372aad2b3291..4252acdfcbbd 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -590,6 +590,9 @@ void __init init_mem_mapping(void)
 	/* the ISA range is always mapped regardless of memory holes */
 	init_memory_mapping(0, ISA_END_ADDRESS);
 
+	/* Init the trampoline, possibly with KASLR memory offset */
+	init_trampoline();
+
 	/*
 	 * If the allocation is in bottom-up direction, we setup direct mapping
 	 * in bottom-up, otherwise we setup direct mapping in top-down.
diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
index 0b7a63d98440..705e3fffb4a1 100644
--- a/arch/x86/realmode/init.c
+++ b/arch/x86/realmode/init.c
@@ -8,6 +8,9 @@
 struct real_mode_header *real_mode_header;
 u32 *trampoline_cr4_features;
 
+/* Hold the pgd entry used on booting additional CPUs */
+pgd_t trampoline_pgd_entry;
+
 void __init reserve_real_mode(void)
 {
 	phys_addr_t mem;
@@ -84,7 +87,7 @@ void __init setup_real_mode(void)
 	*trampoline_cr4_features = __read_cr4();
 
 	trampoline_pgd = (u64 *) __va(real_mode_header->trampoline_pgd);
-	trampoline_pgd[0] = init_level4_pgt[pgd_index(__PAGE_OFFSET)].pgd;
+	trampoline_pgd[0] = trampoline_pgd_entry.pgd;
 	trampoline_pgd[511] = init_level4_pgt[511].pgd;
 #endif
 }
-- 
2.7.4
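
For context, a hedged sketch of the kind of override the randomized
setup can install later (illustrative only; the series' KASLR code
handles a direct-mapping base that is not PGD-aligned by building its
own page tables rather than copying a single entry like this):

    /* Copy the PGD entry that the (possibly moved) direct mapping
     * actually occupies, so the trampoline still reaches low memory.
     * This shortcut is only valid while the base stays PGD-aligned. */
    static void __meminit init_trampoline_simple(void)
    {
    	unsigned long vaddr = (unsigned long)__va(0);

    	trampoline_pgd_entry = init_level4_pgt[pgd_index(vaddr)];
    }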

* [PATCH v7 5/9] x86/mm: Implement ASLR for kernel memory regions (x86_64)
  2016-06-22  0:46 ` [kernel-hardening] " Kees Cook
@ 2016-06-22  0:47   ` Kees Cook
  -1 siblings, 0 replies; 74+ messages in thread
From: Kees Cook @ 2016-06-22  0:47 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Kees Cook, Thomas Garnier, Andy Lutomirski, x86, Borislav Petkov,
	Baoquan He, Yinghai Lu, Juergen Gross, Matt Fleming, Toshi Kani,
	Andrew Morton, Dan Williams, Kirill A. Shutemov, Dave Hansen,
	Xiao Guangrong, Martin Schwidefsky, Aneesh Kumar K.V,
	Alexander Kuleshov, Alexander Popov, Dave Young, Joerg Roedel,
	Lv Zheng, Mark Salter, Dmitry Vyukov, Stephen Smalley,
	Boris Ostrovsky, Christian Borntraeger, Jan Beulich,
	linux-kernel, Jonathan Corbet, linux-doc, kernel-hardening

From: Thomas Garnier <thgarnie@google.com>

Randomizes the virtual address space of kernel memory regions for
x86_64. This first patch adds the infrastructure and does not randomize
any region. The following patches will randomize the physical memory
mapping, vmalloc and vmemmap regions.

This security feature mitigates exploits relying on predictable kernel
addresses. These addresses can be used to disclose the kernel modules
base addresses or corrupt specific structures to elevate privileges
bypassing the current implementation of KASLR. This feature can be
enabled with the CONFIG_RANDOMIZE_MEMORY option.

The order of the memory regions is not changed. The feature computes the
available space for the regions based on the different configuration
options, then randomizes both the base of each region and the padding
between them. The size of the physical memory mapping region is the
available physical memory. No performance impact was detected while
testing the feature.

Entropy is generated using the KASLR early boot functions now shared in
the lib directory (originally written by Kees Cook). Randomization is
done at the PGD & PUD page table levels to increase the number of possible
addresses. The physical memory mapping code was adapted to support PUD
level virtual addresses. In the best configuration, this implementation
provides on average 30,000 possible virtual addresses for each memory
region (a standalone sketch of the entropy-splitting loop follows this
patch). An additional low memory page is used to ensure each CPU can
start with a PGD aligned virtual address (for realmode).

x86/dump_pagetables was updated to correctly display each region.

The x86_64 memory layout documentation was updated accordingly.

Performance data, after all patches in the series:

Kernbench shows almost no difference (+/- less than 1%):

Before:

Average Optimal load -j 12 Run (std deviation): Elapsed Time 102.63 (1.2695)
User Time 1034.89 (1.18115) System Time 87.056 (0.456416) Percent CPU 1092.9
(13.892) Context Switches 199805 (3455.33) Sleeps 97907.8 (900.636)

After:

Average Optimal load -j 12 Run (std deviation): Elapsed Time 102.489 (1.10636)
User Time 1034.86 (1.36053) System Time 87.764 (0.49345) Percent CPU 1095
(12.7715) Context Switches 199036 (4298.1) Sleeps 97681.6 (1031.11)

Hackbench shows 0% difference on average (hackbench 90 repeated 10 times):

attempt,before,after
1,0.076,0.069
2,0.072,0.069
3,0.066,0.066
4,0.066,0.068
5,0.066,0.067
6,0.066,0.069
7,0.067,0.066
8,0.063,0.067
9,0.067,0.065
10,0.068,0.071
average,0.0677,0.0677

Signed-off-by: Thomas Garnier <thgarnie@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 Documentation/x86/x86_64/mm.txt |   4 ++
 arch/x86/Kconfig                |  17 +++++
 arch/x86/include/asm/kaslr.h    |   6 ++
 arch/x86/include/asm/pgtable.h  |   7 +-
 arch/x86/kernel/setup.c         |   3 +
 arch/x86/mm/Makefile            |   1 +
 arch/x86/mm/dump_pagetables.c   |  16 +++--
 arch/x86/mm/init.c              |   1 +
 arch/x86/mm/kaslr.c             | 152 ++++++++++++++++++++++++++++++++++++++++
 9 files changed, 202 insertions(+), 5 deletions(-)
 create mode 100644 arch/x86/mm/kaslr.c

diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt
index 5aa738346062..8c7dd5957ae1 100644
--- a/Documentation/x86/x86_64/mm.txt
+++ b/Documentation/x86/x86_64/mm.txt
@@ -39,4 +39,8 @@ memory window (this size is arbitrary, it can be raised later if needed).
 The mappings are not part of any other kernel PGD and are only available
 during EFI runtime calls.
 
+Note that if CONFIG_RANDOMIZE_MEMORY is enabled, the direct mapping of all
+physical memory, vmalloc/ioremap space and virtual memory map are randomized.
+Their order is preserved but their bases will be offset early at boot time.
+
 -Andi Kleen, Jul 2004
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 770ae5259dff..adab3fef3bb4 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1993,6 +1993,23 @@ config PHYSICAL_ALIGN
 
 	  Don't change this unless you know what you are doing.
 
+config RANDOMIZE_MEMORY
+	bool "Randomize the kernel memory sections"
+	depends on X86_64
+	depends on RANDOMIZE_BASE
+	default RANDOMIZE_BASE
+	---help---
+	   Randomizes the base virtual address of kernel memory sections
+	   (physical memory mapping, vmalloc & vmemmap). This security feature
+	   makes exploits relying on predictable memory locations less reliable.
+
+	   The order of allocations remains unchanged. Entropy is generated in
+	   the same way as RANDOMIZE_BASE. The current implementation in the
+	   optimal configuration has on average 30,000 different possible
+	   virtual addresses for each memory section.
+
+	   If unsure, say N.
+
 config HOTPLUG_CPU
 	bool "Support for hot-pluggable CPUs"
 	depends on SMP
diff --git a/arch/x86/include/asm/kaslr.h b/arch/x86/include/asm/kaslr.h
index 5547438db5ea..683c9d736314 100644
--- a/arch/x86/include/asm/kaslr.h
+++ b/arch/x86/include/asm/kaslr.h
@@ -3,4 +3,10 @@
 
 unsigned long kaslr_get_random_long(const char *purpose);
 
+#ifdef CONFIG_RANDOMIZE_MEMORY
+void kernel_randomize_memory(void);
+#else
+static inline void kernel_randomize_memory(void) { }
+#endif /* CONFIG_RANDOMIZE_MEMORY */
+
 #endif
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index d455bef39e9c..5472682a307f 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -732,11 +732,16 @@ void early_alloc_pgt_buf(void);
 #ifdef CONFIG_X86_64
 /* Realmode trampoline initialization. */
 extern pgd_t trampoline_pgd_entry;
-static inline void __meminit init_trampoline(void)
+static inline void __meminit init_trampoline_default(void)
 {
 	/* Default trampoline pgd value */
 	trampoline_pgd_entry = init_level4_pgt[pgd_index(__PAGE_OFFSET)];
 }
+# ifdef CONFIG_RANDOMIZE_MEMORY
+void __meminit init_trampoline(void);
+# else
+#  define init_trampoline init_trampoline_default
+# endif
 #else
 static inline void init_trampoline(void) { }
 #endif
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index c4e7b3991b60..a2616584b6e9 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -113,6 +113,7 @@
 #include <asm/prom.h>
 #include <asm/microcode.h>
 #include <asm/mmu_context.h>
+#include <asm/kaslr.h>
 
 /*
  * max_low_pfn_mapped: highest direct mapped pfn under 4GB
@@ -942,6 +943,8 @@ void __init setup_arch(char **cmdline_p)
 
 	x86_init.oem.arch_setup();
 
+	kernel_randomize_memory();
+
 	iomem_resource.end = (1ULL << boot_cpu_data.x86_phys_bits) - 1;
 	setup_memory_map();
 	parse_setup_data();
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index 62c0043a5fd5..96d2b847e09e 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -37,4 +37,5 @@ obj-$(CONFIG_NUMA_EMU)		+= numa_emulation.o
 
 obj-$(CONFIG_X86_INTEL_MPX)	+= mpx.o
 obj-$(CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS) += pkeys.o
+obj-$(CONFIG_RANDOMIZE_MEMORY) += kaslr.o
 
diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
index 99bfb192803f..9a17250bcbe0 100644
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -72,9 +72,9 @@ static struct addr_marker address_markers[] = {
 	{ 0, "User Space" },
 #ifdef CONFIG_X86_64
 	{ 0x8000000000000000UL, "Kernel Space" },
-	{ PAGE_OFFSET,		"Low Kernel Mapping" },
-	{ VMALLOC_START,        "vmalloc() Area" },
-	{ VMEMMAP_START,        "Vmemmap" },
+	{ 0/* PAGE_OFFSET */,   "Low Kernel Mapping" },
+	{ 0/* VMALLOC_START */, "vmalloc() Area" },
+	{ 0/* VMEMMAP_START */, "Vmemmap" },
 # ifdef CONFIG_X86_ESPFIX64
 	{ ESPFIX_BASE_ADDR,	"ESPfix Area", 16 },
 # endif
@@ -434,8 +434,16 @@ void ptdump_walk_pgd_level_checkwx(void)
 
 static int __init pt_dump_init(void)
 {
+	/*
+	 * Various markers are not compile-time constants, so assign them
+	 * here.
+	 */
+#ifdef CONFIG_X86_64
+	address_markers[LOW_KERNEL_NR].start_address = PAGE_OFFSET;
+	address_markers[VMALLOC_START_NR].start_address = VMALLOC_START;
+	address_markers[VMEMMAP_START_NR].start_address = VMEMMAP_START;
+#endif
 #ifdef CONFIG_X86_32
-	/* Not a compile-time constant on x86-32 */
 	address_markers[VMALLOC_START_NR].start_address = VMALLOC_START;
 	address_markers[VMALLOC_END_NR].start_address = VMALLOC_END;
 # ifdef CONFIG_HIGHMEM
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 4252acdfcbbd..cc82830bc8c4 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -17,6 +17,7 @@
 #include <asm/proto.h>
 #include <asm/dma.h>		/* for MAX_DMA_PFN */
 #include <asm/microcode.h>
+#include <asm/kaslr.h>
 
 /*
  * We need to define the tracepoints somewhere, and tlb.c
diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
new file mode 100644
index 000000000000..d5380a48e8fb
--- /dev/null
+++ b/arch/x86/mm/kaslr.c
@@ -0,0 +1,152 @@
+/*
+ * This file implements KASLR memory randomization for x86_64. It randomizes
+ * the virtual address space of kernel memory regions (physical memory
+ * mapping, vmalloc & vmemmap) for x86_64. This security feature mitigates
+ * exploits relying on predictable kernel addresses.
+ *
+ * Entropy is generated using the KASLR early boot functions now shared in
+ * the lib directory (originally written by Kees Cook). Randomization is
+ * done on PGD & PUD page table levels to increase possible addresses. The
+ * physical memory mapping code was adapted to support PUD level virtual
+ * addresses. In the best configuration, this implementation provides on
+ * average 30,000 possible virtual addresses for each memory region. An
+ * additional low memory page is used to ensure each CPU can start with a
+ * PGD aligned virtual address (for realmode).
+ *
+ * The order of each memory region is not changed. The feature looks at
+ * the available space for the regions based on different configuration
+ * options and randomizes the base and space between each. The size of the
+ * physical memory mapping is the available physical memory.
+ */
+
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/random.h>
+
+#include <asm/pgalloc.h>
+#include <asm/pgtable.h>
+#include <asm/setup.h>
+#include <asm/kaslr.h>
+
+#include "mm_internal.h"
+
+#define TB_SHIFT 40
+
+/*
+ * Virtual address start and end range for randomization. The end changes
+ * based on configuration to provide the highest amount of space for
+ * randomization. It increases the possible random position for each
+ * randomized region.
+ *
+ * You need to add an #ifdef entry if you introduce a new memory region
+ * compatible with KASLR. Your entry must be in logical order with memory
+ * layout. For example, ESPFIX is before EFI because its virtual address is
+ * before. You also need to add a BUILD_BUG_ON in kernel_randomize_memory to
+ * ensure that this order is correct and won't be changed.
+ */
+static const unsigned long vaddr_start;
+static const unsigned long vaddr_end;
+
+/*
+ * Memory regions randomized by KASLR (except modules that use a separate logic
+ * earlier during boot). The list is ordered based on virtual addresses. This
+ * order is kept after randomization.
+ */
+static __initdata struct kaslr_memory_region {
+	unsigned long *base;
+	unsigned long size_tb;
+} kaslr_regions[] = {
+};
+
+/* Get size in bytes used by the memory region */
+static inline unsigned long get_padding(struct kaslr_memory_region *region)
+{
+	return (region->size_tb << TB_SHIFT);
+}
+
+/*
+ * Apply no randomization if KASLR was disabled at boot or if KASAN
+ * is enabled. KASAN shadow mappings rely on regions being PGD aligned.
+ */
+static inline bool kaslr_memory_enabled(void)
+{
+	return kaslr_enabled() && !config_enabled(CONFIG_KASAN);
+}
+
+/* Initialize base and padding for each memory region randomized with KASLR */
+void __init kernel_randomize_memory(void)
+{
+	size_t i;
+	unsigned long vaddr = vaddr_start;
+	unsigned long rand;
+	struct rnd_state rand_state;
+	unsigned long remain_entropy;
+
+	if (!kaslr_memory_enabled())
+		return;
+
+	/* Calculate entropy available between regions */
+	remain_entropy = vaddr_end - vaddr_start;
+	for (i = 0; i < ARRAY_SIZE(kaslr_regions); i++)
+		remain_entropy -= get_padding(&kaslr_regions[i]);
+
+	prandom_seed_state(&rand_state, kaslr_get_random_long("Memory"));
+
+	for (i = 0; i < ARRAY_SIZE(kaslr_regions); i++) {
+		unsigned long entropy;
+
+		/*
+		 * Select a random virtual address using the extra entropy
+		 * available.
+		 */
+		entropy = remain_entropy / (ARRAY_SIZE(kaslr_regions) - i);
+		prandom_bytes_state(&rand_state, &rand, sizeof(rand));
+		entropy = (rand % (entropy + 1)) & PUD_MASK;
+		vaddr += entropy;
+		*kaslr_regions[i].base = vaddr;
+
+		/*
+		 * Skip over the region and add minimum padding based on the
+		 * randomization alignment.
+		 */
+		vaddr += get_padding(&kaslr_regions[i]);
+		vaddr = round_up(vaddr + 1, PUD_SIZE);
+		remain_entropy -= entropy;
+	}
+}
+
+/*
+ * Create PGD aligned trampoline table to allow real mode initialization
+ * of additional CPUs. It consumes only one low memory page.
+ */
+void __meminit init_trampoline(void)
+{
+	unsigned long paddr, paddr_next;
+	pgd_t *pgd;
+	pud_t *pud_page, *pud_page_tramp;
+	int i;
+
+	if (!kaslr_memory_enabled()) {
+		init_trampoline_default();
+		return;
+	}
+
+	pud_page_tramp = alloc_low_page();
+
+	paddr = 0;
+	pgd = pgd_offset_k((unsigned long)__va(paddr));
+	pud_page = (pud_t *) pgd_page_vaddr(*pgd);
+
+	for (i = pud_index(paddr); i < PTRS_PER_PUD; i++, paddr = paddr_next) {
+		pud_t *pud, *pud_tramp;
+		unsigned long vaddr = (unsigned long)__va(paddr);
+
+		pud_tramp = pud_page_tramp + pud_index(paddr);
+		pud = pud_page + pud_index(vaddr);
+		paddr_next = (paddr & PUD_MASK) + PUD_SIZE;
+
+		*pud_tramp = *pud;
+	}
+
+	set_pgd(&trampoline_pgd_entry,
+		__pgd(_KERNPG_TABLE | __pa(pud_page_tramp)));
+}
-- 
2.7.4
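
Here is a standalone sketch (plain C, buildable in userspace) of the
entropy-splitting loop in kernel_randomize_memory() above. The region sizes
and the seeded random() are stand-ins; the kernel seeds a prandom state from
kaslr_get_random_long() instead:

#include <stdio.h>
#include <stdlib.h>

#define PUD_SHIFT	30
#define PUD_SIZE	(1UL << PUD_SHIFT)
#define PUD_MASK	(~(PUD_SIZE - 1))
#define TB_SHIFT	40

int main(void)
{
	unsigned long vaddr_start = 0xffff880000000000UL;
	unsigned long vaddr_end   = 0xffffc90000000000UL;
	unsigned long size_tb[]   = { 8, 32, 1 };	/* example sizes */
	int nr = sizeof(size_tb) / sizeof(size_tb[0]);
	unsigned long vaddr = vaddr_start;
	unsigned long remain = vaddr_end - vaddr_start;
	int i;

	/* Entropy is what is left once the regions themselves fit. */
	for (i = 0; i < nr; i++)
		remain -= size_tb[i] << TB_SHIFT;

	srandom(42);	/* weak stand-in for the boot-time KASLR seed */
	for (i = 0; i < nr; i++) {
		/* Give each region an even share of the remaining space. */
		unsigned long entropy = remain / (nr - i);

		entropy = ((unsigned long)random() % (entropy + 1)) & PUD_MASK;
		vaddr += entropy;
		printf("region %d base: %#lx\n", i, vaddr);

		/* Skip the region plus minimum PUD-aligned padding. */
		vaddr += size_tb[i] << TB_SHIFT;
		vaddr = (vaddr + PUD_SIZE) & PUD_MASK;	/* round_up(vaddr + 1) */
		remain -= entropy;
	}
	return 0;
}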

* [PATCH v7 6/9] x86/mm: Enable KASLR for physical mapping memory region (x86_64)
  2016-06-22  0:46 ` [kernel-hardening] " Kees Cook
@ 2016-06-22  0:47   ` Kees Cook
  -1 siblings, 0 replies; 74+ messages in thread
From: Kees Cook @ 2016-06-22  0:47 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Kees Cook, Thomas Garnier, Andy Lutomirski, x86, Borislav Petkov,
	Baoquan He, Yinghai Lu, Juergen Gross, Matt Fleming, Toshi Kani,
	Andrew Morton, Dan Williams, Kirill A. Shutemov, Dave Hansen,
	Xiao Guangrong, Martin Schwidefsky, Aneesh Kumar K.V,
	Alexander Kuleshov, Alexander Popov, Dave Young, Joerg Roedel,
	Lv Zheng, Mark Salter, Dmitry Vyukov, Stephen Smalley,
	Boris Ostrovsky, Christian Borntraeger, Jan Beulich,
	linux-kernel, Jonathan Corbet, linux-doc, kernel-hardening

From: Thomas Garnier <thgarnie@google.com>

Add the physical mapping to the list of randomized memory regions.

The physical memory mapping holds most allocations from boot and heap
allocators. Knowing the base address and physical memory size, an attacker
can deduce the PDE virtual address for the vDSO memory page. This attack
was demonstrated at CanSecWest 2016, in the "Getting Physical: Extreme
Abuse of Intel Based Paged Systems" https://goo.gl/ANpWdV (see second part
of the presentation). The exploits used against Linux worked successfully
against 4.6+ but fail with KASLR memory enabled (https://goo.gl/iTtXMJ).
Similar research was done at Google leading to this patch proposal.
Variants exist that overwrite /proc or /sys object ACLs, leading to
elevation of privileges. These variants were tested against 4.6+.

The page offset used by the compressed kernel retains the static value
since it is not yet randomized during this boot stage.

Signed-off-by: Thomas Garnier <thgarnie@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 arch/x86/boot/compressed/pagetable.c |  3 +++
 arch/x86/include/asm/kaslr.h         |  2 ++
 arch/x86/include/asm/page_64_types.h | 11 ++++++++++-
 arch/x86/kernel/head_64.S            |  2 +-
 arch/x86/mm/kaslr.c                  | 18 +++++++++++++++---
 5 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/arch/x86/boot/compressed/pagetable.c b/arch/x86/boot/compressed/pagetable.c
index 6e31a6aac4d3..56589d0a804b 100644
--- a/arch/x86/boot/compressed/pagetable.c
+++ b/arch/x86/boot/compressed/pagetable.c
@@ -20,6 +20,9 @@
 /* These actually do the work of building the kernel identity maps. */
 #include <asm/init.h>
 #include <asm/pgtable.h>
+/* Use the static base for this part of the boot process */
+#undef __PAGE_OFFSET
+#define __PAGE_OFFSET __PAGE_OFFSET_BASE
 #include "../../mm/ident_map.c"
 
 /* Used by pgtable.h asm code to force instruction serialization. */
diff --git a/arch/x86/include/asm/kaslr.h b/arch/x86/include/asm/kaslr.h
index 683c9d736314..62b1b815a83a 100644
--- a/arch/x86/include/asm/kaslr.h
+++ b/arch/x86/include/asm/kaslr.h
@@ -4,6 +4,8 @@
 unsigned long kaslr_get_random_long(const char *purpose);
 
 #ifdef CONFIG_RANDOMIZE_MEMORY
+extern unsigned long page_offset_base;
+
 void kernel_randomize_memory(void);
 #else
 static inline void kernel_randomize_memory(void) { }
diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index d5c2f8b40faa..9215e0527647 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -1,6 +1,10 @@
 #ifndef _ASM_X86_PAGE_64_DEFS_H
 #define _ASM_X86_PAGE_64_DEFS_H
 
+#ifndef __ASSEMBLY__
+#include <asm/kaslr.h>
+#endif
+
 #ifdef CONFIG_KASAN
 #define KASAN_STACK_ORDER 1
 #else
@@ -32,7 +36,12 @@
  * hypervisor to fit.  Choosing 16 slots here is arbitrary, but it's
  * what Xen requires.
  */
-#define __PAGE_OFFSET           _AC(0xffff880000000000, UL)
+#define __PAGE_OFFSET_BASE      _AC(0xffff880000000000, UL)
+#ifdef CONFIG_RANDOMIZE_MEMORY
+#define __PAGE_OFFSET           page_offset_base
+#else
+#define __PAGE_OFFSET           __PAGE_OFFSET_BASE
+#endif /* CONFIG_RANDOMIZE_MEMORY */
 
 #define __START_KERNEL_map	_AC(0xffffffff80000000, UL)
 
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 5df831ef1442..03a2aa067ff3 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -38,7 +38,7 @@
 
 #define pud_index(x)	(((x) >> PUD_SHIFT) & (PTRS_PER_PUD-1))
 
-L4_PAGE_OFFSET = pgd_index(__PAGE_OFFSET)
+L4_PAGE_OFFSET = pgd_index(__PAGE_OFFSET_BASE)
 L4_START_KERNEL = pgd_index(__START_KERNEL_map)
 L3_START_KERNEL = pud_index(__START_KERNEL_map)
 
diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index d5380a48e8fb..609ecf2b37ed 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -43,8 +43,12 @@
  * before. You also need to add a BUILD_BUG_ON in kernel_randomize_memory to
  * ensure that this order is correct and won't be changed.
  */
-static const unsigned long vaddr_start;
-static const unsigned long vaddr_end;
+static const unsigned long vaddr_start = __PAGE_OFFSET_BASE;
+static const unsigned long vaddr_end = VMALLOC_START;
+
+/* Default values */
+unsigned long page_offset_base = __PAGE_OFFSET_BASE;
+EXPORT_SYMBOL(page_offset_base);
 
 /*
  * Memory regions randomized by KASLR (except modules that use a separate logic
@@ -55,6 +59,7 @@ static __initdata struct kaslr_memory_region {
 	unsigned long *base;
 	unsigned long size_tb;
 } kaslr_regions[] = {
+	{ &page_offset_base, 64/* Maximum */ },
 };
 
 /* Get size in bytes used by the memory region */
@@ -77,13 +82,20 @@ void __init kernel_randomize_memory(void)
 {
 	size_t i;
 	unsigned long vaddr = vaddr_start;
-	unsigned long rand;
+	unsigned long rand, memory_tb;
 	struct rnd_state rand_state;
 	unsigned long remain_entropy;
 
 	if (!kaslr_memory_enabled())
 		return;
 
+	BUG_ON(kaslr_regions[0].base != &page_offset_base);
+	memory_tb = ((max_pfn << PAGE_SHIFT) >> TB_SHIFT);
+
+	/* Adapt the physical memory region size based on available memory */
+	if (memory_tb < kaslr_regions[0].size_tb)
+		kaslr_regions[0].size_tb = memory_tb;
+
 	/* Calculate entropy available between regions */
 	remain_entropy = vaddr_end - vaddr_start;
 	for (i = 0; i < ARRAY_SIZE(kaslr_regions); i++)
-- 
2.7.4
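
A small userspace sketch of the macro-to-variable indirection used for
__PAGE_OFFSET in the patch above: code keeps using the macro, but its value
can now change at runtime. The offset added in main() is invented for the
demonstration:

#include <stdio.h>

#define PAGE_OFFSET_BASE 0xffff880000000000UL

/* With CONFIG_RANDOMIZE_MEMORY, the macro resolves to a variable... */
unsigned long page_offset_base = PAGE_OFFSET_BASE;
#define PAGE_OFFSET page_offset_base

int main(void)
{
	printf("before: %#lx\n", PAGE_OFFSET);
	/* ...so early boot code can shift the whole direct mapping. */
	page_offset_base += 1UL << 30;	/* pretend KASLR added 1GB */
	printf("after:  %#lx\n", PAGE_OFFSET);
	return 0;
}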

* [PATCH v7 7/9] x86/mm: Enable KASLR for vmalloc memory region (x86_64)
  2016-06-22  0:46 ` [kernel-hardening] " Kees Cook
@ 2016-06-22  0:47   ` Kees Cook
  -1 siblings, 0 replies; 74+ messages in thread
From: Kees Cook @ 2016-06-22  0:47 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Kees Cook, Thomas Garnier, Andy Lutomirski, x86, Borislav Petkov,
	Baoquan He, Yinghai Lu, Juergen Gross, Matt Fleming, Toshi Kani,
	Andrew Morton, Dan Williams, Kirill A. Shutemov, Dave Hansen,
	Xiao Guangrong, Martin Schwidefsky, Aneesh Kumar K.V,
	Alexander Kuleshov, Alexander Popov, Dave Young, Joerg Roedel,
	Lv Zheng, Mark Salter, Dmitry Vyukov, Stephen Smalley,
	Boris Ostrovsky, Christian Borntraeger, Jan Beulich,
	linux-kernel, Jonathan Corbet, linux-doc, kernel-hardening

From: Thomas Garnier <thgarnie@google.com>

Add vmalloc to the list of randomized memory regions.

The vmalloc memory region contains the allocations made through the vmalloc
API. The allocations are done sequentially to prevent fragmentation, so each
allocation address can easily be deduced, especially early after boot.

Signed-off-by: Thomas Garnier <thgarnie@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 arch/x86/include/asm/kaslr.h            |  1 +
 arch/x86/include/asm/pgtable_64_types.h | 15 +++++++++++----
 arch/x86/mm/kaslr.c                     |  5 ++++-
 3 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/kaslr.h b/arch/x86/include/asm/kaslr.h
index 62b1b815a83a..2674ee3de748 100644
--- a/arch/x86/include/asm/kaslr.h
+++ b/arch/x86/include/asm/kaslr.h
@@ -5,6 +5,7 @@ unsigned long kaslr_get_random_long(const char *purpose);
 
 #ifdef CONFIG_RANDOMIZE_MEMORY
 extern unsigned long page_offset_base;
+extern unsigned long vmalloc_base;
 
 void kernel_randomize_memory(void);
 #else
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index e6844dfb4471..6fdef9eef2d5 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -5,6 +5,7 @@
 
 #ifndef __ASSEMBLY__
 #include <linux/types.h>
+#include <asm/kaslr.h>
 
 /*
  * These are used to make use of C type-checking..
@@ -53,10 +54,16 @@ typedef struct { pteval_t pte; } pte_t;
 #define PGDIR_MASK	(~(PGDIR_SIZE - 1))
 
 /* See Documentation/x86/x86_64/mm.txt for a description of the memory map. */
-#define MAXMEM		 _AC(__AC(1, UL) << MAX_PHYSMEM_BITS, UL)
-#define VMALLOC_START    _AC(0xffffc90000000000, UL)
-#define VMALLOC_END      _AC(0xffffe8ffffffffff, UL)
-#define VMEMMAP_START	 _AC(0xffffea0000000000, UL)
+#define MAXMEM		_AC(__AC(1, UL) << MAX_PHYSMEM_BITS, UL)
+#define VMALLOC_SIZE_TB	_AC(32, UL)
+#define __VMALLOC_BASE	_AC(0xffffc90000000000, UL)
+#define VMEMMAP_START	_AC(0xffffea0000000000, UL)
+#ifdef CONFIG_RANDOMIZE_MEMORY
+#define VMALLOC_START	vmalloc_base
+#else
+#define VMALLOC_START	__VMALLOC_BASE
+#endif /* CONFIG_RANDOMIZE_MEMORY */
+#define VMALLOC_END	(VMALLOC_START + _AC((VMALLOC_SIZE_TB << 40) - 1, UL))
 #define MODULES_VADDR    (__START_KERNEL_map + KERNEL_IMAGE_SIZE)
 #define MODULES_END      _AC(0xffffffffff000000, UL)
 #define MODULES_LEN   (MODULES_END - MODULES_VADDR)
diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index 609ecf2b37ed..c939cfe1b516 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -44,11 +44,13 @@
  * ensure that this order is correct and won't be changed.
  */
 static const unsigned long vaddr_start = __PAGE_OFFSET_BASE;
-static const unsigned long vaddr_end = VMALLOC_START;
+static const unsigned long vaddr_end = VMEMMAP_START;
 
 /* Default values */
 unsigned long page_offset_base = __PAGE_OFFSET_BASE;
 EXPORT_SYMBOL(page_offset_base);
+unsigned long vmalloc_base = __VMALLOC_BASE;
+EXPORT_SYMBOL(vmalloc_base);
 
 /*
  * Memory regions randomized by KASLR (except modules that use a separate logic
@@ -60,6 +62,7 @@ static __initdata struct kaslr_memory_region {
 	unsigned long size_tb;
 } kaslr_regions[] = {
 	{ &page_offset_base, 64/* Maximum */ },
+	{ &vmalloc_base, VMALLOC_SIZE_TB },
 };
 
 /* Get size in bytes used by the memory region */
-- 
2.7.4
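
A short sketch of how VMALLOC_END now follows the (possibly randomized)
base after this patch: the end is no longer a fixed constant but
base + 32TB - 1. The constants are the x86_64 defaults from the diff; the
base shift in main() is invented for the demonstration:

#include <stdio.h>

#define VMALLOC_SIZE_TB	32UL

unsigned long vmalloc_base = 0xffffc90000000000UL;
#define VMALLOC_START	vmalloc_base
#define VMALLOC_END	(VMALLOC_START + ((VMALLOC_SIZE_TB << 40) - 1))

int main(void)
{
	/* Unrandomized: end matches the old 0xffffe8ffffffffff constant. */
	printf("start %#lx end %#lx\n", VMALLOC_START, VMALLOC_END);
	vmalloc_base += 1UL << 30;	/* pretend KASLR shifted the base */
	printf("start %#lx end %#lx\n", VMALLOC_START, VMALLOC_END);
	return 0;
}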

* [PATCH v7 8/9] x86/mm: Enable KASLR for vmemmap memory region (x86_64)
  2016-06-22  0:46 ` [kernel-hardening] " Kees Cook
@ 2016-06-22  0:47   ` Kees Cook
  -1 siblings, 0 replies; 74+ messages in thread
From: Kees Cook @ 2016-06-22  0:47 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Kees Cook, Thomas Garnier, Andy Lutomirski, x86, Borislav Petkov,
	Baoquan He, Yinghai Lu, Juergen Gross, Matt Fleming, Toshi Kani,
	Andrew Morton, Dan Williams, Kirill A. Shutemov, Dave Hansen,
	Xiao Guangrong, Martin Schwidefsky, Aneesh Kumar K.V,
	Alexander Kuleshov, Alexander Popov, Dave Young, Joerg Roedel,
	Lv Zheng, Mark Salter, Dmitry Vyukov, Stephen Smalley,
	Boris Ostrovsky, Christian Borntraeger, Jan Beulich,
	linux-kernel, Jonathan Corbet, linux-doc, kernel-hardening

From: Thomas Garnier <thgarnie@google.com>

Add vmemmap to the list of randomized memory regions.

The vmemmap region holds a representation of the physical memory (through
a struct page array). An attacker could use this region to disclose the
kernel memory layout (by walking the page linked list).

Signed-off-by: Thomas Garnier <thgarnie@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 arch/x86/include/asm/kaslr.h            |  1 +
 arch/x86/include/asm/pgtable_64_types.h |  4 +++-
 arch/x86/mm/kaslr.c                     | 24 +++++++++++++++++++++++-
 3 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kaslr.h b/arch/x86/include/asm/kaslr.h
index 2674ee3de748..1052a797d71d 100644
--- a/arch/x86/include/asm/kaslr.h
+++ b/arch/x86/include/asm/kaslr.h
@@ -6,6 +6,7 @@ unsigned long kaslr_get_random_long(const char *purpose);
 #ifdef CONFIG_RANDOMIZE_MEMORY
 extern unsigned long page_offset_base;
 extern unsigned long vmalloc_base;
+extern unsigned long vmemmap_base;
 
 void kernel_randomize_memory(void);
 #else
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 6fdef9eef2d5..3a264200c62f 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -57,11 +57,13 @@ typedef struct { pteval_t pte; } pte_t;
 #define MAXMEM		_AC(__AC(1, UL) << MAX_PHYSMEM_BITS, UL)
 #define VMALLOC_SIZE_TB	_AC(32, UL)
 #define __VMALLOC_BASE	_AC(0xffffc90000000000, UL)
-#define VMEMMAP_START	_AC(0xffffea0000000000, UL)
+#define __VMEMMAP_BASE	_AC(0xffffea0000000000, UL)
 #ifdef CONFIG_RANDOMIZE_MEMORY
 #define VMALLOC_START	vmalloc_base
+#define VMEMMAP_START	vmemmap_base
 #else
 #define VMALLOC_START	__VMALLOC_BASE
+#define VMEMMAP_START	__VMEMMAP_BASE
 #endif /* CONFIG_RANDOMIZE_MEMORY */
 #define VMALLOC_END	(VMALLOC_START + _AC((VMALLOC_SIZE_TB << 40) - 1, UL))
 #define MODULES_VADDR    (__START_KERNEL_map + KERNEL_IMAGE_SIZE)
diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index c939cfe1b516..4f91dc273062 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -44,13 +44,22 @@
  * ensure that this order is correct and won't be changed.
  */
 static const unsigned long vaddr_start = __PAGE_OFFSET_BASE;
-static const unsigned long vaddr_end = VMEMMAP_START;
+
+#if defined(CONFIG_X86_ESPFIX64)
+static const unsigned long vaddr_end = ESPFIX_BASE_ADDR;
+#elif defined(CONFIG_EFI)
+static const unsigned long vaddr_end = EFI_VA_START;
+#else
+static const unsigned long vaddr_end = __START_KERNEL_map;
+#endif
 
 /* Default values */
 unsigned long page_offset_base = __PAGE_OFFSET_BASE;
 EXPORT_SYMBOL(page_offset_base);
 unsigned long vmalloc_base = __VMALLOC_BASE;
 EXPORT_SYMBOL(vmalloc_base);
+unsigned long vmemmap_base = __VMEMMAP_BASE;
+EXPORT_SYMBOL(vmemmap_base);
 
 /*
  * Memory regions randomized by KASLR (except modules that use a separate logic
@@ -63,6 +72,7 @@ static __initdata struct kaslr_memory_region {
 } kaslr_regions[] = {
 	{ &page_offset_base, 64/* Maximum */ },
 	{ &vmalloc_base, VMALLOC_SIZE_TB },
+	{ &vmemmap_base, 1 },
 };
 
 /* Get size in bytes used by the memory region */
@@ -89,6 +99,18 @@ void __init kernel_randomize_memory(void)
 	struct rnd_state rand_state;
 	unsigned long remain_entropy;
 
+	/*
+	 * All these BUILD_BUG_ON checks ensure the memory layout is
+	 * consistent with the vaddr_start/vaddr_end variables.
+	 */
+	BUILD_BUG_ON(vaddr_start >= vaddr_end);
+	BUILD_BUG_ON(config_enabled(CONFIG_X86_ESPFIX64) &&
+		     vaddr_end >= EFI_VA_START);
+	BUILD_BUG_ON((config_enabled(CONFIG_X86_ESPFIX64) ||
+		      config_enabled(CONFIG_EFI)) &&
+		     vaddr_end >= __START_KERNEL_map);
+	BUILD_BUG_ON(vaddr_end > __START_KERNEL_map);
+
 	if (!kaslr_memory_enabled())
 		return;
 
-- 
2.7.4
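
A userspace analogue of the BUILD_BUG_ON layout checks above, using C11
_Static_assert: compile-time assertions that the randomization window is
ordered and ends below the kernel text mapping. The ESPFIX base used here
is an assumed value for illustration:

#include <stdio.h>

#define PAGE_OFFSET_BASE	0xffff880000000000UL
#define ESPFIX_BASE_ADDR	0xffffff0000000000UL	/* assumed */
#define START_KERNEL_MAP	0xffffffff80000000UL

#define VADDR_START	PAGE_OFFSET_BASE
#define VADDR_END	ESPFIX_BASE_ADDR	/* as if CONFIG_X86_ESPFIX64=y */

_Static_assert(VADDR_START < VADDR_END, "randomization range is ordered");
_Static_assert(VADDR_END <= START_KERNEL_MAP, "range ends below kernel text");

int main(void)
{
	printf("randomization window: %#lx - %#lx\n", VADDR_START, VADDR_END);
	return 0;
}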

* [kernel-hardening] [PATCH v7 8/9] x86/mm: Enable KASLR for vmemmap memory region (x86_64)
@ 2016-06-22  0:47   ` Kees Cook
  0 siblings, 0 replies; 74+ messages in thread
From: Kees Cook @ 2016-06-22  0:47 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Kees Cook, Thomas Garnier, Andy Lutomirski, x86, Borislav Petkov,
	Baoquan He, Yinghai Lu, Juergen Gross, Matt Fleming, Toshi Kani,
	Andrew Morton, Dan Williams, Kirill A. Shutemov, Dave Hansen,
	Xiao Guangrong, Martin Schwidefsky, Aneesh Kumar K.V,
	Alexander Kuleshov, Alexander Popov, Dave Young, Joerg Roedel,
	Lv Zheng, Mark Salter, Dmitry Vyukov, Stephen Smalley,
	Boris Ostrovsky, Christian Borntraeger, Jan Beulich,
	linux-kernel, Jonathan Corbet, linux-doc, kernel-hardening

From: Thomas Garnier <thgarnie@google.com>

Add vmemmap to the list of randomized memory regions.

The vmemmap region holds a representation of the physical memory (through
a struct page array). An attacker could use this region to disclose the
kernel memory layout (by walking the page linked list).

Signed-off-by: Thomas Garnier <thgarnie@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 arch/x86/include/asm/kaslr.h            |  1 +
 arch/x86/include/asm/pgtable_64_types.h |  4 +++-
 arch/x86/mm/kaslr.c                     | 24 +++++++++++++++++++++++-
 3 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kaslr.h b/arch/x86/include/asm/kaslr.h
index 2674ee3de748..1052a797d71d 100644
--- a/arch/x86/include/asm/kaslr.h
+++ b/arch/x86/include/asm/kaslr.h
@@ -6,6 +6,7 @@ unsigned long kaslr_get_random_long(const char *purpose);
 #ifdef CONFIG_RANDOMIZE_MEMORY
 extern unsigned long page_offset_base;
 extern unsigned long vmalloc_base;
+extern unsigned long vmemmap_base;
 
 void kernel_randomize_memory(void);
 #else
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 6fdef9eef2d5..3a264200c62f 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -57,11 +57,13 @@ typedef struct { pteval_t pte; } pte_t;
 #define MAXMEM		_AC(__AC(1, UL) << MAX_PHYSMEM_BITS, UL)
 #define VMALLOC_SIZE_TB	_AC(32, UL)
 #define __VMALLOC_BASE	_AC(0xffffc90000000000, UL)
-#define VMEMMAP_START	_AC(0xffffea0000000000, UL)
+#define __VMEMMAP_BASE	_AC(0xffffea0000000000, UL)
 #ifdef CONFIG_RANDOMIZE_MEMORY
 #define VMALLOC_START	vmalloc_base
+#define VMEMMAP_START	vmemmap_base
 #else
 #define VMALLOC_START	__VMALLOC_BASE
+#define VMEMMAP_START	__VMEMMAP_BASE
 #endif /* CONFIG_RANDOMIZE_MEMORY */
 #define VMALLOC_END	(VMALLOC_START + _AC((VMALLOC_SIZE_TB << 40) - 1, UL))
 #define MODULES_VADDR    (__START_KERNEL_map + KERNEL_IMAGE_SIZE)
diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index c939cfe1b516..4f91dc273062 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -44,13 +44,22 @@
  * ensure that this order is correct and won't be changed.
  */
 static const unsigned long vaddr_start = __PAGE_OFFSET_BASE;
-static const unsigned long vaddr_end = VMEMMAP_START;
+
+#if defined(CONFIG_X86_ESPFIX64)
+static const unsigned long vaddr_end = ESPFIX_BASE_ADDR;
+#elif defined(CONFIG_EFI)
+static const unsigned long vaddr_end = EFI_VA_START;
+#else
+static const unsigned long vaddr_end = __START_KERNEL_map;
+#endif
 
 /* Default values */
 unsigned long page_offset_base = __PAGE_OFFSET_BASE;
 EXPORT_SYMBOL(page_offset_base);
 unsigned long vmalloc_base = __VMALLOC_BASE;
 EXPORT_SYMBOL(vmalloc_base);
+unsigned long vmemmap_base = __VMEMMAP_BASE;
+EXPORT_SYMBOL(vmemmap_base);
 
 /*
  * Memory regions randomized by KASLR (except modules that use a separate logic
@@ -63,6 +72,7 @@ static __initdata struct kaslr_memory_region {
 } kaslr_regions[] = {
 	{ &page_offset_base, 64/* Maximum */ },
 	{ &vmalloc_base, VMALLOC_SIZE_TB },
+	{ &vmemmap_base, 1 },
 };
 
 /* Get size in bytes used by the memory region */
@@ -89,6 +99,18 @@ void __init kernel_randomize_memory(void)
 	struct rnd_state rand_state;
 	unsigned long remain_entropy;
 
+	/*
+	 * All these BUILD_BUG_ON checks ensures the memory layout is
+	 * consistent with the vaddr_start/vaddr_end variables.
+	 */
+	BUILD_BUG_ON(vaddr_start >= vaddr_end);
+	BUILD_BUG_ON(config_enabled(CONFIG_X86_ESPFIX64) &&
+		     vaddr_end >= EFI_VA_START);
+	BUILD_BUG_ON((config_enabled(CONFIG_X86_ESPFIX64) ||
+		      config_enabled(CONFIG_EFI)) &&
+		     vaddr_end >= __START_KERNEL_map);
+	BUILD_BUG_ON(vaddr_end > __START_KERNEL_map);
+
 	if (!kaslr_memory_enabled())
 		return;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v7 9/9] x86/mm: Memory hotplug support for KASLR memory randomization (x86_64)
  2016-06-22  0:46 ` [kernel-hardening] " Kees Cook
@ 2016-06-22  0:47   ` Kees Cook
  -1 siblings, 0 replies; 74+ messages in thread
From: Kees Cook @ 2016-06-22  0:47 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Kees Cook, Thomas Garnier, Andy Lutomirski, x86, Borislav Petkov,
	Baoquan He, Yinghai Lu, Juergen Gross, Matt Fleming, Toshi Kani,
	Andrew Morton, Dan Williams, Kirill A. Shutemov, Dave Hansen,
	Xiao Guangrong, Martin Schwidefsky, Aneesh Kumar K.V,
	Alexander Kuleshov, Alexander Popov, Dave Young, Joerg Roedel,
	Lv Zheng, Mark Salter, Dmitry Vyukov, Stephen Smalley,
	Boris Ostrovsky, Christian Borntraeger, Jan Beulich,
	linux-kernel, Jonathan Corbet, linux-doc, kernel-hardening

From: Thomas Garnier <thgarnie@google.com>

Add a new option (CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING) to define
the padding used for the physical memory mapping section when KASLR
memory is enabled. It ensures there is enough virtual address space when
CONFIG_MEMORY_HOTPLUG is used. The default value is 10 terabytes. If
CONFIG_MEMORY_HOTPLUG is not used, no space is reserved, increasing the
entropy available.
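
For concreteness, a standalone sketch of the size computation from the
kaslr.c hunk below, with made-up numbers (a 1.5 TB machine and the
default 0xa padding; the values are illustrative, not from the patch):

	#include <stdio.h>

	#define TB_SHIFT 40

	int main(void)
	{
		/* made up: max_pfn << PAGE_SHIFT == 1.5 TB of RAM, and
		 * CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING == 0xa (10 TB) */
		unsigned long long phys_bytes = 3ULL << (TB_SHIFT - 1);
		unsigned long long memory_tb = (phys_bytes >> TB_SHIFT) + 0xa;

		/* prints 11; the shift truncates, so the 0.5 TB above the
		 * last whole terabyte does not count toward the size */
		printf("memory_tb = %llu\n", memory_tb);
		return 0;
	}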

Signed-off-by: Thomas Garnier <thgarnie@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 arch/x86/Kconfig    | 15 +++++++++++++++
 arch/x86/mm/kaslr.c |  7 ++++++-
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index adab3fef3bb4..214b3fadbc11 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2010,6 +2010,21 @@ config RANDOMIZE_MEMORY
 
 	   If unsure, say N.
 
+config RANDOMIZE_MEMORY_PHYSICAL_PADDING
+	hex "Physical memory mapping padding" if EXPERT
+	depends on RANDOMIZE_MEMORY
+	default "0xa" if MEMORY_HOTPLUG
+	default "0x0"
+	range 0x1 0x40 if MEMORY_HOTPLUG
+	range 0x0 0x40
+	---help---
+	   Define the padding in terabytes added to the existing physical
+	   memory size during kernel memory randomization. It is useful
+	   for memory hotplug support but reduces the entropy available for
+	   address randomization.
+
+	   If unsure, leave at the default value.
+
 config HOTPLUG_CPU
 	bool "Support for hot-pluggable CPUs"
 	depends on SMP
diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index 4f91dc273062..3e9875f7fdda 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -114,8 +114,13 @@ void __init kernel_randomize_memory(void)
 	if (!kaslr_memory_enabled())
 		return;
 
+	/*
+	 * Size the physical memory mapping region to the available memory
+	 * and add padding if needed (especially for memory hotplug support).
+	 */
 	BUG_ON(kaslr_regions[0].base != &page_offset_base);
-	memory_tb = ((max_pfn << PAGE_SHIFT) >> TB_SHIFT);
+	memory_tb = ((max_pfn << PAGE_SHIFT) >> TB_SHIFT) +
+		CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
 
 	/* Adapt physical memory region size based on available memory */
 	if (memory_tb < kaslr_regions[0].size_tb)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [kernel-hardening] [PATCH v7 0/9] x86/mm: memory area address KASLR
  2016-06-22  0:46 ` [kernel-hardening] " Kees Cook
                   ` (9 preceding siblings ...)
  (?)
@ 2016-06-22 12:47 ` Jason Cooper
  2016-06-22 15:59   ` Thomas Garnier
  -1 siblings, 1 reply; 74+ messages in thread
From: Jason Cooper @ 2016-06-22 12:47 UTC (permalink / raw)
  To: kernel-hardening
  Cc: Ingo Molnar, Kees Cook, Thomas Garnier, Andy Lutomirski, x86,
	Borislav Petkov, Baoquan He, Yinghai Lu, Juergen Gross,
	Matt Fleming, Toshi Kani, Andrew Morton, Dan Williams,
	Kirill A. Shutemov, Dave Hansen, Xiao Guangrong,
	Martin Schwidefsky, Aneesh Kumar K.V, Alexander Kuleshov,
	Alexander Popov, Dave Young, Joerg Roedel, Lv Zheng, Mark Salter,
	Dmitry Vyukov, Stephen Smalley, Boris Ostrovsky,
	Christian Borntraeger, Jan Beulich, linux-kernel,
	Jonathan Corbet, linux-doc

Hey Kees,

On Tue, Jun 21, 2016 at 05:46:57PM -0700, Kees Cook wrote:
> Notable problems that needed solving:
...
>  - Reasonable entropy is needed early at boot before get_random_bytes()
>    is available.

This series is targeting x86, which typically has RDRAND/RDSEED
instructions.  Are you referring to other arches?  Older x86?  Also,
isn't this the same requirement for base address KASLR?

Don't get me wrong, I want more diverse entropy sources available
earlier in the boot process as well. :-)  I'm just wondering what's
different about this series vs base address KASLR wrt early entropy
sources.

thx,

Jason.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [kernel-hardening] [PATCH v7 0/9] x86/mm: memory area address KASLR
  2016-06-22 12:47 ` [kernel-hardening] [PATCH v7 0/9] x86/mm: memory area address KASLR Jason Cooper
@ 2016-06-22 15:59   ` Thomas Garnier
  2016-06-22 17:05       ` Kees Cook
  0 siblings, 1 reply; 74+ messages in thread
From: Thomas Garnier @ 2016-06-22 15:59 UTC (permalink / raw)
  To: Jason Cooper
  Cc: kernel-hardening, Ingo Molnar, Kees Cook, Andy Lutomirski, x86,
	Borislav Petkov, Baoquan He, Yinghai Lu, Juergen Gross,
	Matt Fleming, Toshi Kani, Andrew Morton, Dan Williams,
	Kirill A. Shutemov, Dave Hansen, Xiao Guangrong,
	Martin Schwidefsky, Aneesh Kumar K.V, Alexander Kuleshov,
	Alexander Popov, Dave Young, Joerg Roedel, Lv Zheng, Mark Salter,
	Dmitry Vyukov, Stephen Smalley, Boris Ostrovsky,
	Christian Borntraeger, Jan Beulich, LKML, Jonathan Corbet,
	linux-doc

On Wed, Jun 22, 2016 at 5:47 AM, Jason Cooper <jason@lakedaemon.net> wrote:
> Hey Kees,
>
> On Tue, Jun 21, 2016 at 05:46:57PM -0700, Kees Cook wrote:
>> Notable problems that needed solving:
> ...
>>  - Reasonable entropy is needed early at boot before get_random_bytes()
>>    is available.
>
> This series is targetting x86, which typically has RDRAND/RDSEED
> instructions.  Are you referring to other arches?  Older x86?  Also,
> isn't this the same requirement for base address KASLR?
>
> Don't get me wrong, I want more diverse entropy sources available
> earlier in the boot process as well. :-)  I'm just wondering what's
> different about this series vs base address KASLR wrt early entropy
> sources.
>

I think Kees was referring to the refactor I did to get entropy
generation similar to the KASLR module randomization. Our approach was
to provide the best entropy possible even on an older processor, or
under virtualization without support for these instructions. That is
unfortunately common at companies with a large number of older
machines.

> thx,
>
> Jason.

Thanks,
Thomas

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [kernel-hardening] [PATCH v7 0/9] x86/mm: memory area address KASLR
  2016-06-22 15:59   ` Thomas Garnier
@ 2016-06-22 17:05       ` Kees Cook
  0 siblings, 0 replies; 74+ messages in thread
From: Kees Cook @ 2016-06-22 17:05 UTC (permalink / raw)
  To: Thomas Garnier
  Cc: Jason Cooper, kernel-hardening, Ingo Molnar, Andy Lutomirski,
	x86, Borislav Petkov, Baoquan He, Yinghai Lu, Juergen Gross,
	Matt Fleming, Toshi Kani, Andrew Morton, Dan Williams,
	Kirill A. Shutemov, Dave Hansen, Xiao Guangrong,
	Martin Schwidefsky, Aneesh Kumar K.V, Alexander Kuleshov,
	Alexander Popov, Dave Young, Joerg Roedel, Lv Zheng, Mark Salter,
	Dmitry Vyukov, Stephen Smalley, Boris Ostrovsky,
	Christian Borntraeger, Jan Beulich, LKML, Jonathan Corbet,
	linux-doc

On Wed, Jun 22, 2016 at 8:59 AM, Thomas Garnier <thgarnie@google.com> wrote:
> On Wed, Jun 22, 2016 at 5:47 AM, Jason Cooper <jason@lakedaemon.net> wrote:
>> Hey Kees,
>>
>> On Tue, Jun 21, 2016 at 05:46:57PM -0700, Kees Cook wrote:
>>> Notable problems that needed solving:
>> ...
>>>  - Reasonable entropy is needed early at boot before get_random_bytes()
>>>    is available.
>>
>> This series is targetting x86, which typically has RDRAND/RDSEED
>> instructions.  Are you referring to other arches?  Older x86?  Also,
>> isn't this the same requirement for base address KASLR?
>>
>> Don't get me wrong, I want more diverse entropy sources available
>> earlier in the boot process as well. :-)  I'm just wondering what's
>> different about this series vs base address KASLR wrt early entropy
>> sources.
>>
>
> I think Kees was referring to the refactor I did to get the similar
> entropy generation than KASLR module randomization. Our approach was
> to provide best entropy possible even if you have an older processor
> or under virtualization without support for these instructions.
> Unfortunately common on companies with a large number of older
> machines.

Right, the memory offset KASLR uses the same routines as the kernel
base KASLR. The issue is with older x86 systems, which continue to be
very common.

-Kees

-- 
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [kernel-hardening] [PATCH v7 0/9] x86/mm: memory area address KASLR
  2016-06-22 17:05       ` Kees Cook
@ 2016-06-23 19:33         ` Jason Cooper
  -1 siblings, 0 replies; 74+ messages in thread
From: Jason Cooper @ 2016-06-23 19:33 UTC (permalink / raw)
  To: Kees Cook
  Cc: Thomas Garnier, kernel-hardening, Ingo Molnar, Andy Lutomirski,
	x86, Borislav Petkov, Baoquan He, Yinghai Lu, Juergen Gross,
	Matt Fleming, Toshi Kani, Andrew Morton, Dan Williams,
	Kirill A. Shutemov, Dave Hansen, Xiao Guangrong,
	Martin Schwidefsky, Aneesh Kumar K.V, Alexander Kuleshov,
	Alexander Popov, Dave Young, Joerg Roedel, Lv Zheng, Mark Salter,
	Dmitry Vyukov, Stephen Smalley, Boris Ostrovsky,
	Christian Borntraeger, Jan Beulich, LKML, Jonathan Corbet,
	linux-doc

Hey Kees, Thomas,

On Wed, Jun 22, 2016 at 10:05:51AM -0700, Kees Cook wrote:
> On Wed, Jun 22, 2016 at 8:59 AM, Thomas Garnier <thgarnie@google.com> wrote:
> > On Wed, Jun 22, 2016 at 5:47 AM, Jason Cooper <jason@lakedaemon.net> wrote:
> >> Hey Kees,
> >>
> >> On Tue, Jun 21, 2016 at 05:46:57PM -0700, Kees Cook wrote:
> >>> Notable problems that needed solving:
> >> ...
> >>>  - Reasonable entropy is needed early at boot before get_random_bytes()
> >>>    is available.
> >>
> >> This series is targetting x86, which typically has RDRAND/RDSEED
> >> instructions.  Are you referring to other arches?  Older x86?  Also,
> >> isn't this the same requirement for base address KASLR?
> >>
> >> Don't get me wrong, I want more diverse entropy sources available
> >> earlier in the boot process as well. :-)  I'm just wondering what's
> >> different about this series vs base address KASLR wrt early entropy
> >> sources.
> >>
> >
> > I think Kees was referring to the refactor I did to get the similar
> > entropy generation than KASLR module randomization. Our approach was
> > to provide best entropy possible even if you have an older processor
> > or under virtualization without support for these instructions.
> > Unfortunately common on companies with a large number of older
> > machines.
> 
> Right, the memory offset KASLR uses the same routines as the kernel
> base KASLR. The issue is with older x86 systems, which continue to be
> very common.

We have the same issue in embedded. :-(  Compounded by the fact that
there is no rand instruction (at least not on ARM).  So, even if there's
a HW-RNG, you can't access it until the driver is loaded.

This is made worse by the fact that most systems deployed today have
bootloaders a) without hw-rng drivers, b) without dtb editing, and c)
without dtb support at all.

My current thinking is to add a devicetree property
"userspace,random-seed" <address, len>.  This way, existing, deployed
boards can append a dtb to a modern kernel with the property set.
The factory bootloader then only needs to amend its boot scripts to read
random-seed from the fs to the given address.
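
Roughly, the consuming side could look like this (a sketch only: the
property is just a proposal, and I'm assuming it would live under
/chosen; none of this is an existing binding):

	#include <libfdt.h>
	#include <stdint.h>

	/* Returns 0 and fills *addr/*len on success, -1 otherwise. */
	static int get_userspace_seed(const void *fdt,
				      uint64_t *addr, uint64_t *len)
	{
		const fdt64_t *prop;
		int node, plen;

		node = fdt_path_offset(fdt, "/chosen");
		if (node < 0)
			return -1;
		prop = fdt_getprop(fdt, node, "userspace,random-seed", &plen);
		if (!prop || plen != 2 * (int)sizeof(fdt64_t))
			return -1;
		*addr = fdt64_to_cpu(prop[0]);	/* where the loader put it */
		*len = fdt64_to_cpu(prop[1]);	/* how many bytes it read */
		return 0;
	}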

Modern systems that receive a seed from the bootloader via the
random-seed property (typically from the hw-rng) can mix both sources
for increased resilience.
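
By "mix" I mean feeding both sources into the pool rather than picking
one (a sketch; the kernel function is declared here only so the
fragment stands alone):

	/* drivers/char/random.c API */
	void add_device_randomness(const void *buf, unsigned int size);

	static void mix_boot_entropy(const void *dt_seed, unsigned int dt_len,
				     const void *hwrng, unsigned int hw_len)
	{
		/* both sources are hashed into the pool, so a weak or
		 * attacker-known source cannot cancel out a good one */
		add_device_randomness(dt_seed, dt_len);
		add_device_randomness(hwrng, hw_len);
	}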

Unfortunately, I'm not very familiar with the internals of x86
bootstrapping.  Could GRUB be scripted to do a similar task?  How would
the address and size of the seed be passed to the kernel?  command line?

thx,

Jason.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [kernel-hardening] [PATCH v7 0/9] x86/mm: memory area address KASLR
  2016-06-23 19:33         ` Jason Cooper
  (?)
@ 2016-06-23 19:45         ` Sandy Harris
  2016-06-23 19:59             ` Kees Cook
  2016-06-23 20:16           ` Jason Cooper
  -1 siblings, 2 replies; 74+ messages in thread
From: Sandy Harris @ 2016-06-23 19:45 UTC (permalink / raw)
  To: kernel-hardening
  Cc: Kees Cook, Thomas Garnier, Ingo Molnar, Andy Lutomirski, x86,
	Borislav Petkov, Baoquan He, Yinghai Lu, Juergen Gross,
	Matt Fleming, Toshi Kani, Andrew Morton, Dan Williams,
	Kirill A. Shutemov, Dave Hansen, Xiao Guangrong,
	Martin Schwidefsky, Aneesh Kumar K.V, Alexander Kuleshov,
	Alexander Popov, Dave Young, Joerg Roedel, Lv Zheng, Mark Salter,
	Dmitry Vyukov, Stephen Smalley, Boris Ostrovsky,
	Christian Borntraeger, Jan Beulich, LKML, Jonathan Corbet,
	linux-doc

Jason Cooper <jason@lakedaemon.net> wrote:

> Modern systems that receive a seed from the bootloader via the
> random-seed property (typically from the hw-rng) can mix both sources
> for increased resilience.
>
> Unfortunately, I'm not very familiar with the internals of x86
> bootstrapping.  Could GRUB be scripted to do a similar task?  How would
> the address and size of the seed be passed to the kernel?  command line?

One suggestion is at:
http://www.av8n.com/computer/htm/secure-random.htm#sec-boot-image

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [kernel-hardening] [PATCH v7 0/9] x86/mm: memory area address KASLR
  2016-06-23 19:33         ` Jason Cooper
@ 2016-06-23 19:58           ` Kees Cook
  -1 siblings, 0 replies; 74+ messages in thread
From: Kees Cook @ 2016-06-23 19:58 UTC (permalink / raw)
  To: Jason Cooper, Ard Biesheuvel
  Cc: Thomas Garnier, kernel-hardening, Ingo Molnar, Andy Lutomirski,
	x86, Borislav Petkov, Baoquan He, Yinghai Lu, Juergen Gross,
	Matt Fleming, Toshi Kani, Andrew Morton, Dan Williams,
	Kirill A. Shutemov, Dave Hansen, Xiao Guangrong,
	Martin Schwidefsky, Aneesh Kumar K.V, Alexander Kuleshov,
	Alexander Popov, Dave Young, Joerg Roedel, Lv Zheng, Mark Salter,
	Dmitry Vyukov, Stephen Smalley, Boris Ostrovsky,
	Christian Borntraeger, Jan Beulich, LKML, Jonathan Corbet,
	linux-doc

On Thu, Jun 23, 2016 at 12:33 PM, Jason Cooper <jason@lakedaemon.net> wrote:
> Hey Kees, Thomas,
>
> On Wed, Jun 22, 2016 at 10:05:51AM -0700, Kees Cook wrote:
>> On Wed, Jun 22, 2016 at 8:59 AM, Thomas Garnier <thgarnie@google.com> wrote:
>> > On Wed, Jun 22, 2016 at 5:47 AM, Jason Cooper <jason@lakedaemon.net> wrote:
>> >> Hey Kees,
>> >>
>> >> On Tue, Jun 21, 2016 at 05:46:57PM -0700, Kees Cook wrote:
>> >>> Notable problems that needed solving:
>> >> ...
>> >>>  - Reasonable entropy is needed early at boot before get_random_bytes()
>> >>>    is available.
>> >>
>> >> This series is targetting x86, which typically has RDRAND/RDSEED
>> >> instructions.  Are you referring to other arches?  Older x86?  Also,
>> >> isn't this the same requirement for base address KASLR?
>> >>
>> >> Don't get me wrong, I want more diverse entropy sources available
>> >> earlier in the boot process as well. :-)  I'm just wondering what's
>> >> different about this series vs base address KASLR wrt early entropy
>> >> sources.
>> >>
>> >
>> > I think Kees was referring to the refactor I did to get the similar
>> > entropy generation than KASLR module randomization. Our approach was
>> > to provide best entropy possible even if you have an older processor
>> > or under virtualization without support for these instructions.
>> > Unfortunately common on companies with a large number of older
>> > machines.
>>
>> Right, the memory offset KASLR uses the same routines as the kernel
>> base KASLR. The issue is with older x86 systems, which continue to be
>> very common.
>
> We have the same issue in embedded. :-(  Compounded by the fact that
> there is no rand instruction (at least not on ARM).  So, even if there's
> a HW-RNG, you can't access it until the driver is loaded.
>
> This is compounded by the fact that most systems deployed today have
> bootloaders a) without hw-rng drivers, b) without dtb editing, and c)
> without dtb support at all.
>
> My current thinking is to add a devicetree property
> "userspace,random-seed" <address, len>.  This way, existing, deployed
> boards can append a dtb to a modern kernel with the property set.
> The factory bootloader then only needs to amend its boot scripts to read
> random-seed from the fs to the given address.

The arm64 KASLR implementation has defined a way for boot loaders to
pass in a seed similar to this. It might be nice to have a fall-back
to a DT entry, though; then the bootloaders don't need to be changed.

Ard might have some thoughts on why DT wasn't used for KASLR (I assume
the early parsing overhead, but I don't remember the discussion any
more).

> Modern systems that receive a seed from the bootloader via the
> random-seed property (typically from the hw-rng) can mix both sources
> for increased resilience.

Yeah, that could work.

> Unfortunately, I'm not very familiar with the internals of x86
> bootstrapping.  Could GRUB be scripted to do a similar task?  How would
> the address and size of the seed be passed to the kernel?  command line?

Command line could work (though it would need scrubbing to avoid it
leaking into /proc/cmdline), but there's also the "zero-page" used by
bootloaders to pass details to the kernel (see
Documentation/x86/boot.txt). Right now, x86 has sufficient entropy
(though rdrand is best).
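
As a rough sketch of the zero-page route (the struct layout matches
bootparam.h; the type value here is made up purely for illustration):
a loader would chain a node onto the setup_data list that
boot_params.hdr.setup_data points at, and the kernel would walk it:

	#include <stdint.h>

	#define SETUP_RANDOM_SEED 0x1000	/* hypothetical type id */

	struct setup_data {
		uint64_t next;	/* phys addr of next node, 0 ends the list */
		uint32_t type;	/* SETUP_RANDOM_SEED in this sketch */
		uint32_t len;	/* number of seed bytes in data[] */
		uint8_t data[];	/* the seed itself */
	};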

-Kees

-- 
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [kernel-hardening] [PATCH v7 0/9] x86/mm: memory area address KASLR
  2016-06-23 19:45         ` Sandy Harris
@ 2016-06-23 19:59             ` Kees Cook
  2016-06-23 20:16           ` Jason Cooper
  1 sibling, 0 replies; 74+ messages in thread
From: Kees Cook @ 2016-06-23 19:59 UTC (permalink / raw)
  To: Sandy Harris
  Cc: kernel-hardening, Thomas Garnier, Ingo Molnar, Andy Lutomirski,
	x86, Borislav Petkov, Baoquan He, Yinghai Lu, Juergen Gross,
	Matt Fleming, Toshi Kani, Andrew Morton, Dan Williams,
	Kirill A. Shutemov, Dave Hansen, Xiao Guangrong,
	Martin Schwidefsky, Aneesh Kumar K.V, Alexander Kuleshov,
	Alexander Popov, Dave Young, Joerg Roedel, Lv Zheng, Mark Salter,
	Dmitry Vyukov, Stephen Smalley, Boris Ostrovsky,
	Christian Borntraeger, Jan Beulich, LKML, Jonathan Corbet,
	linux-doc

On Thu, Jun 23, 2016 at 12:45 PM, Sandy Harris <sandyinchina@gmail.com> wrote:
> Jason Cooper <jason@lakedaemon.net> wrote:
>
>> Modern systems that receive a seed from the bootloader via the
>> random-seed property (typically from the hw-rng) can mix both sources
>> for increased resilience.
>>
>> Unfortunately, I'm not very familiar with the internals of x86
>> bootstrapping.  Could GRUB be scripted to do a similar task?  How would
>> the address and size of the seed be passed to the kernel?  command line?
>
> One suggestion is at:
> http://www.av8n.com/computer/htm/secure-random.htm#sec-boot-image

Interesting! This might pose a problem for signed images, though.
(Actually, for signed arm kernels, is the DT signed too? If so, it
would be a similar problem.)

-Kees

-- 
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [kernel-hardening] [PATCH v7 0/9] x86/mm: memory area address KASLR
  2016-06-23 19:58           ` Kees Cook
@ 2016-06-23 20:05             ` Ard Biesheuvel
  -1 siblings, 0 replies; 74+ messages in thread
From: Ard Biesheuvel @ 2016-06-23 20:05 UTC (permalink / raw)
  To: Kees Cook
  Cc: Jason Cooper, Thomas Garnier, kernel-hardening, Ingo Molnar,
	Andy Lutomirski, x86, Borislav Petkov, Baoquan He, Yinghai Lu,
	Juergen Gross, Matt Fleming, Toshi Kani, Andrew Morton,
	Dan Williams, Kirill A. Shutemov, Dave Hansen, Xiao Guangrong,
	Martin Schwidefsky, Aneesh Kumar K.V, Alexander Kuleshov,
	Alexander Popov, Dave Young, Joerg Roedel, Lv Zheng, Mark Salter,
	Dmitry Vyukov, Stephen Smalley, Boris Ostrovsky,
	Christian Borntraeger, Jan Beulich, LKML, Jonathan Corbet,
	linux-doc

On 23 June 2016 at 21:58, Kees Cook <keescook@chromium.org> wrote:
> On Thu, Jun 23, 2016 at 12:33 PM, Jason Cooper <jason@lakedaemon.net> wrote:
>> Hey Kees, Thomas,
>>
>> On Wed, Jun 22, 2016 at 10:05:51AM -0700, Kees Cook wrote:
>>> On Wed, Jun 22, 2016 at 8:59 AM, Thomas Garnier <thgarnie@google.com> wrote:
>>> > On Wed, Jun 22, 2016 at 5:47 AM, Jason Cooper <jason@lakedaemon.net> wrote:
>>> >> Hey Kees,
>>> >>
>>> >> On Tue, Jun 21, 2016 at 05:46:57PM -0700, Kees Cook wrote:
>>> >>> Notable problems that needed solving:
>>> >> ...
>>> >>>  - Reasonable entropy is needed early at boot before get_random_bytes()
>>> >>>    is available.
>>> >>
>>> >> This series is targetting x86, which typically has RDRAND/RDSEED
>>> >> instructions.  Are you referring to other arches?  Older x86?  Also,
>>> >> isn't this the same requirement for base address KASLR?
>>> >>
>>> >> Don't get me wrong, I want more diverse entropy sources available
>>> >> earlier in the boot process as well. :-)  I'm just wondering what's
>>> >> different about this series vs base address KASLR wrt early entropy
>>> >> sources.
>>> >>
>>> >
>>> > I think Kees was referring to the refactor I did to get the similar
>>> > entropy generation than KASLR module randomization. Our approach was
>>> > to provide best entropy possible even if you have an older processor
>>> > or under virtualization without support for these instructions.
>>> > Unfortunately common on companies with a large number of older
>>> > machines.
>>>
>>> Right, the memory offset KASLR uses the same routines as the kernel
>>> base KASLR. The issue is with older x86 systems, which continue to be
>>> very common.
>>
>> We have the same issue in embedded. :-(  Compounded by the fact that
>> there is no rand instruction (at least not on ARM).  So, even if there's
>> a HW-RNG, you can't access it until the driver is loaded.
>>
>> This is compounded by the fact that most systems deployed today have
>> bootloaders a) without hw-rng drivers, b) without dtb editing, and c)
>> without dtb support at all.
>>
>> My current thinking is to add a devicetree property
>> "userspace,random-seed" <address, len>.  This way, existing, deployed
>> boards can append a dtb to a modern kernel with the property set.
>> The factory bootloader then only needs to amend its boot scripts to read
>> random-seed from the fs to the given address.
>
> The arm64 KASLR implementation has defined a way for boot loaders to
> pass in an seed similar to this. It might be nice to have a fall-back
> to a DT entry, though, then the bootloaders don't need to changed.
>
> Ard might have some thoughts on why DT wasn't used for KASLR (I assume
> the early parsing overhead, but I don't remember the discussion any
> more).
>

On arm64, only DT is used for KASLR (even when booting via ACPI). My
first draft used register x1, but this turned out to be too much of a
hassle, since parsing the DT is also necessary to discover whether
there is a 'nokaslr' argument on the kernel command line. So the
current implementation only supports a single method, which is the
/chosen/kaslr-seed uint64 property.
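
Roughly, the consumer looks like this (reconstructed from memory of
the arm64 tree, so treat the details as approximate):

	#include <libfdt.h>
	#include <stdint.h>

	static uint64_t get_kaslr_seed(void *fdt)
	{
		fdt64_t *prop;
		int node, len;
		uint64_t seed;

		node = fdt_path_offset(fdt, "/chosen");
		if (node < 0)
			return 0;
		prop = fdt_getprop_w(fdt, node, "kaslr-seed", &len);
		if (!prop || len != (int)sizeof(uint64_t))
			return 0;
		seed = fdt64_to_cpu(*prop);
		*prop = 0;	/* wipe the seed so it cannot leak later */
		return seed;
	}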

>> Modern systems that receive a seed from the bootloader via the
>> random-seed property (typically from the hw-rng) can mix both sources
>> for increased resilience.
>
> Yeah, that could work.
>
>> Unfortunately, I'm not very familiar with the internals of x86
>> bootstrapping.  Could GRUB be scripted to do a similar task?  How would
>> the address and size of the seed be passed to the kernel?  command line?
>
> Command line could work (though it would need scrubbing to avoid it
> leaking into /proc/cmdine), but there's also the "zero-page" used by
> bootloaders to pass details to the kernel (see
> Documentation/x86/boot.txt). Right now, x86 has sufficient entropy
> (though rdrand is best).
>
> -Kees
>
> --
> Kees Cook
> Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [kernel-hardening] [PATCH v7 0/9] x86/mm: memory area address KASLR
  2016-06-23 19:45         ` Sandy Harris
  2016-06-23 19:59             ` Kees Cook
@ 2016-06-23 20:16           ` Jason Cooper
  1 sibling, 0 replies; 74+ messages in thread
From: Jason Cooper @ 2016-06-23 20:16 UTC (permalink / raw)
  To: kernel-hardening
  Cc: Kees Cook, Thomas Garnier, Ingo Molnar, Andy Lutomirski, x86,
	Borislav Petkov, Baoquan He, Yinghai Lu, Juergen Gross,
	Matt Fleming, Toshi Kani, Andrew Morton, Dan Williams,
	Kirill A. Shutemov, Dave Hansen, Xiao Guangrong,
	Martin Schwidefsky, Aneesh Kumar K.V, Alexander Kuleshov,
	Alexander Popov, Dave Young, Joerg Roedel, Lv Zheng, Mark Salter,
	Dmitry Vyukov, Stephen Smalley, Boris Ostrovsky,
	Christian Borntraeger, Jan Beulich, LKML, Jonathan Corbet,
	linux-doc

Hey Sandy,

On Thu, Jun 23, 2016 at 03:45:54PM -0400, Sandy Harris wrote:
> Jason Cooper <jason@lakedaemon.net> wrote:
> 
> > Modern systems that receive a seed from the bootloader via the
> > random-seed property (typically from the hw-rng) can mix both sources
> > for increased resilience.
> >
> > Unfortunately, I'm not very familiar with the internals of x86
> > bootstrapping.  Could GRUB be scripted to do a similar task?  How would
> > the address and size of the seed be passed to the kernel?  command line?
> 
> One suggestion is at:
> http://www.av8n.com/computer/htm/secure-random.htm#sec-boot-image

Yes, this is very similar to the latent_entropy series that I think Kees
just merged.  Well, at a high level, it is.  'store a seed in the
kernel, use it at reboot'.

These approaches are good in that they provide yet another source of
entropy to the kernel.  However, both suffer from the kernel binary
being very static in time and across distro installs, particularly on
embedded systems.  The seed almost becomes a long-term secret, and the
longer it lives, the less chance there is of it staying secret.

I'm not really comfortable with what John suggests here:

"""
Next step: It should be straightforward to write a tool that efficiently
updates the stored seed within the boot image. Updating MUST occur
during provisioning, before the device gets booted for the first time
... and also from time to time thereafter. Updating the boot image isn’t
be quite as simple as dd of=/var/lib/urandom/random-seed but neither is
it rocket surgery. The cost is utterly negligible compared to the cost
of a security breach, which is the relevant comparison.
"""

Editing the installed kernel binary to add the seed exposes the system
to an unnecessary risk of bricking (e.g. a power failure halfway
through) [0].  Yes, this can be mitigated by following a similar
process to kernel updates, but why?  The bootloader already knows how to
read a file into RAM.  We just need to put it in the right place and
tell it to do so.  And userspace already writes a new random-seed during
system init and clean shutdown.

We just need to connect the dots so deployed systems can use the seed
earlier without having to hack the kernel or update the bootloader,
which, while possible, is something a lot of folks are skittish to do.

thx,

Jason.

[0] I imagine it also borks code-signing...

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [kernel-hardening] [PATCH v7 0/9] x86/mm: memory area address KASLR
  2016-06-23 19:59             ` Kees Cook
  (?)
@ 2016-06-23 20:19             ` Jason Cooper
  -1 siblings, 0 replies; 74+ messages in thread
From: Jason Cooper @ 2016-06-23 20:19 UTC (permalink / raw)
  To: kernel-hardening
  Cc: Sandy Harris, Thomas Garnier, Ingo Molnar, Andy Lutomirski, x86,
	Borislav Petkov, Baoquan He, Yinghai Lu, Juergen Gross,
	Matt Fleming, Toshi Kani, Andrew Morton, Dan Williams,
	Kirill A. Shutemov, Dave Hansen, Xiao Guangrong,
	Martin Schwidefsky, Aneesh Kumar K.V, Alexander Kuleshov,
	Alexander Popov, Dave Young, Joerg Roedel, Lv Zheng, Mark Salter,
	Dmitry Vyukov, Stephen Smalley, Boris Ostrovsky,
	Christian Borntraeger, Jan Beulich, LKML, Jonathan Corbet,
	linux-doc

On Thu, Jun 23, 2016 at 12:59:07PM -0700, Kees Cook wrote:
> On Thu, Jun 23, 2016 at 12:45 PM, Sandy Harris <sandyinchina@gmail.com> wrote:
> > Jason Cooper <jason@lakedaemon.net> wrote:
> >
> >> Modern systems that receive a seed from the bootloader via the
> >> random-seed property (typically from the hw-rng) can mix both sources
> >> for increased resilience.
> >>
> >> Unfortunately, I'm not very familiar with the internals of x86
> >> bootstrapping.  Could GRUB be scripted to do a similar task?  How would
> >> the address and size of the seed be passed to the kernel?  command line?
> >
> > One suggestion is at:
> > http://www.av8n.com/computer/htm/secure-random.htm#sec-boot-image
> 
> Interesting! This might pose a problem for signed images, though.
> (Actually, for signed arm kernels is the DT signed too? If so, it
> would be a similar problem.)

That's the reason for userspace,random-seed = <address, size>.  Once
set, the dtb never has to change.  The bootloader loads the file to the
same address at each boot.  Userspace is responsible, as it is already,
for updating the random-seed file while up.

thx,

Jason.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [kernel-hardening] [PATCH v7 0/9] x86/mm: memory area address KASLR
  2016-06-23 20:05             ` Ard Biesheuvel
@ 2016-06-24  1:11               ` Jason Cooper
  -1 siblings, 0 replies; 74+ messages in thread
From: Jason Cooper @ 2016-06-24  1:11 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Kees Cook, Thomas Garnier, kernel-hardening, Ingo Molnar,
	Andy Lutomirski, x86, Borislav Petkov, Baoquan He, Yinghai Lu,
	Juergen Gross, Matt Fleming, Toshi Kani, Andrew Morton,
	Dan Williams, Kirill A. Shutemov, Dave Hansen, Xiao Guangrong,
	Martin Schwidefsky, Aneesh Kumar K.V, Alexander Kuleshov,
	Alexander Popov, Dave Young, Joerg Roedel, Lv Zheng, Mark Salter,
	Dmitry Vyukov, Stephen Smalley, Boris Ostrovsky,
	Christian Borntraeger, Jan Beulich, LKML, Jonathan Corbet,
	linux-doc

Hi Ard,

On Thu, Jun 23, 2016 at 10:05:53PM +0200, Ard Biesheuvel wrote:
> On 23 June 2016 at 21:58, Kees Cook <keescook@chromium.org> wrote:
> > On Thu, Jun 23, 2016 at 12:33 PM, Jason Cooper <jason@lakedaemon.net> wrote:
> >> On Wed, Jun 22, 2016 at 10:05:51AM -0700, Kees Cook wrote:
> >>> On Wed, Jun 22, 2016 at 8:59 AM, Thomas Garnier <thgarnie@google.com> wrote:
> >>> > On Wed, Jun 22, 2016 at 5:47 AM, Jason Cooper <jason@lakedaemon.net> wrote:
> >>> >> Hey Kees,
> >>> >>
> >>> >> On Tue, Jun 21, 2016 at 05:46:57PM -0700, Kees Cook wrote:
> >>> >>> Notable problems that needed solving:
> >>> >> ...
> >>> >>>  - Reasonable entropy is needed early at boot before get_random_bytes()
> >>> >>>    is available.
> >>> >>
> >>> >> This series is targetting x86, which typically has RDRAND/RDSEED
> >>> >> instructions.  Are you referring to other arches?  Older x86?  Also,
> >>> >> isn't this the same requirement for base address KASLR?
> >>> >>
> >>> >> Don't get me wrong, I want more diverse entropy sources available
> >>> >> earlier in the boot process as well. :-)  I'm just wondering what's
> >>> >> different about this series vs base address KASLR wrt early entropy
> >>> >> sources.
> >>> >>
> >>> >
> >>> > I think Kees was referring to the refactor I did to get the similar
> >>> > entropy generation than KASLR module randomization. Our approach was
> >>> > to provide best entropy possible even if you have an older processor
> >>> > or under virtualization without support for these instructions.
> >>> > Unfortunately common on companies with a large number of older
> >>> > machines.
> >>>
> >>> Right, the memory offset KASLR uses the same routines as the kernel
> >>> base KASLR. The issue is with older x86 systems, which continue to be
> >>> very common.
> >>
> >> We have the same issue in embedded. :-(  Compounded by the fact that
> >> there is no rand instruction (at least not on ARM).  So, even if there's
> >> a HW-RNG, you can't access it until the driver is loaded.
> >>
> >> This is compounded by the fact that most systems deployed today have
> >> bootloaders a) without hw-rng drivers, b) without dtb editing, and c)
> >> without dtb support at all.
> >>
> >> My current thinking is to add a devicetree property
> >> "userspace,random-seed" <address, len>.  This way, existing, deployed
> >> boards can append a dtb to a modern kernel with the property set.
> >> The factory bootloader then only needs to amend its boot scripts to read
> >> random-seed from the fs to the given address.
> >
> > The arm64 KASLR implementation has defined a way for boot loaders to
> > pass in an seed similar to this. It might be nice to have a fall-back
> > to a DT entry, though, then the bootloaders don't need to changed.
> >
> > Ard might have some thoughts on why DT wasn't used for KASLR (I assume
> > the early parsing overhead, but I don't remember the discussion any
> > more).
> >
> 
> On arm64, only DT is used for KASLR (even when booting via ACPI). My
> first draft used register x1, but this turned out to be too much of a
> hassle, since parsing the DT is also necessary to discover whether
> there is a 'nokaslr' argument on the kernel command line. So the
> current implementation only supports a single method, which is the
> /chosen/kaslr-seed uint64 property.

Ok, just to clarify (after a short offline chat), my goal is to set a
userspace,random-seed <addr, len> property in the device tree once.
The bootloader scripts would also only need to be altered once.

Then, at each boot, the bootloader reads the entirety of
/var/lib/misc/random-seed (512 bytes) into the configured address.
random-seed could be in /boot, or on a flash partition.

The decompressor would consume a small portion of that seed for kaslr
and such.  After that, the rest would be consumed by random.c to
initialize the entropy pools.
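
A sketch of that hand-off, with purely illustrative sizes (nothing
here is an existing interface; the kernel function is declared so the
fragment stands alone):

	/* drivers/char/random.c API */
	void add_device_randomness(const void *buf, unsigned int size);

	#define SEED_BYTES	512
	#define KASLR_TAKE	16	/* illustrative decompressor share */

	static void consume_boot_seed(const unsigned char *seed)
	{
		/* the decompressor would take seed[0..KASLR_TAKE-1] for
		 * its offsets; the kernel proper mixes in the remainder */
		add_device_randomness(seed + KASLR_TAKE,
				      SEED_BYTES - KASLR_TAKE);
	}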

thx,

Jason.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [kernel-hardening] [PATCH v7 0/9] x86/mm: memory area address KASLR
  2016-06-24  1:11               ` Jason Cooper
@ 2016-06-24 10:54                 ` Ard Biesheuvel
  -1 siblings, 0 replies; 74+ messages in thread
From: Ard Biesheuvel @ 2016-06-24 10:54 UTC (permalink / raw)
  To: Jason Cooper
  Cc: Kees Cook, Thomas Garnier, kernel-hardening, Ingo Molnar,
	Andy Lutomirski, x86, Borislav Petkov, Baoquan He, Yinghai Lu,
	Juergen Gross, Matt Fleming, Toshi Kani, Andrew Morton,
	Dan Williams, Kirill A. Shutemov, Dave Hansen, Xiao Guangrong,
	Martin Schwidefsky, Aneesh Kumar K.V, Alexander Kuleshov,
	Alexander Popov, Dave Young, Joerg Roedel, Lv Zheng, Mark Salter,
	Dmitry Vyukov, Stephen Smalley, Boris Ostrovsky,
	Christian Borntraeger, Jan Beulich, LKML, Jonathan Corbet,
	linux-doc

On 24 June 2016 at 03:11, Jason Cooper <jason@lakedaemon.net> wrote:
> Hi Ard,
>
> On Thu, Jun 23, 2016 at 10:05:53PM +0200, Ard Biesheuvel wrote:
>> On 23 June 2016 at 21:58, Kees Cook <keescook@chromium.org> wrote:
>> > On Thu, Jun 23, 2016 at 12:33 PM, Jason Cooper <jason@lakedaemon.net> wrote:
>> >> On Wed, Jun 22, 2016 at 10:05:51AM -0700, Kees Cook wrote:
>> >>> On Wed, Jun 22, 2016 at 8:59 AM, Thomas Garnier <thgarnie@google.com> wrote:
>> >>> > On Wed, Jun 22, 2016 at 5:47 AM, Jason Cooper <jason@lakedaemon.net> wrote:
>> >>> >> Hey Kees,
>> >>> >>
>> >>> >> On Tue, Jun 21, 2016 at 05:46:57PM -0700, Kees Cook wrote:
>> >>> >>> Notable problems that needed solving:
>> >>> >> ...
>> >>> >>>  - Reasonable entropy is needed early at boot before get_random_bytes()
>> >>> >>>    is available.
>> >>> >>
>> >>> >> This series is targeting x86, which typically has RDRAND/RDSEED
>> >>> >> instructions.  Are you referring to other arches?  Older x86?  Also,
>> >>> >> isn't this the same requirement for base address KASLR?
>> >>> >>
>> >>> >> Don't get me wrong, I want more diverse entropy sources available
>> >>> >> earlier in the boot process as well. :-)  I'm just wondering what's
>> >>> >> different about this series vs base address KASLR wrt early entropy
>> >>> >> sources.
>> >>> >>
>> >>> >
>> >>> > I think Kees was referring to the refactor I did to get entropy
>> >>> > generation similar to the KASLR module randomization. Our approach
>> >>> > was to provide the best entropy possible even if you have an older
>> >>> > processor or are under virtualization without support for these
>> >>> > instructions. Unfortunately, that is common at companies with a
>> >>> > large number of older machines.
>> >>>
>> >>> Right, the memory offset KASLR uses the same routines as the kernel
>> >>> base KASLR. The issue is with older x86 systems, which continue to be
>> >>> very common.
>> >>
>> >> We have the same issue in embedded. :-(  It's compounded by the fact
>> >> that there is no rand instruction (at least not on ARM).  So, even if
>> >> there's a HW-RNG, you can't access it until the driver is loaded.
>> >>
>> >> Worse, most systems deployed today have bootloaders a) without hw-rng
>> >> drivers, b) without dtb editing, and c) without dtb support at all.
>> >>
>> >> My current thinking is to add a devicetree property
>> >> "userspace,random-seed" <address, len>.  This way, existing, deployed
>> >> boards can append a dtb to a modern kernel with the property set.
>> >> The factory bootloader then only needs to amend its boot scripts to read
>> >> random-seed from the fs to the given address.
>> >
>> > The arm64 KASLR implementation has defined a way for boot loaders to
>> > pass in a seed similar to this. It might be nice to have a fall-back
>> > to a DT entry, though, so the bootloaders don't need to be changed.
>> >
>> > Ard might have some thoughts on why DT wasn't used for KASLR (I assume
>> > the early parsing overhead, but I don't remember the discussion any
>> > more).
>> >
>>
>> On arm64, only DT is used for KASLR (even when booting via ACPI). My
>> first draft used register x1, but this turned out to be too much of a
>> hassle, since parsing the DT is also necessary to discover whether
>> there is a 'nokaslr' argument on the kernel command line. So the
>> current implementation only supports a single method, which is the
>> /chosen/kaslr-seed uint64 property.
>
> Ok, just to clarify (after a short offline chat), my goal is to set a
> userspace,random-seed <addr, len> property in the device tree once.
> The bootloader scripts would also only need to be altered once.
>
> Then, at each boot, the bootloader reads the entirety of
> /var/lib/misc/random-seed (512 bytes) into the configured address.
> random-seed could be in /boot, or on a flash partition.
>
> The decompressor would consume a small portion of that seed for kaslr
> and such.  After that, the rest would be consumed by random.c to
> initialize the entropy pools.
>

I see. This indeed has little to do with the arm64 KASLR case, other
than that they both use a DT property.

In the arm64 KASLR case, I deliberately chose to leave it up to the
bootloader/firmware to roll the dice, for the same reason you pointed
out, i.e., that there is no architected way on ARM to obtain random
bits. So in that sense, what you are doing is complementary to my
work, and a KASLR-aware arm64 bootloader would copy some of its
random bits taken from /var/lib/misc/random-seed into the
/chosen/kaslr-seed DT property. Note that, at the moment, this DT
property is only an internal contract between the kernel's UEFI stub
and the kernel proper, so we could still easily change that if
necessary.

Alternatively, if we go with your solution, the KASLR code should read
from the address in userspace,random-seed rather than from the
/chosen/kaslr-seed property itself (or use the former as a fallback if
the latter is not found).
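
As a rough sketch of that lookup order (illustrative only; the
userspace,random-seed handling is Jason's proposal rather than existing
code, and I'm assuming an <addr len> layout with 32-bit cells and a 1:1
mapping at this stage of boot):

#include <libfdt.h>
#include <stdint.h>
#include <string.h>

static uint64_t get_kaslr_seed(const void *fdt)
{
	const fdt64_t *seed;
	const fdt32_t *loc;
	uint64_t ret = 0;
	int node, len;

	node = fdt_path_offset(fdt, "/chosen");
	if (node < 0)
		return 0;

	/* Preferred: the existing /chosen/kaslr-seed uint64 property. */
	seed = fdt_getprop(fdt, node, "kaslr-seed", &len);
	if (seed && len == sizeof(*seed))
		return fdt64_to_cpu(*seed);

	/* Fallback: first u64 of the buffer described by the proposed
	 * userspace,random-seed = <addr len> property. */
	loc = fdt_getprop(fdt, node, "userspace,random-seed", &len);
	if (loc && len == 2 * sizeof(*loc))
		memcpy(&ret, (const void *)(uintptr_t)fdt32_to_cpu(loc[0]),
		       sizeof(ret));

	return ret;
}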

-- 
Ard.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* devicetree random-seed properties, was: "Re: [PATCH v7 0/9] x86/mm: memory area address KASLR"
  2016-06-24 10:54                 ` Ard Biesheuvel
@ 2016-06-24 16:02                   ` Jason Cooper
  -1 siblings, 0 replies; 74+ messages in thread
From: Jason Cooper @ 2016-06-24 16:02 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Kees Cook, Thomas Garnier, kernel-hardening, Ingo Molnar,
	Andy Lutomirski, x86, Borislav Petkov, Baoquan He, Yinghai Lu,
	Juergen Gross, Matt Fleming, Toshi Kani, Andrew Morton,
	Dan Williams, Kirill A. Shutemov, Dave Hansen, Xiao Guangrong,
	Martin Schwidefsky, Aneesh Kumar K.V, Alexander Kuleshov,
	Alexander Popov, Dave Young, Joerg Roedel, Lv Zheng, Mark Salter,
	Dmitry Vyukov, Stephen Smalley, Boris Ostrovsky,
	Christian Borntraeger, Jan Beulich, LKML, Jonathan Corbet,
	linux-doc

Thomas,

Sorry for wandering off the topic of your series.  The big takeaway for
me is that you and Kees are concerned about x86 systems pre-RDRAND.
Just as I'm concerned about deployed embedded systems without bootloader
support for hw-rngs and so forth.

Whatever final form the approach takes for ARM/dt, I'll make sure we can
extend it to legacy x86 systems.


Ard,

On Fri, Jun 24, 2016 at 12:54:01PM +0200, Ard Biesheuvel wrote:
> On 24 June 2016 at 03:11, Jason Cooper <jason@lakedaemon.net> wrote:
> > On Thu, Jun 23, 2016 at 10:05:53PM +0200, Ard Biesheuvel wrote:
...
> >> On arm64, only DT is used for KASLR (even when booting via ACPI). My
> >> first draft used register x1, but this turned out to be too much of a
> >> hassle, since parsing the DT is also necessary to discover whether
> >> there is a 'nokaslr' argument on the kernel command line. So the
> >> current implementation only supports a single method, which is the
> >> /chosen/kaslr-seed uint64 property.
> >
> > Ok, just to clarify (after a short offline chat), my goal is to set a
> > userspace,random-seed <addr, len> property in the device tree once.
> > The bootloader scripts would also only need to be altered once.
> >
> > Then, at each boot, the bootloader reads the entirety of
> > /var/lib/misc/random-seed (512 bytes) into the configured address.
> > random-seed could be in /boot, or on a flash partition.
> >
> > The decompressor would consume a small portion of that seed for kaslr
> > and such.  After that, the rest would be consumed by random.c to
> > initialize the entropy pools.
> >
> 
> I see. This indeed has little to do with the arm64 KASLR case, other
> than that they both use a DT property.
> 
> In the arm64 KASLR case, I deliberately chose to leave it up to the
> bootloader/firmware to roll the dice, for the same reason you pointed
> out, i.e., that there is no architected way on ARM to obtain random
> bits. So in that sense, what you are doing is complementary to my
> work, and a KASLR-aware arm64 bootloader would copy some of its
> random bits taken from /var/lib/misc/random-seed into the
> /chosen/kaslr-seed DT property.

Here I disagree.  We have two distinct entropy sources: the hw-rng
currently feeding kaslr via the /chosen/kaslr-seed property, and the
seasoned userspace seed I propose, handed in via an extra property.

Having the bootloader conflate those two sources as if they are equal
seems to muddy the waters.  I prefer to have bootloaders tell me where
they got the data rather than to hope the bootloader sourced and mixed
it well.

> Note that, at the moment, this DT property is only an internal
> contract between the kernel's UEFI stub and the kernel proper, so we
> could still easily change that if necessary.

Ideally, I'd prefer to be deliberate with the DT properties, e.g.

random-seed,hwrng     <--- bootloader reads from hw-rng
random-seed,userspace <--- bootloader reads file from us to addr

The kernel decompressor can init kaslr with only one of the two
properties populated.  If both properties are present, then the
decompressor can extract a u64 from userspace-seed and mix it with
hwrng-seed before use.
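
For example, the mix could be as trivial as the sketch below; this is
purely illustrative, not a proposal for the final mixing function:

#include <linux/types.h>

/* Rotate one input so identical seeds don't cancel to zero. */
static u64 mix_seeds(u64 hwrng_seed, u64 userspace_seed)
{
	return hwrng_seed ^ ((userspace_seed << 32) | (userspace_seed >> 32));
}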

The small devicetree portion of my brain feels like 'kaslr-seed' is
telling the OS what to do with the value, whereas devicetree is
supposed to describe the hardware, or, in this case, the source of the
data.

Given that more entropy from more sources is useful for random.c a bit
later in the boot process, it might be worth making hwrng-seed larger
than u64 as well.  This way we can potentially seed random.c from two
sources *before* init even starts, without having to depend on the
kernel's hw-rng driver being probed.  After all, that driver might not
have been built, or it could be a module that's loaded later.
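
Sketched out, the early-init consumer could be as simple as the
following (the function name and calling convention are made up for
illustration; add_device_randomness() mixes without crediting entropy,
so this is a conservative use of both blobs):

#include <linux/init.h>
#include <linux/random.h>

static void __init seed_pools_from_dt(const void *hwrng_seed, int hwrng_len,
				      const void *user_seed, int user_len)
{
	if (hwrng_seed)
		add_device_randomness(hwrng_seed, hwrng_len);
	if (user_seed)
		add_device_randomness(user_seed, user_len);
}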

I've attached a draft patch to chosen.txt.

thx,

Jason.


--------------->8---------------------------------
diff --git a/Documentation/devicetree/bindings/chosen.txt b/Documentation/devicetree/bindings/chosen.txt
index 6ae9d82d4c37..61f15f04bc0a 100644
--- a/Documentation/devicetree/bindings/chosen.txt
+++ b/Documentation/devicetree/bindings/chosen.txt
@@ -45,6 +45,52 @@ on PowerPC "stdout" if "stdout-path" is not found.  However, the
 "linux,stdout-path" and "stdout" properties are deprecated. New platforms
 should only use the "stdout-path" property.
 
+random-seed properties
+----------------------
+
+The goal of these properties is to provide an entropy seed early in the boot
+process.  Typically, this is needed by the kernel decompressor for
+initializing KASLR.  At that point, the kernel entropy pools haven't been
+initialized yet, and any hardware rng drivers, if they exist, haven't been
+loaded yet.
+
+The bootloader can obtain these seeds and pass them to the kernel via the
+respective properties.  The bootloader is not expected to mix or condition
+this data in any way; it simply reads and passes it.  Either one or both
+properties can be set if the data is available.
+
+random-seed,hwrng property
+--------------------------
+
+For bootloaders with support for reading from the system's hardware random
+number generator.  The bootloader can read a chunk of data from the hw-rng
+and set it as the value for this binary blob property.
+
+/ {
+	chosen {
+		random-seed,hwrng = <0x1f 0x07 0x4d 0x91 ...>;
+	};
+};
+
+random-seed,userspace property
+------------------------------
+
+The goal of this property is also to provide backwards compatibility with
+existing systems.  The bootloaders on these deployed systems typically lack
+the ability to edit a devicetree or read from an hwrng.  The only requirement
+for a bootloader is that it be able to read a seed file generated by the
+previous boot into memory at a pre-determined physical address and size.
+This is typically done via boot scripting.
+
+This property can then be set in the devicetree statically and parsed by a
+modern kernel without requiring a bootloader update.
+
+/ {
+	chosen {
+		random-seed,userspace = <0x40000 0x200>;
+	};
+};
+
 linux,booted-from-kexec
 -----------------------
 

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: devicetree random-seed properties, was: "Re: [PATCH v7 0/9] x86/mm: memory area address KASLR"
  2016-06-24 16:02                   ` [kernel-hardening] " Jason Cooper
@ 2016-06-24 19:04                     ` Kees Cook
  -1 siblings, 0 replies; 74+ messages in thread
From: Kees Cook @ 2016-06-24 19:04 UTC (permalink / raw)
  To: Jason Cooper
  Cc: Ard Biesheuvel, Thomas Garnier, kernel-hardening, Ingo Molnar,
	Andy Lutomirski, x86, Borislav Petkov, Baoquan He, Yinghai Lu,
	Juergen Gross, Matt Fleming, Toshi Kani, Andrew Morton,
	Dan Williams, Kirill A. Shutemov, Dave Hansen, Xiao Guangrong,
	Martin Schwidefsky, Aneesh Kumar K.V, Alexander Kuleshov,
	Alexander Popov, Dave Young, Joerg Roedel, Lv Zheng, Mark Salter,
	Dmitry Vyukov, Stephen Smalley, Boris Ostrovsky,
	Christian Borntraeger, Jan Beulich, LKML, Jonathan Corbet,
	linux-doc

On Fri, Jun 24, 2016 at 9:02 AM, Jason Cooper <jason@lakedaemon.net> wrote:
> Thomas,
>
> Sorry for wandering off the topic of your series.  The big takeaway for
> me is that you and Kees are concerned about x86 systems pre-RDRAND.
> Just as I'm concerned about deployed embedded systems without bootloader
> support for hw-rngs and so forth.
>
> Whatever final form the approach takes for ARM/dt, I'll make sure we can
> extend it to legacy x86 systems.

Yeah, this seems like a productive conversation to me. :)

> Ard,
>
> On Fri, Jun 24, 2016 at 12:54:01PM +0200, Ard Biesheuvel wrote:
>> On 24 June 2016 at 03:11, Jason Cooper <jason@lakedaemon.net> wrote:
>> > On Thu, Jun 23, 2016 at 10:05:53PM +0200, Ard Biesheuvel wrote:
> ...
>> >> On arm64, only DT is used for KASLR (even when booting via ACPI). My
>> >> first draft used register x1, but this turned out to be too much of a
>> >> hassle, since parsing the DT is also necessary to discover whether
>> >> there is a 'nokaslr' argument on the kernel command line. So the
>> >> current implementation only supports a single method, which is the
>> >> /chosen/kaslr-seed uint64 property.
>> >
>> > Ok, just to clarify (after a short offline chat), my goal is to set a
>> > userspace,random-seed <addr, len> property in the device tree once.
>> > The bootloader scripts would also only need to be altered once.
>> >
>> > Then, at each boot, the bootloader reads the entirety of
>> > /var/lib/misc/random-seed (512 bytes) into the configured address.
>> > random-seed could be in /boot, or on a flash partition.
>> >
>> > The decompressor would consume a small portion of that seed for kaslr
>> > and such.  After that, the rest would be consumed by random.c to
>> > initialize the entropy pools.
>> >
>>
>> I see. This indeed has little to do with the arm64 KASLR case, other
>> than that they both use a DT property.
>>
>> In the arm64 KASLR case, I deliberately chose to leave it up to the
>> bootloader/firmware to roll the dice, for the same reason you pointed
>> out, i.e., that there is no architected way on ARM to obtain random
>> bits. So in that sense, what you are doing is complementary to my
>> work, and a KASLR-aware arm64 bootloader would copy some of its
>> random bits taken from /var/lib/misc/random-seed into the
>> /chosen/kaslr-seed DT property.
>
> Here I disagree.  We have two distinct entropy sources: the hw-rng
> currently feeding kaslr via the /chosen/kaslr-seed property, and the
> seasoned userspace seed I propose, handed in via an extra property.
>
> Having the bootloader conflate those two sources as if they are equal
> seems to muddy the waters.  I prefer to have bootloaders tell me where
> they got the data rather than to hope the bootloader sourced and mixed
> it well.
>
>> Note that, at the moment, this DT property is only an internal
>> contract between the kernel's UEFI stub and the kernel proper, so we
>> could still easily change that if necessary.
>
> Ideally, I'd prefer to be deliberate with the DT properties, e.g.
>
> random-seed,hwrng     <--- bootloader reads from hw-rng
> random-seed,userspace <--- bootloader reads file from us to addr
>
> The kernel decompressor can init kaslr with only one of the two
> properties populated.  If both properties are present, then the
> decompressor can extract a u64 from userspace-seed and mix it with
> hwrng-seed before use.
>
> The small devicetree portion of my brain feels like 'kaslr-seed' is
> telling the OS what to do with the value, whereas devicetree is
> supposed to describe the hardware, or, in this case, the source of the
> data.
>
> Given that more entropy from more sources is useful for random.c a bit
> later in the boot process, it might be worth making hwrng-seed larger
> than u64 as well.  This way we can potentially seed random.c from two
> sources *before* init even starts, without having to depend on the
> kernel's hw-rng driver being probed.  After all, that driver might not
> have been built, or it could be a module that's loaded later.
>
> I've attached a draft patch to chosen.txt.
>
> thx,
>
> Jason.
>
>
> --------------->8---------------------------------
> diff --git a/Documentation/devicetree/bindings/chosen.txt b/Documentation/devicetree/bindings/chosen.txt
> index 6ae9d82d4c37..61f15f04bc0a 100644
> --- a/Documentation/devicetree/bindings/chosen.txt
> +++ b/Documentation/devicetree/bindings/chosen.txt
> @@ -45,6 +45,52 @@ on PowerPC "stdout" if "stdout-path" is not found.  However, the
>  "linux,stdout-path" and "stdout" properties are deprecated. New platforms
>  should only use the "stdout-path" property.
>
> +random-seed properties
> +----------------------
> +
> +The goal of these properties is to provide an entropy seed early in the boot
> +process.  Typically, this is needed by the kernel decompressor for
> +initializing KASLR.  At that point, the kernel entropy pools haven't been
> +initialized yet, and any hardware rng drivers, if they exist, haven't been
> +loaded yet.
> +
> +The bootloader can obtain these seeds and pass them to the kernel via the
> +respective properties.  The bootloader is not expected to mix or condition
> +this data in any way; it simply reads and passes it.  Either one or both
> +properties can be set if the data is available.
> +
> +random-seed,hwrng property
> +--------------------------
> +
> +For bootloaders with support for reading from the system's hardware random
> +number generator.  The bootloader can read a chunk of data from the hw-rng
> +and set it as the value for this binary blob property.

As in the boot loader would change the value per-boot?

Does this proposal include replacing /chosen/kaslr-seed with
random-seed,hwrng? (Should the "chosen" path be used for hwrng too?)

> +
> +/ {
> +       chosen {
> +               random-seed,hwrng = <0x1f 0x07 0x4d 0x91 ...>;
> +       };
> +};
> +
> +random-seed,userspace property
> +------------------------------
> +
> +The goal of this property is also to provide backwards compatibility with
> +existing systems.  The bootloaders on these deployed systems typically lack
> +the ability to edit a devicetree or read from an hwrng.  The only requirement
> +for a bootloader is that it be able to read a seed file generated by the
> +previous boot into memory at a pre-determined physical address and size.
> +This is typically done via boot scripting.

What happens on a cold boot?

> +
> +This property can then be set in the devicetree statically and parsed by a
> +modern kernel without requiring a bootloader update.
> +
> +/ {
> +       chosen {
> +               random-seed,userspace = <0x40000 0x200>;
> +       };
> +};
> +
>  linux,booted-from-kexec
>  -----------------------
>

I'm a DT newbie still, so please ignore me if I'm not making useful comments. :)

-Kees

-- 
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: devicetree random-seed properties, was: "Re: [PATCH v7 0/9] x86/mm: memory area address KASLR"
  2016-06-24 19:04                     ` [kernel-hardening] " Kees Cook
@ 2016-06-24 20:40                       ` Andy Lutomirski
  -1 siblings, 0 replies; 74+ messages in thread
From: Andy Lutomirski @ 2016-06-24 20:40 UTC (permalink / raw)
  To: Kees Cook
  Cc: Jason Cooper, Ard Biesheuvel, Thomas Garnier, kernel-hardening,
	Ingo Molnar, Andy Lutomirski, x86, Borislav Petkov, Baoquan He,
	Yinghai Lu, Juergen Gross, Matt Fleming, Toshi Kani,
	Andrew Morton, Dan Williams, Kirill A. Shutemov, Dave Hansen,
	Xiao Guangrong, Martin Schwidefsky, Aneesh Kumar K.V,
	Alexander Kuleshov, Alexander Popov, Dave Young, Joerg Roedel,
	Lv Zheng, Mark Salter, Dmitry Vyukov, Stephen Smalley,
	Boris Ostrovsky, Christian Borntraeger, Jan Beulich, LKML,
	Jonathan Corbet, linux-doc

On Fri, Jun 24, 2016 at 12:04 PM, Kees Cook <keescook@chromium.org> wrote:
> On Fri, Jun 24, 2016 at 9:02 AM, Jason Cooper <jason@lakedaemon.net> wrote:
>> Thomas,
>>
>> Sorry for wandering off the topic of your series.  The big takeaway for
>> me is that you and Kees are concerned about x86 systems pre-RDRAND.
>> Just as I'm concerned about deployed embedded systems without bootloader
>> support for hw-rngs and so forth.
>>
>> Whatever final form the approach takes for ARM/dt, I'll make sure we can
>> extend it to legacy x86 systems.
>
> Yeah, this seems like a productive conversation to me. :)

I have an old patch and spec I need to dust off that does this during
*very* early boot on x86 using MSRs so that kASLR can use it.
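
For reference, the raw read itself is the easy part, something like the
generic illustration below.  The actual MSR index and protocol from
that patch aren't spelled out in this thread, so the code assumes
nothing beyond the rdmsr instruction itself:

/* Generic early-boot MSR read; no kernel infrastructure required.
 * Which MSR to read is left open here on purpose. */
static unsigned long long early_rdmsr(unsigned int msr)
{
	unsigned int lo, hi;

	asm volatile("rdmsr" : "=a" (lo), "=d" (hi) : "c" (msr));
	return ((unsigned long long)hi << 32) | lo;
}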

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: devicetree random-seed properties, was: "Re: [PATCH v7 0/9] x86/mm: memory area address KASLR"
  2016-06-24 19:04                     ` [kernel-hardening] " Kees Cook
@ 2016-06-30 21:48                       ` Jason Cooper
  -1 siblings, 0 replies; 74+ messages in thread
From: Jason Cooper @ 2016-06-30 21:48 UTC (permalink / raw)
  To: Kees Cook
  Cc: Ard Biesheuvel, Thomas Garnier, kernel-hardening, Ingo Molnar,
	Andy Lutomirski, x86, Borislav Petkov, Baoquan He, Yinghai Lu,
	Juergen Gross, Matt Fleming, Toshi Kani, Andrew Morton,
	Dan Williams, Kirill A. Shutemov, Dave Hansen, Xiao Guangrong,
	Martin Schwidefsky, Aneesh Kumar K.V, Alexander Kuleshov,
	Alexander Popov, Dave Young, Joerg Roedel, Lv Zheng, Mark Salter,
	Dmitry Vyukov, Stephen Smalley, Boris Ostrovsky,
	Christian Borntraeger, Jan Beulich, LKML, Jonathan Corbet,
	linux-doc

Hi Kees,

On Fri, Jun 24, 2016 at 12:04:32PM -0700, Kees Cook wrote:
> On Fri, Jun 24, 2016 at 9:02 AM, Jason Cooper <jason@lakedaemon.net> wrote:
> > Thomas,
> >
> > Sorry for wandering off the topic of your series.  The big takeaway for
> > me is that you and Kees are concerned about x86 systems pre-RDRAND.
> > Just as I'm concerned about deployed embedded systems without bootloader
> > support for hw-rngs and so forth.
> >
> > Whatever final form the approach takes for ARM/dt, I'll make sure we can
> > extend it to legacy x86 systems.
> 
> Yeah, this seems like a productive conversation to me. :)
> 
> > Ard,
> >
> > On Fri, Jun 24, 2016 at 12:54:01PM +0200, Ard Biesheuvel wrote:
> >> On 24 June 2016 at 03:11, Jason Cooper <jason@lakedaemon.net> wrote:
> >> > On Thu, Jun 23, 2016 at 10:05:53PM +0200, Ard Biesheuvel wrote:
> > ...
> >> >> On arm64, only DT is used for KASLR (even when booting via ACPI). My
> >> >> first draft used register x1, but this turned out to be too much of a
> >> >> hassle, since parsing the DT is also necessary to discover whether
> >> >> there is a 'nokaslr' argument on the kernel command line. So the
> >> >> current implementation only supports a single method, which is the
> >> >> /chosen/kaslr-seed uint64 property.
> >> >
> >> > Ok, just to clarify (after a short offline chat), my goal is to set a
> >> > userspace,random-seed <addr, len> property in the device tree once.
> >> > The bootloader scripts would also only need to be altered once.
> >> >
> >> > Then, at each boot, the bootloader reads the entirety of
> >> > /var/lib/misc/random-seed (512 bytes) into the configured address.
> >> > random-seed could be in /boot, or on a flash partition.
> >> >
> >> > The decompressor would consume a small portion of that seed for kaslr
> >> > and such.  After that, the rest would be consumed by random.c to
> >> > initialize the entropy pools.
> >> >
> >>
> >> I see. This indeed has little to do with the arm64 KASLR case, other
> >> than that they both use a DT property.
> >>
> >> In the arm64 KASLR case, I deliberately chose to leave it up to the
> >> bootloader/firmware to roll the dice, for the same reason you pointed
> >> out, i.e., that there is no architected way on ARM to obtain random
> >> bits. So in that sense, what you are doing is complementary to my
> >> work, and a KASLR-aware arm64 bootloader would copy some of its
> >> random bits taken from /var/lib/misc/random-seed into the
> >> /chosen/kaslr-seed DT property.
> >
> > Here I disagree.  We have two distinct entropy sources: the hw-rng
> > currently feeding kaslr via the /chosen/kaslr-seed property, and the
> > seasoned userspace seed I propose, handed in via an extra property.
> >
> > Having the bootloader conflate those two sources as if they are equal
> > seems to muddy the waters.  I prefer to have bootloaders tell me where
> > they got the data rather than to hope the bootloader sourced and mixed
> > it well.
> >
> >> Note that, at the moment, this DT property is only an internal
> >> contract between the kernel's UEFI stub and the kernel proper, so we
> >> could still easily change that if necessary.
> >
> > Ideally, I'd prefer to be deliberate with the DT properties, e.g.
> >
> > random-seed,hwrng     <--- bootloader reads from hw-rng
> > random-seed,userspace <--- bootloader reads file from us to addr
> >
> > The kernel decompressor can init kaslr with only one of the two
> > properties populated.  If both properties are present, then the
> > decompressor can extract a u64 from userspace-seed and mix it with
> > hwrng-seed before use.
> >
> > The small devicetree portion of my brain feels like 'kaslr-seed' is
> > telling the OS what to do with the value, whereas devicetree is
> > supposed to describe the hardware, or, in this case, the source of the
> > data.
> >
> > Given that more entropy from more sources is useful for random.c a bit
> > later in the boot process, it might be worth making hwrng-seed larger
> > than u64 as well.  This way we can potentially seed random.c from two
> > sources *before* init even starts, without having to depend on the
> > kernel's hw-rng driver being probed.  After all, that driver might not
> > have been built, or it could be a module that's loaded later.
> >
> > I've attached a draft patch to chosen.txt.
> >
> > thx,
> >
> > Jason.
> >
> >
> > --------------->8---------------------------------
> > diff --git a/Documentation/devicetree/bindings/chosen.txt b/Documentation/devicetree/bindings/chosen.txt
> > index 6ae9d82d4c37..61f15f04bc0a 100644
> > --- a/Documentation/devicetree/bindings/chosen.txt
> > +++ b/Documentation/devicetree/bindings/chosen.txt
> > @@ -45,6 +45,52 @@ on PowerPC "stdout" if "stdout-path" is not found.  However, the
> >  "linux,stdout-path" and "stdout" properties are deprecated. New platforms
> >  should only use the "stdout-path" property.
> >
> > +random-seed properties
> > +----------------------
> > +
> > +The goal of these properties is to provide an entropy seed early in the boot
> > +process.  Typically, this is needed by the kernel decompressor for
> > +initializing KASLR.  At that point, the kernel entropy pools haven't been
> > +initialized yet, and any hardware rng drivers, if they exist, haven't been
> > +loaded yet.
> > +
> > +The bootloader can obtain these seeds and pass them to the kernel via the
> > +respective properties.  The bootloader is not expected to mix or condition
> > +this data in any way; it simply reads and passes it.  Either one or both
> > +properties can be set if the data is available.
> > +
> > +random-seed,hwrng property
> > +--------------------------
> > +
> > +For bootloaders with support for reading from the system's hardware random
> > +number generator.  The bootloader can read a chunk of data from the hw-rng
> > +and set it as the value for this binary blob property.
> 
> As in the boot loader would change the value per-boot?

Yes-ish.  It's an opaque binary blob to the bootloader, but it does update
the devicetree at each boot.

This differs from the userspace approach because bootloaders supporting
devicetree (passing and updating) pre-date bootloader drivers for rngs.
So, if a bootloader can read from the hw-rng, then it almost certainly
can update the devicetree, which is the preferred method for passing
data in the devicetree world.

> Does this proposal include replacing /chosen/kaslr-seed with
> random-seed,hwrng? (Should the "chosen" path be used for hwrng too?)

Well, that's up to Ard.  ;-)  I'm simply trying to put my thoughts into
concrete terms for consideration.  If Ard and others are amenable to it,
then yes.  A kernel supporting reading seeds from these proposed DT
properties would not need kaslr-seed.

My objection to /chosen/kaslr-seed is that it doesn't follow the mantra
of DT: state what the object is, not how you think the OS should use it.
Second, by only feeding in enough entropy for seeding kaslr, we are
missing the opportunity to provide sufficient entropy for seeding the
system entropy pools before init is called.

If we need to add this code to support kaslr seeding (we do), then we
may as well solve initializing the system entropy pools while we are
here.  The increased maintenance burden should be negligible.

> > +
> > +/ {
> > +       chosen {
> > +               random-seed,hwrng = <0x1f 0x07 0x4d 0x91 ...>;
> > +       };
> > +};
> > +
> > +random-seed,userspace property
> > +------------------------------
> > +
> > +The goal of this property is to also provide backwards compatibility with
> > +existing systems.  The bootloaders on these deployed systems typically lack
> > +the ability to edit a devicetree or read from an hwrng.  The only requirement
> > +for a bootloader is that it be able to read a seed file generated by the
> > +previous boot into RAM at a pre-determined physical address and size.  This is
> > +typically done via boot scripting.
> 
> What happens on a cold boot?

Nothing different.  As long as the OS wrote a new seed blob to the known
location in the filesystem (traditionally /var/lib/misc/random-seed),
then the bootloader should read it into RAM and then proceed with the
normal boot process.

I resisted calling out /var/lib/misc/random-seed specifically because a
lot of older bootloaders only have support for reading from FAT32 or,
worst case, flash.

In the FAT32-only scenario, most OSes will put the kernel and initrd on
a separate FAT32 partition (usually /boot) for the bootloader to read
from.  So, the bootloader can read /boot/random-seed into the RAM
address before executing the kernel.  /var/lib/misc/random-seed can be a
symlink to /boot/random-seed, or the init scripts can be modified to
write to both locations.
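
For illustration only, assuming U-Boot and the example <addr size> of
<0x40000 0x200> from the binding below, the boot script change is tiny
(partition, file names, and addresses are made up):

# Hypothetical U-Boot boot script fragment.
fatload mmc 0:1 0x40000 /random-seed
fatload mmc 0:1 ${kernel_addr_r} /zImage
bootz ${kernel_addr_r} - ${fdt_addr_r}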

> > +
> > +This property can then be set in the devicetree statically and parsed by a
> > +modern kernel without requiring a bootloader update.
> > +
> > +/ {
> > +       chosen {
> > +               random-seed,userspace = <0x40000 0x200>;
> > +       };
> > +};
> > +
> >  linux,booted-from-kexec
> >  -----------------------
> >

thx,

Jason.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: devicetree random-seed properties, was: "Re: [PATCH v7 0/9] x86/mm: memory area address KASLR"
  2016-06-24 20:40                       ` [kernel-hardening] " Andy Lutomirski
@ 2016-06-30 21:48                         ` Jason Cooper
  -1 siblings, 0 replies; 74+ messages in thread
From: Jason Cooper @ 2016-06-30 21:48 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Kees Cook, Ard Biesheuvel, Thomas Garnier, kernel-hardening,
	Ingo Molnar, Andy Lutomirski, x86, Borislav Petkov, Baoquan He,
	Yinghai Lu, Juergen Gross, Matt Fleming, Toshi Kani,
	Andrew Morton, Dan Williams, Kirill A. Shutemov, Dave Hansen,
	Xiao Guangrong, Martin Schwidefsky, Aneesh Kumar K.V,
	Alexander Kuleshov, Alexander Popov, Dave Young, Joerg Roedel,
	Lv Zheng, Mark Salter, Dmitry Vyukov, Stephen Smalley,
	Boris Ostrovsky, Christian Borntraeger, Jan Beulich, LKML,
	Jonathan Corbet, linux-doc

On Fri, Jun 24, 2016 at 01:40:41PM -0700, Andy Lutomirski wrote:
> On Fri, Jun 24, 2016 at 12:04 PM, Kees Cook <keescook@chromium.org> wrote:
> > On Fri, Jun 24, 2016 at 9:02 AM, Jason Cooper <jason@lakedaemon.net> wrote:
> >> Thomas,
> >>
> >> Sorry for wandering off the topic of your series.  The big take away for
> >> me is that you and Kees are concerned about x86 systems pre-RDRAND.
> >> Just as I'm concerned about deployed embedded systems without bootloader
> >> support for hw-rngs and so forth.
> >>
> >> Whatever final form the approach takes for ARM/dt, I'll make sure we can
> >> extend it to legacy x86 systems.
> >
> > Yeah, this seems like a productive conversation to me. :)
> 
> I have an old patch and spec I need to dust off that does this during
> *very* early boot on x86 using MSRs so that kASLR can use it.

I'd love to see that. ;-)

thx,

Jason.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [kernel-hardening] Re: devicetree random-seed properties, was: "Re: [PATCH v7 0/9] x86/mm: memory area address KASLR"
  2016-06-30 21:48                         ` [kernel-hardening] " Jason Cooper
  (?)
@ 2016-06-30 21:56                         ` Thomas Garnier
  -1 siblings, 0 replies; 74+ messages in thread
From: Thomas Garnier @ 2016-06-30 21:56 UTC (permalink / raw)
  To: Jason Cooper, Andy Lutomirski
  Cc: Kees Cook, Ard Biesheuvel, kernel-hardening, Ingo Molnar,
	Andy Lutomirski, x86, Borislav Petkov, Baoquan He, Yinghai Lu,
	Juergen Gross, Matt Fleming, Toshi Kani, Andrew Morton,
	Dan Williams, Kirill A. Shutemov, Dave Hansen, Xiao Guangrong,
	Martin Schwidefsky, Aneesh Kumar K.V, Alexander Kuleshov,
	Alexander Popov, Dave Young, Joerg Roedel, Lv Zheng, Mark Salter,
	Dmitry Vyukov, Stephen Smalley, Boris Ostrovsky,
	Christian Borntraeger, Jan Beulich, LKML, Jonathan Corbet,
	linux-doc

So would I!

On Thu, Jun 30, 2016, 2:49 PM Jason Cooper <jason@lakedaemon.net> wrote:

> On Fri, Jun 24, 2016 at 01:40:41PM -0700, Andy Lutomirski wrote:
> > On Fri, Jun 24, 2016 at 12:04 PM, Kees Cook <keescook@chromium.org> wrote:
> > > On Fri, Jun 24, 2016 at 9:02 AM, Jason Cooper <jason@lakedaemon.net> wrote:
> > >> Thomas,
> > >>
> > >> Sorry for wandering off the topic of your series.  The big take away for
> > >> me is that you and Kees are concerned about x86 systems pre-RDRAND.
> > >> Just as I'm concerned about deployed embedded systems without bootloader
> > >> support for hw-rngs and so forth.
> > >>
> > >> Whatever final form the approach takes for ARM/dt, I'll make sure we can
> > >> extend it to legacy x86 systems.
> > >
> > > Yeah, this seems like a productive conversation to me. :)
> >
> > I have an old patch and spec I need to dust off that does this during
> > *very* early boot on x86 using MSRs so that kASLR can use it.
>
> I'd love to see that. ;-)
>
> thx,
>
> Jason.
>
-- 

Thomas

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v7 0/9] x86/mm: memory area address KASLR
  2016-06-22  0:46 ` [kernel-hardening] " Kees Cook
@ 2016-07-07 22:24   ` Kees Cook
  -1 siblings, 0 replies; 74+ messages in thread
From: Kees Cook @ 2016-07-07 22:24 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Garnier, Andy Lutomirski, x86, Borislav Petkov,
	Baoquan He, Yinghai Lu, Juergen Gross, Matt Fleming, Toshi Kani,
	Andrew Morton, Dan Williams, Kirill A. Shutemov, Dave Hansen,
	Xiao Guangrong, Martin Schwidefsky, Aneesh Kumar K.V,
	Alexander Kuleshov, Alexander Popov, Dave Young, Joerg Roedel,
	Lv Zheng, Mark Salter, Dmitry Vyukov, Stephen Smalley,
	Boris Ostrovsky, Christian Borntraeger, Jan Beulich, LKML,
	Jonathan Corbet, linux-doc, kernel-hardening

On Tue, Jun 21, 2016 at 8:46 PM, Kees Cook <keescook@chromium.org> wrote:
> This is v7 of Thomas Garnier's KASLR for memory areas (physical memory
> mapping, vmalloc, vmemmap). It expects to be applied on top of the
> x86/boot tip.
>
> The current implementation of KASLR randomizes only the base address of
> the kernel and its modules. Research was published showing that static
> memory addresses can be found and used in exploits, effectively ignoring
> base address KASLR:
>
>    The physical memory mapping holds most allocations from boot and
> >    heap allocators. Knowing the base address and physical memory
>    size, an attacker can deduce the PDE virtual address for the vDSO
>    memory page.  This attack was demonstrated at CanSecWest 2016, in
>    the "Getting Physical: Extreme Abuse of Intel Based Paged Systems"
>    https://goo.gl/ANpWdV (see second part of the presentation). The
> >    exploits used against Linux worked successfully against 4.6+ but fail
>    with KASLR memory enabled (https://goo.gl/iTtXMJ). Similar research
> >    was done at Google, leading to this patch proposal. Variants exist
> >    to overwrite /proc or /sys object ACLs, leading to elevation of
>    privileges.  These variants were tested against 4.6+.
>
> This set of patches randomizes the base address and padding of three
> major memory sections (physical memory mapping, vmalloc, and vmemmap).
> It mitigates exploits relying on predictable kernel addresses in these
> areas. This feature can be enabled with the CONFIG_RANDOMIZE_MEMORY
> option. (This CONFIG, along with CONFIG_RANDOMIZE may be renamed in
> the future, but stands for now as other architectures continue to
> implement KASLR.)
>
> Padding for the memory hotplug support is managed by
> CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING. The default value is 10
> terabytes.
>
> The patches were tested on qemu & physical machines. Xen compatibility was
> also verified. Multiple reboots were used to verify entropy for each
> memory section.
>
> Notable problems that needed solving:
>  - The three target memory sections need to not be at the same place
>    across reboots.
>  - The physical memory mapping can use a virtual address not aligned on
>    the PGD page table.
>  - Reasonable entropy is needed early at boot before get_random_bytes()
>    is available.
>  - Memory hotplug needs KASLR padding.
>
> Patches:
>  - 1: refactor KASLR functions (moves them from boot/compressed/ into lib/)
>  - 2: clarifies the variables used for physical mapping.
>  - 3: PUD virtual address support for physical mapping.
>  - 4: split out the trampoline PGD
>  - 5: KASLR memory infrastructure code
>  - 6: randomize base of physical mapping region
>  - 7: randomize base of vmalloc region
>  - 8: randomize base of vmemmap region
>  - 9: provide memory hotplug padding support
>
> There is no measurable performance impact:
>
>  - Kernbench shows almost no difference (-+ less than 1%).
>  - Hackbench shows 0% difference on average (hackbench 90 repeated 10 times).

Hi again,

Just a friendly ping -- I'd love to get this into -tip for wider testing.

Thanks!

-Kees


-- 
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [tip:x86/boot] x86/mm: Refactor KASLR entropy functions
  2016-06-22  0:46   ` [kernel-hardening] " Kees Cook
  (?)
@ 2016-07-08 20:33   ` tip-bot for Thomas Garnier
  -1 siblings, 0 replies; 74+ messages in thread
From: tip-bot for Thomas Garnier @ 2016-07-08 20:33 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: guangrong.xiao, borntraeger, yinghai, dave.hansen, bp, dyoung,
	dan.j.williams, dvlasenk, matt, jroedel, linux-kernel,
	kirill.shutemov, luto, aneesh.kumar, jgross, lv.zheng, torvalds,
	JBeulich, akpm, jpoimboe, keescook, brgerst, corbet, alpopov,
	boris.ostrovsky, bhe, hpa, bp, dvyukov, peterz, tglx, msalter,
	toshi.kani, mingo, sds, thgarnie, kuleshovmail, schwidefsky

Commit-ID:  d899a7d146a2ed8a7e6c2f61bcd232908bcbaabc
Gitweb:     http://git.kernel.org/tip/d899a7d146a2ed8a7e6c2f61bcd232908bcbaabc
Author:     Thomas Garnier <thgarnie@google.com>
AuthorDate: Tue, 21 Jun 2016 17:46:58 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 8 Jul 2016 17:33:45 +0200

x86/mm: Refactor KASLR entropy functions

Move the KASLR entropy functions into arch/x86/lib to be used in early
kernel boot for KASLR memory randomization.

Signed-off-by: Thomas Garnier <thgarnie@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
Cc: Alexander Popov <alpopov@ptsecurity.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Lv Zheng <lv.zheng@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-doc@vger.kernel.org
Link: http://lkml.kernel.org/r/1466556426-32664-2-git-send-email-keescook@chromium.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/boot/compressed/kaslr.c | 76 +++------------------------------
 arch/x86/include/asm/kaslr.h     |  6 +++
 arch/x86/lib/Makefile            |  1 +
 arch/x86/lib/kaslr.c             | 90 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 102 insertions(+), 71 deletions(-)

diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index 010ea16..a66854d 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -12,10 +12,6 @@
 #include "misc.h"
 #include "error.h"
 
-#include <asm/msr.h>
-#include <asm/archrandom.h>
-#include <asm/e820.h>
-
 #include <generated/compile.h>
 #include <linux/module.h>
 #include <linux/uts.h>
@@ -26,26 +22,6 @@
 static const char build_str[] = UTS_RELEASE " (" LINUX_COMPILE_BY "@"
 		LINUX_COMPILE_HOST ") (" LINUX_COMPILER ") " UTS_VERSION;
 
-#define I8254_PORT_CONTROL	0x43
-#define I8254_PORT_COUNTER0	0x40
-#define I8254_CMD_READBACK	0xC0
-#define I8254_SELECT_COUNTER0	0x02
-#define I8254_STATUS_NOTREADY	0x40
-static inline u16 i8254(void)
-{
-	u16 status, timer;
-
-	do {
-		outb(I8254_PORT_CONTROL,
-		     I8254_CMD_READBACK | I8254_SELECT_COUNTER0);
-		status = inb(I8254_PORT_COUNTER0);
-		timer  = inb(I8254_PORT_COUNTER0);
-		timer |= inb(I8254_PORT_COUNTER0) << 8;
-	} while (status & I8254_STATUS_NOTREADY);
-
-	return timer;
-}
-
 static unsigned long rotate_xor(unsigned long hash, const void *area,
 				size_t size)
 {
@@ -62,7 +38,7 @@ static unsigned long rotate_xor(unsigned long hash, const void *area,
 }
 
 /* Attempt to create a simple but unpredictable starting entropy. */
-static unsigned long get_random_boot(void)
+static unsigned long get_boot_seed(void)
 {
 	unsigned long hash = 0;
 
@@ -72,50 +48,8 @@ static unsigned long get_random_boot(void)
 	return hash;
 }
 
-static unsigned long get_random_long(const char *purpose)
-{
-#ifdef CONFIG_X86_64
-	const unsigned long mix_const = 0x5d6008cbf3848dd3UL;
-#else
-	const unsigned long mix_const = 0x3f39e593UL;
-#endif
-	unsigned long raw, random = get_random_boot();
-	bool use_i8254 = true;
-
-	debug_putstr(purpose);
-	debug_putstr(" KASLR using");
-
-	if (has_cpuflag(X86_FEATURE_RDRAND)) {
-		debug_putstr(" RDRAND");
-		if (rdrand_long(&raw)) {
-			random ^= raw;
-			use_i8254 = false;
-		}
-	}
-
-	if (has_cpuflag(X86_FEATURE_TSC)) {
-		debug_putstr(" RDTSC");
-		raw = rdtsc();
-
-		random ^= raw;
-		use_i8254 = false;
-	}
-
-	if (use_i8254) {
-		debug_putstr(" i8254");
-		random ^= i8254();
-	}
-
-	/* Circular multiply for better bit diffusion */
-	asm("mul %3"
-	    : "=a" (random), "=d" (raw)
-	    : "a" (random), "rm" (mix_const));
-	random += raw;
-
-	debug_putstr("...\n");
-
-	return random;
-}
+#define KASLR_COMPRESSED_BOOT
+#include "../../lib/kaslr.c"
 
 struct mem_vector {
 	unsigned long start;
@@ -349,7 +283,7 @@ static unsigned long slots_fetch_random(void)
 	if (slot_max == 0)
 		return 0;
 
-	slot = get_random_long("Physical") % slot_max;
+	slot = kaslr_get_random_long("Physical") % slot_max;
 
 	for (i = 0; i < slot_area_index; i++) {
 		if (slot >= slot_areas[i].num) {
@@ -479,7 +413,7 @@ static unsigned long find_random_virt_addr(unsigned long minimum,
 	slots = (KERNEL_IMAGE_SIZE - minimum - image_size) /
 		 CONFIG_PHYSICAL_ALIGN + 1;
 
-	random_addr = get_random_long("Virtual") % slots;
+	random_addr = kaslr_get_random_long("Virtual") % slots;
 
 	return random_addr * CONFIG_PHYSICAL_ALIGN + minimum;
 }
diff --git a/arch/x86/include/asm/kaslr.h b/arch/x86/include/asm/kaslr.h
new file mode 100644
index 0000000..5547438
--- /dev/null
+++ b/arch/x86/include/asm/kaslr.h
@@ -0,0 +1,6 @@
+#ifndef _ASM_KASLR_H_
+#define _ASM_KASLR_H_
+
+unsigned long kaslr_get_random_long(const char *purpose);
+
+#endif
diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile
index 72a5767..cfa6d07 100644
--- a/arch/x86/lib/Makefile
+++ b/arch/x86/lib/Makefile
@@ -24,6 +24,7 @@ lib-y += usercopy_$(BITS).o usercopy.o getuser.o putuser.o
 lib-y += memcpy_$(BITS).o
 lib-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) += rwsem.o
 lib-$(CONFIG_INSTRUCTION_DECODER) += insn.o inat.o
+lib-$(CONFIG_RANDOMIZE_BASE) += kaslr.o
 
 obj-y += msr.o msr-reg.o msr-reg-export.o
 
diff --git a/arch/x86/lib/kaslr.c b/arch/x86/lib/kaslr.c
new file mode 100644
index 0000000..f7dfeda
--- /dev/null
+++ b/arch/x86/lib/kaslr.c
@@ -0,0 +1,90 @@
+/*
+ * Entropy functions used on early boot for KASLR base and memory
+ * randomization. The base randomization is done in the compressed
+ * kernel and memory randomization is done early when the regular
+ * kernel starts. This file is included in the compressed kernel and
+ * normally linked in the regular.
+ */
+#include <asm/kaslr.h>
+#include <asm/msr.h>
+#include <asm/archrandom.h>
+#include <asm/e820.h>
+#include <asm/io.h>
+
+/*
+ * When built for the regular kernel, several functions need to be stubbed out
+ * or changed to their regular kernel equivalent.
+ */
+#ifndef KASLR_COMPRESSED_BOOT
+#include <asm/cpufeature.h>
+#include <asm/setup.h>
+
+#define debug_putstr(v) early_printk(v)
+#define has_cpuflag(f) boot_cpu_has(f)
+#define get_boot_seed() kaslr_offset()
+#endif
+
+#define I8254_PORT_CONTROL	0x43
+#define I8254_PORT_COUNTER0	0x40
+#define I8254_CMD_READBACK	0xC0
+#define I8254_SELECT_COUNTER0	0x02
+#define I8254_STATUS_NOTREADY	0x40
+static inline u16 i8254(void)
+{
+	u16 status, timer;
+
+	do {
+		outb(I8254_PORT_CONTROL,
+		     I8254_CMD_READBACK | I8254_SELECT_COUNTER0);
+		status = inb(I8254_PORT_COUNTER0);
+		timer  = inb(I8254_PORT_COUNTER0);
+		timer |= inb(I8254_PORT_COUNTER0) << 8;
+	} while (status & I8254_STATUS_NOTREADY);
+
+	return timer;
+}
+
+unsigned long kaslr_get_random_long(const char *purpose)
+{
+#ifdef CONFIG_X86_64
+	const unsigned long mix_const = 0x5d6008cbf3848dd3UL;
+#else
+	const unsigned long mix_const = 0x3f39e593UL;
+#endif
+	unsigned long raw, random = get_boot_seed();
+	bool use_i8254 = true;
+
+	debug_putstr(purpose);
+	debug_putstr(" KASLR using");
+
+	if (has_cpuflag(X86_FEATURE_RDRAND)) {
+		debug_putstr(" RDRAND");
+		if (rdrand_long(&raw)) {
+			random ^= raw;
+			use_i8254 = false;
+		}
+	}
+
+	if (has_cpuflag(X86_FEATURE_TSC)) {
+		debug_putstr(" RDTSC");
+		raw = rdtsc();
+
+		random ^= raw;
+		use_i8254 = false;
+	}
+
+	if (use_i8254) {
+		debug_putstr(" i8254");
+		random ^= i8254();
+	}
+
+	/* Circular multiply for better bit diffusion */
+	asm("mul %3"
+	    : "=a" (random), "=d" (raw)
+	    : "a" (random), "rm" (mix_const));
+	random += raw;
+
+	debug_putstr("...\n");
+
+	return random;
+}

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [tip:x86/boot] x86/mm: Update physical mapping variable names
  2016-06-22  0:46   ` [kernel-hardening] " Kees Cook
  (?)
@ 2016-07-08 20:34   ` tip-bot for Thomas Garnier
  -1 siblings, 0 replies; 74+ messages in thread
From: tip-bot for Thomas Garnier @ 2016-07-08 20:34 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: guangrong.xiao, toshi.kani, dan.j.williams, akpm, jpoimboe,
	corbet, brgerst, torvalds, alpopov, dyoung, bp, luto, thgarnie,
	tglx, peterz, kuleshovmail, jroedel, kirill.shutemov,
	boris.ostrovsky, msalter, bp, matt, JBeulich, schwidefsky, sds,
	hpa, linux-kernel, borntraeger, bhe, mingo, keescook, dvyukov,
	dvlasenk, aneesh.kumar, yinghai, lv.zheng, dave.hansen, jgross

Commit-ID:  59b3d0206d74a700069e49160e8194b2ca93b703
Gitweb:     http://git.kernel.org/tip/59b3d0206d74a700069e49160e8194b2ca93b703
Author:     Thomas Garnier <thgarnie@google.com>
AuthorDate: Tue, 21 Jun 2016 17:46:59 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 8 Jul 2016 17:33:46 +0200

x86/mm: Update physical mapping variable names

Change the variable names in kernel_physical_mapping_init() and related
functions to correctly reflect physical and virtual memory addresses.
Also add comments on each function to describe usage and alignment
constraints.

Signed-off-by: Thomas Garnier <thgarnie@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
Cc: Alexander Popov <alpopov@ptsecurity.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Lv Zheng <lv.zheng@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-doc@vger.kernel.org
Link: http://lkml.kernel.org/r/1466556426-32664-3-git-send-email-keescook@chromium.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/mm/init_64.c | 162 ++++++++++++++++++++++++++++++--------------------
 1 file changed, 96 insertions(+), 66 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index bce2e5d..6714712 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -328,22 +328,30 @@ void __init cleanup_highmap(void)
 	}
 }
 
+/*
+ * Create PTE level page table mapping for physical addresses.
+ * It returns the last physical address mapped.
+ */
 static unsigned long __meminit
-phys_pte_init(pte_t *pte_page, unsigned long addr, unsigned long end,
+phys_pte_init(pte_t *pte_page, unsigned long paddr, unsigned long paddr_end,
 	      pgprot_t prot)
 {
-	unsigned long pages = 0, next;
-	unsigned long last_map_addr = end;
+	unsigned long pages = 0, paddr_next;
+	unsigned long paddr_last = paddr_end;
+	pte_t *pte;
 	int i;
 
-	pte_t *pte = pte_page + pte_index(addr);
+	pte = pte_page + pte_index(paddr);
+	i = pte_index(paddr);
 
-	for (i = pte_index(addr); i < PTRS_PER_PTE; i++, addr = next, pte++) {
-		next = (addr & PAGE_MASK) + PAGE_SIZE;
-		if (addr >= end) {
+	for (; i < PTRS_PER_PTE; i++, paddr = paddr_next, pte++) {
+		paddr_next = (paddr & PAGE_MASK) + PAGE_SIZE;
+		if (paddr >= paddr_end) {
 			if (!after_bootmem &&
-			    !e820_any_mapped(addr & PAGE_MASK, next, E820_RAM) &&
-			    !e820_any_mapped(addr & PAGE_MASK, next, E820_RESERVED_KERN))
+			    !e820_any_mapped(paddr & PAGE_MASK, paddr_next,
+					     E820_RAM) &&
+			    !e820_any_mapped(paddr & PAGE_MASK, paddr_next,
+					     E820_RESERVED_KERN))
 				set_pte(pte, __pte(0));
 			continue;
 		}
@@ -361,37 +369,44 @@ phys_pte_init(pte_t *pte_page, unsigned long addr, unsigned long end,
 		}
 
 		if (0)
-			printk("   pte=%p addr=%lx pte=%016lx\n",
-			       pte, addr, pfn_pte(addr >> PAGE_SHIFT, PAGE_KERNEL).pte);
+			pr_info("   pte=%p addr=%lx pte=%016lx\n", pte, paddr,
+				pfn_pte(paddr >> PAGE_SHIFT, PAGE_KERNEL).pte);
 		pages++;
-		set_pte(pte, pfn_pte(addr >> PAGE_SHIFT, prot));
-		last_map_addr = (addr & PAGE_MASK) + PAGE_SIZE;
+		set_pte(pte, pfn_pte(paddr >> PAGE_SHIFT, prot));
+		paddr_last = (paddr & PAGE_MASK) + PAGE_SIZE;
 	}
 
 	update_page_count(PG_LEVEL_4K, pages);
 
-	return last_map_addr;
+	return paddr_last;
 }
 
+/*
+ * Create PMD level page table mapping for physical addresses. The virtual
+ * and physical address have to be aligned at this level.
+ * It returns the last physical address mapped.
+ */
 static unsigned long __meminit
-phys_pmd_init(pmd_t *pmd_page, unsigned long address, unsigned long end,
+phys_pmd_init(pmd_t *pmd_page, unsigned long paddr, unsigned long paddr_end,
 	      unsigned long page_size_mask, pgprot_t prot)
 {
-	unsigned long pages = 0, next;
-	unsigned long last_map_addr = end;
+	unsigned long pages = 0, paddr_next;
+	unsigned long paddr_last = paddr_end;
 
-	int i = pmd_index(address);
+	int i = pmd_index(paddr);
 
-	for (; i < PTRS_PER_PMD; i++, address = next) {
-		pmd_t *pmd = pmd_page + pmd_index(address);
+	for (; i < PTRS_PER_PMD; i++, paddr = paddr_next) {
+		pmd_t *pmd = pmd_page + pmd_index(paddr);
 		pte_t *pte;
 		pgprot_t new_prot = prot;
 
-		next = (address & PMD_MASK) + PMD_SIZE;
-		if (address >= end) {
+		paddr_next = (paddr & PMD_MASK) + PMD_SIZE;
+		if (paddr >= paddr_end) {
 			if (!after_bootmem &&
-			    !e820_any_mapped(address & PMD_MASK, next, E820_RAM) &&
-			    !e820_any_mapped(address & PMD_MASK, next, E820_RESERVED_KERN))
+			    !e820_any_mapped(paddr & PMD_MASK, paddr_next,
+					     E820_RAM) &&
+			    !e820_any_mapped(paddr & PMD_MASK, paddr_next,
+					     E820_RESERVED_KERN))
 				set_pmd(pmd, __pmd(0));
 			continue;
 		}
@@ -400,8 +415,8 @@ phys_pmd_init(pmd_t *pmd_page, unsigned long address, unsigned long end,
 			if (!pmd_large(*pmd)) {
 				spin_lock(&init_mm.page_table_lock);
 				pte = (pte_t *)pmd_page_vaddr(*pmd);
-				last_map_addr = phys_pte_init(pte, address,
-								end, prot);
+				paddr_last = phys_pte_init(pte, paddr,
+							   paddr_end, prot);
 				spin_unlock(&init_mm.page_table_lock);
 				continue;
 			}
@@ -420,7 +435,7 @@ phys_pmd_init(pmd_t *pmd_page, unsigned long address, unsigned long end,
 			if (page_size_mask & (1 << PG_LEVEL_2M)) {
 				if (!after_bootmem)
 					pages++;
-				last_map_addr = next;
+				paddr_last = paddr_next;
 				continue;
 			}
 			new_prot = pte_pgprot(pte_clrhuge(*(pte_t *)pmd));
@@ -430,42 +445,49 @@ phys_pmd_init(pmd_t *pmd_page, unsigned long address, unsigned long end,
 			pages++;
 			spin_lock(&init_mm.page_table_lock);
 			set_pte((pte_t *)pmd,
-				pfn_pte((address & PMD_MASK) >> PAGE_SHIFT,
+				pfn_pte((paddr & PMD_MASK) >> PAGE_SHIFT,
 					__pgprot(pgprot_val(prot) | _PAGE_PSE)));
 			spin_unlock(&init_mm.page_table_lock);
-			last_map_addr = next;
+			paddr_last = paddr_next;
 			continue;
 		}
 
 		pte = alloc_low_page();
-		last_map_addr = phys_pte_init(pte, address, end, new_prot);
+		paddr_last = phys_pte_init(pte, paddr, paddr_end, new_prot);
 
 		spin_lock(&init_mm.page_table_lock);
 		pmd_populate_kernel(&init_mm, pmd, pte);
 		spin_unlock(&init_mm.page_table_lock);
 	}
 	update_page_count(PG_LEVEL_2M, pages);
-	return last_map_addr;
+	return paddr_last;
 }
 
+/*
+ * Create PUD level page table mapping for physical addresses. The virtual
+ * and physical address have to be aligned at this level.
+ * It returns the last physical address mapped.
+ */
 static unsigned long __meminit
-phys_pud_init(pud_t *pud_page, unsigned long addr, unsigned long end,
-			 unsigned long page_size_mask)
+phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,
+	      unsigned long page_size_mask)
 {
-	unsigned long pages = 0, next;
-	unsigned long last_map_addr = end;
-	int i = pud_index(addr);
+	unsigned long pages = 0, paddr_next;
+	unsigned long paddr_last = paddr_end;
+	int i = pud_index(paddr);
 
-	for (; i < PTRS_PER_PUD; i++, addr = next) {
-		pud_t *pud = pud_page + pud_index(addr);
+	for (; i < PTRS_PER_PUD; i++, paddr = paddr_next) {
+		pud_t *pud = pud_page + pud_index(paddr);
 		pmd_t *pmd;
 		pgprot_t prot = PAGE_KERNEL;
 
-		next = (addr & PUD_MASK) + PUD_SIZE;
-		if (addr >= end) {
+		paddr_next = (paddr & PUD_MASK) + PUD_SIZE;
+		if (paddr >= paddr_end) {
 			if (!after_bootmem &&
-			    !e820_any_mapped(addr & PUD_MASK, next, E820_RAM) &&
-			    !e820_any_mapped(addr & PUD_MASK, next, E820_RESERVED_KERN))
+			    !e820_any_mapped(paddr & PUD_MASK, paddr_next,
+					     E820_RAM) &&
+			    !e820_any_mapped(paddr & PUD_MASK, paddr_next,
+					     E820_RESERVED_KERN))
 				set_pud(pud, __pud(0));
 			continue;
 		}
@@ -473,8 +495,10 @@ phys_pud_init(pud_t *pud_page, unsigned long addr, unsigned long end,
 		if (pud_val(*pud)) {
 			if (!pud_large(*pud)) {
 				pmd = pmd_offset(pud, 0);
-				last_map_addr = phys_pmd_init(pmd, addr, end,
-							 page_size_mask, prot);
+				paddr_last = phys_pmd_init(pmd, paddr,
+							   paddr_end,
+							   page_size_mask,
+							   prot);
 				__flush_tlb_all();
 				continue;
 			}
@@ -493,7 +517,7 @@ phys_pud_init(pud_t *pud_page, unsigned long addr, unsigned long end,
 			if (page_size_mask & (1 << PG_LEVEL_1G)) {
 				if (!after_bootmem)
 					pages++;
-				last_map_addr = next;
+				paddr_last = paddr_next;
 				continue;
 			}
 			prot = pte_pgprot(pte_clrhuge(*(pte_t *)pud));
@@ -503,16 +527,16 @@ phys_pud_init(pud_t *pud_page, unsigned long addr, unsigned long end,
 			pages++;
 			spin_lock(&init_mm.page_table_lock);
 			set_pte((pte_t *)pud,
-				pfn_pte((addr & PUD_MASK) >> PAGE_SHIFT,
+				pfn_pte((paddr & PUD_MASK) >> PAGE_SHIFT,
 					PAGE_KERNEL_LARGE));
 			spin_unlock(&init_mm.page_table_lock);
-			last_map_addr = next;
+			paddr_last = paddr_next;
 			continue;
 		}
 
 		pmd = alloc_low_page();
-		last_map_addr = phys_pmd_init(pmd, addr, end, page_size_mask,
-					      prot);
+		paddr_last = phys_pmd_init(pmd, paddr, paddr_end,
+					   page_size_mask, prot);
 
 		spin_lock(&init_mm.page_table_lock);
 		pud_populate(&init_mm, pud, pmd);
@@ -522,38 +546,44 @@ phys_pud_init(pud_t *pud_page, unsigned long addr, unsigned long end,
 
 	update_page_count(PG_LEVEL_1G, pages);
 
-	return last_map_addr;
+	return paddr_last;
 }
 
+/*
+ * Create page table mapping for the physical memory for specific physical
+ * addresses. The virtual and physical addresses have to be aligned on PUD level
+ * down. It returns the last physical address mapped.
+ */
 unsigned long __meminit
-kernel_physical_mapping_init(unsigned long start,
-			     unsigned long end,
+kernel_physical_mapping_init(unsigned long paddr_start,
+			     unsigned long paddr_end,
 			     unsigned long page_size_mask)
 {
 	bool pgd_changed = false;
-	unsigned long next, last_map_addr = end;
-	unsigned long addr;
+	unsigned long vaddr, vaddr_start, vaddr_end, vaddr_next, paddr_last;
 
-	start = (unsigned long)__va(start);
-	end = (unsigned long)__va(end);
-	addr = start;
+	paddr_last = paddr_end;
+	vaddr = (unsigned long)__va(paddr_start);
+	vaddr_end = (unsigned long)__va(paddr_end);
+	vaddr_start = vaddr;
 
-	for (; start < end; start = next) {
-		pgd_t *pgd = pgd_offset_k(start);
+	for (; vaddr < vaddr_end; vaddr = vaddr_next) {
+		pgd_t *pgd = pgd_offset_k(vaddr);
 		pud_t *pud;
 
-		next = (start & PGDIR_MASK) + PGDIR_SIZE;
+		vaddr_next = (vaddr & PGDIR_MASK) + PGDIR_SIZE;
 
 		if (pgd_val(*pgd)) {
 			pud = (pud_t *)pgd_page_vaddr(*pgd);
-			last_map_addr = phys_pud_init(pud, __pa(start),
-						 __pa(end), page_size_mask);
+			paddr_last = phys_pud_init(pud, __pa(vaddr),
+						   __pa(vaddr_end),
+						   page_size_mask);
 			continue;
 		}
 
 		pud = alloc_low_page();
-		last_map_addr = phys_pud_init(pud, __pa(start), __pa(end),
-						 page_size_mask);
+		paddr_last = phys_pud_init(pud, __pa(vaddr), __pa(vaddr_end),
+					   page_size_mask);
 
 		spin_lock(&init_mm.page_table_lock);
 		pgd_populate(&init_mm, pgd, pud);
@@ -562,11 +592,11 @@ kernel_physical_mapping_init(unsigned long start,
 	}
 
 	if (pgd_changed)
-		sync_global_pgds(addr, end - 1, 0);
+		sync_global_pgds(vaddr_start, vaddr_end - 1, 0);
 
 	__flush_tlb_all();
 
-	return last_map_addr;
+	return paddr_last;
 }
 
 #ifndef CONFIG_NUMA

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [tip:x86/boot] x86/mm: Add PUD VA support for physical mapping
  2016-06-22  0:47   ` [kernel-hardening] " Kees Cook
  (?)
@ 2016-07-08 20:34   ` tip-bot for Thomas Garnier
  -1 siblings, 0 replies; 74+ messages in thread
From: tip-bot for Thomas Garnier @ 2016-07-08 20:34 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: boris.ostrovsky, bp, alpopov, yinghai, linux-kernel,
	dan.j.williams, peterz, bp, corbet, toshi.kani, dave.hansen,
	akpm, matt, mingo, keescook, tglx, dvyukov, jgross, dyoung, luto,
	kirill.shutemov, brgerst, torvalds, bhe, lv.zheng, hpa,
	kuleshovmail, msalter, aneesh.kumar, dvlasenk, thgarnie,
	schwidefsky, jpoimboe, guangrong.xiao, borntraeger, jroedel, sds,
	JBeulich

Commit-ID:  faa379332f3cb3375db1849e27386f8bc9b97da4
Gitweb:     http://git.kernel.org/tip/faa379332f3cb3375db1849e27386f8bc9b97da4
Author:     Thomas Garnier <thgarnie@google.com>
AuthorDate: Tue, 21 Jun 2016 17:47:00 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 8 Jul 2016 17:33:46 +0200

x86/mm: Add PUD VA support for physical mapping

Minor change that allows early boot physical mapping of PUD level virtual
addresses. The current implementation expects the virtual address to be
PUD aligned. For KASLR memory randomization, we need to be able to
randomize the offset used on the PUD table.

It has no impact on current usage.

Signed-off-by: Thomas Garnier <thgarnie@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
Cc: Alexander Popov <alpopov@ptsecurity.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Lv Zheng <lv.zheng@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-doc@vger.kernel.org
Link: http://lkml.kernel.org/r/1466556426-32664-4-git-send-email-keescook@chromium.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/mm/init_64.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 6714712..7bf1ddb 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -465,7 +465,8 @@ phys_pmd_init(pmd_t *pmd_page, unsigned long paddr, unsigned long paddr_end,
 
 /*
  * Create PUD level page table mapping for physical addresses. The virtual
- * and physical address have to be aligned at this level.
+ * and physical address do not have to be aligned at this level. KASLR can
+ * randomize virtual addresses up to this level.
  * It returns the last physical address mapped.
  */
 static unsigned long __meminit
@@ -474,14 +475,18 @@ phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,
 {
 	unsigned long pages = 0, paddr_next;
 	unsigned long paddr_last = paddr_end;
-	int i = pud_index(paddr);
+	unsigned long vaddr = (unsigned long)__va(paddr);
+	int i = pud_index(vaddr);
 
 	for (; i < PTRS_PER_PUD; i++, paddr = paddr_next) {
-		pud_t *pud = pud_page + pud_index(paddr);
+		pud_t *pud;
 		pmd_t *pmd;
 		pgprot_t prot = PAGE_KERNEL;
 
+		vaddr = (unsigned long)__va(paddr);
+		pud = pud_page + pud_index(vaddr);
 		paddr_next = (paddr & PUD_MASK) + PUD_SIZE;
+
 		if (paddr >= paddr_end) {
 			if (!after_bootmem &&
 			    !e820_any_mapped(paddr & PUD_MASK, paddr_next,
@@ -551,7 +556,7 @@ phys_pud_init(pud_t *pud_page, unsigned long paddr, unsigned long paddr_end,
 
 /*
  * Create page table mapping for the physical memory for specific physical
- * addresses. The virtual and physical addresses have to be aligned on PUD level
+ * addresses. The virtual and physical addresses have to be aligned on PMD level
  * down. It returns the last physical address mapped.
  */
 unsigned long __meminit

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [tip:x86/boot] x86/mm: Separate variable for trampoline PGD
  2016-06-22  0:47   ` [kernel-hardening] " Kees Cook
  (?)
@ 2016-07-08 20:35   ` tip-bot for Thomas Garnier
  -1 siblings, 0 replies; 74+ messages in thread
From: tip-bot for Thomas Garnier @ 2016-07-08 20:35 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: dvlasenk, schwidefsky, dan.j.williams, JBeulich, guangrong.xiao,
	keescook, lv.zheng, aneesh.kumar, dave.hansen, hpa, matt, luto,
	peterz, akpm, dvyukov, yinghai, dyoung, mingo, corbet,
	linux-kernel, bhe, bp, jgross, jroedel, alpopov, torvalds,
	thgarnie, tglx, kuleshovmail, toshi.kani, msalter, jpoimboe,
	brgerst, kirill.shutemov, bp, sds, boris.ostrovsky, borntraeger

Commit-ID:  b234e8a09003af108d3573f0369e25c080676b14
Gitweb:     http://git.kernel.org/tip/b234e8a09003af108d3573f0369e25c080676b14
Author:     Thomas Garnier <thgarnie@google.com>
AuthorDate: Tue, 21 Jun 2016 17:47:01 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 8 Jul 2016 17:33:46 +0200

x86/mm: Separate variable for trampoline PGD

Use a separate global variable to define the trampoline PGD used to
start other processors. This change will allow KASLR memory
randomization to change the trampoline PGD to be correctly aligned with
physical memory.

Signed-off-by: Thomas Garnier <thgarnie@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
Cc: Alexander Popov <alpopov@ptsecurity.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Lv Zheng <lv.zheng@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-doc@vger.kernel.org
Link: http://lkml.kernel.org/r/1466556426-32664-5-git-send-email-keescook@chromium.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/pgtable.h | 12 ++++++++++++
 arch/x86/mm/init.c             |  3 +++
 arch/x86/realmode/init.c       |  5 ++++-
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 1a27396..d455bef 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -729,6 +729,18 @@ extern int direct_gbpages;
 void init_mem_mapping(void);
 void early_alloc_pgt_buf(void);
 
+#ifdef CONFIG_X86_64
+/* Realmode trampoline initialization. */
+extern pgd_t trampoline_pgd_entry;
+static inline void __meminit init_trampoline(void)
+{
+	/* Default trampoline pgd value */
+	trampoline_pgd_entry = init_level4_pgt[pgd_index(__PAGE_OFFSET)];
+}
+#else
+static inline void init_trampoline(void) { }
+#endif
+
 /* local pte updates need not use xchg for locking */
 static inline pte_t native_local_ptep_get_and_clear(pte_t *ptep)
 {
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 372aad2..4252acd 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -590,6 +590,9 @@ void __init init_mem_mapping(void)
 	/* the ISA range is always mapped regardless of memory holes */
 	init_memory_mapping(0, ISA_END_ADDRESS);
 
+	/* Init the trampoline, possibly with KASLR memory offset */
+	init_trampoline();
+
 	/*
 	 * If the allocation is in bottom-up direction, we setup direct mapping
 	 * in bottom-up, otherwise we setup direct mapping in top-down.
diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
index 0b7a63d..705e3ff 100644
--- a/arch/x86/realmode/init.c
+++ b/arch/x86/realmode/init.c
@@ -8,6 +8,9 @@
 struct real_mode_header *real_mode_header;
 u32 *trampoline_cr4_features;
 
+/* Hold the pgd entry used on booting additional CPUs */
+pgd_t trampoline_pgd_entry;
+
 void __init reserve_real_mode(void)
 {
 	phys_addr_t mem;
@@ -84,7 +87,7 @@ void __init setup_real_mode(void)
 	*trampoline_cr4_features = __read_cr4();
 
 	trampoline_pgd = (u64 *) __va(real_mode_header->trampoline_pgd);
-	trampoline_pgd[0] = init_level4_pgt[pgd_index(__PAGE_OFFSET)].pgd;
+	trampoline_pgd[0] = trampoline_pgd_entry.pgd;
 	trampoline_pgd[511] = init_level4_pgt[511].pgd;
 #endif
 }

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [tip:x86/boot] x86/mm: Implement ASLR for kernel memory regions
  2016-06-22  0:47   ` [kernel-hardening] " Kees Cook
@ 2016-07-08 20:35   ` tip-bot for Thomas Garnier
  -1 siblings, 0 replies; 74+ messages in thread
From: tip-bot for Thomas Garnier @ 2016-07-08 20:35 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: tglx, dyoung, mingo, peterz, guangrong.xiao, toshi.kani, corbet,
	luto, kirill.shutemov, JBeulich, aneesh.kumar, dvlasenk, dvyukov,
	hpa, sds, borntraeger, alpopov, lv.zheng, thgarnie, bhe, msalter,
	kuleshovmail, dan.j.williams, dave.hansen, schwidefsky, keescook,
	bp, yinghai, brgerst, jgross, torvalds, matt, boris.ostrovsky,
	jpoimboe, akpm, bp, jroedel, linux-kernel

Commit-ID:  0483e1fa6e09d4948272680f691dccb1edb9677f
Gitweb:     http://git.kernel.org/tip/0483e1fa6e09d4948272680f691dccb1edb9677f
Author:     Thomas Garnier <thgarnie@google.com>
AuthorDate: Tue, 21 Jun 2016 17:47:02 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 8 Jul 2016 17:33:46 +0200

x86/mm: Implement ASLR for kernel memory regions

Randomizes the virtual address space of kernel memory regions for
x86_64. This first patch adds the infrastructure and does not randomize
any region. The following patches will randomize the physical memory
mapping, vmalloc and vmemmap regions.

This security feature mitigates exploits relying on predictable kernel
addresses. These addresses can be used to disclose the kernel module
base addresses or to corrupt specific structures to elevate privileges,
bypassing the current implementation of KASLR. This feature can be
enabled with the CONFIG_RANDOMIZE_MEMORY option.

The order of the memory regions is not changed. The feature looks at the
available space for the regions based on different configuration options
and randomizes the base of each region and the space between them. The
size of the physical memory mapping region is the available physical
memory. No performance impact was detected while testing the feature.

Entropy is generated using the KASLR early boot functions now shared in
the lib directory (originally written by Kees Cook). Randomization is
done at the PGD & PUD page table levels to increase the number of
possible addresses. The physical memory mapping code was adapted to
support PUD-level virtual addresses. In the best configuration, this
implementation provides on average 30,000 possible virtual addresses for
each memory region. An additional low memory page is used to ensure each
CPU can start with a PGD-aligned virtual address (for real mode).

x86/dump_pagetables was updated to correctly display each region.

The documentation on the x86_64 memory layout was updated accordingly.

Performance data, after all patches in the series:

Kernbench shows almost no difference (+/- less than 1%):

Before:

Average Optimal load -j 12 Run (std deviation):
  Elapsed Time       102.63   (1.2695)
  User Time          1034.89  (1.18115)
  System Time        87.056   (0.456416)
  Percent CPU        1092.9   (13.892)
  Context Switches   199805   (3455.33)
  Sleeps             97907.8  (900.636)

After:

Average Optimal load -j 12 Run (std deviation):
  Elapsed Time       102.489  (1.10636)
  User Time          1034.86  (1.36053)
  System Time        87.764   (0.49345)
  Percent CPU        1095     (12.7715)
  Context Switches   199036   (4298.1)
  Sleeps             97681.6  (1031.11)

Hackbench shows 0% difference on average (hackbench 90 repeated 10 times):

attempt   before   after
1         0.076    0.069
2         0.072    0.069
3         0.066    0.066
4         0.066    0.068
5         0.066    0.067
6         0.066    0.069
7         0.067    0.066
8         0.063    0.067
9         0.067    0.065
10        0.068    0.071
average   0.0677   0.0677

Signed-off-by: Thomas Garnier <thgarnie@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
Cc: Alexander Popov <alpopov@ptsecurity.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Lv Zheng <lv.zheng@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-doc@vger.kernel.org
Link: http://lkml.kernel.org/r/1466556426-32664-6-git-send-email-keescook@chromium.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 Documentation/x86/x86_64/mm.txt |   4 ++
 arch/x86/Kconfig                |  17 +++++
 arch/x86/include/asm/kaslr.h    |   6 ++
 arch/x86/include/asm/pgtable.h  |   7 +-
 arch/x86/kernel/setup.c         |   3 +
 arch/x86/mm/Makefile            |   1 +
 arch/x86/mm/dump_pagetables.c   |  16 +++--
 arch/x86/mm/init.c              |   1 +
 arch/x86/mm/kaslr.c             | 152 ++++++++++++++++++++++++++++++++++++++++
 9 files changed, 202 insertions(+), 5 deletions(-)

diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt
index 5aa7383..8c7dd59 100644
--- a/Documentation/x86/x86_64/mm.txt
+++ b/Documentation/x86/x86_64/mm.txt
@@ -39,4 +39,8 @@ memory window (this size is arbitrary, it can be raised later if needed).
 The mappings are not part of any other kernel PGD and are only available
 during EFI runtime calls.
 
+Note that if CONFIG_RANDOMIZE_MEMORY is enabled, the direct mapping of all
+physical memory, vmalloc/ioremap space and virtual memory map are randomized.
+Their order is preserved but their base will be offset early at boot time.
+
 -Andi Kleen, Jul 2004
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 930fe88..9719b8e 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1993,6 +1993,23 @@ config PHYSICAL_ALIGN
 
 	  Don't change this unless you know what you are doing.
 
+config RANDOMIZE_MEMORY
+	bool "Randomize the kernel memory sections"
+	depends on X86_64
+	depends on RANDOMIZE_BASE
+	default RANDOMIZE_BASE
+	---help---
+	   Randomizes the base virtual address of kernel memory sections
+	   (physical memory mapping, vmalloc & vmemmap). This security feature
+	   makes exploits relying on predictable memory locations less reliable.
+
+	   The order of allocations remains unchanged. Entropy is generated in
+	   the same way as RANDOMIZE_BASE. The current implementation in the
+	   optimal configuration has on average 30,000 different possible
+	   virtual addresses for each memory section.
+
+	   If unsure, say N.
+
 config HOTPLUG_CPU
 	bool "Support for hot-pluggable CPUs"
 	depends on SMP
diff --git a/arch/x86/include/asm/kaslr.h b/arch/x86/include/asm/kaslr.h
index 5547438..683c9d7 100644
--- a/arch/x86/include/asm/kaslr.h
+++ b/arch/x86/include/asm/kaslr.h
@@ -3,4 +3,10 @@
 
 unsigned long kaslr_get_random_long(const char *purpose);
 
+#ifdef CONFIG_RANDOMIZE_MEMORY
+void kernel_randomize_memory(void);
+#else
+static inline void kernel_randomize_memory(void) { }
+#endif /* CONFIG_RANDOMIZE_MEMORY */
+
 #endif
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index d455bef..5472682 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -732,11 +732,16 @@ void early_alloc_pgt_buf(void);
 #ifdef CONFIG_X86_64
 /* Realmode trampoline initialization. */
 extern pgd_t trampoline_pgd_entry;
-static inline void __meminit init_trampoline(void)
+static inline void __meminit init_trampoline_default(void)
 {
 	/* Default trampoline pgd value */
 	trampoline_pgd_entry = init_level4_pgt[pgd_index(__PAGE_OFFSET)];
 }
+# ifdef CONFIG_RANDOMIZE_MEMORY
+void __meminit init_trampoline(void);
+# else
+#  define init_trampoline init_trampoline_default
+# endif
 #else
 static inline void init_trampoline(void) { }
 #endif
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index c4e7b39..a261658 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -113,6 +113,7 @@
 #include <asm/prom.h>
 #include <asm/microcode.h>
 #include <asm/mmu_context.h>
+#include <asm/kaslr.h>
 
 /*
  * max_low_pfn_mapped: highest direct mapped pfn under 4GB
@@ -942,6 +943,8 @@ void __init setup_arch(char **cmdline_p)
 
 	x86_init.oem.arch_setup();
 
+	kernel_randomize_memory();
+
 	iomem_resource.end = (1ULL << boot_cpu_data.x86_phys_bits) - 1;
 	setup_memory_map();
 	parse_setup_data();
diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile
index 62c0043..96d2b84 100644
--- a/arch/x86/mm/Makefile
+++ b/arch/x86/mm/Makefile
@@ -37,4 +37,5 @@ obj-$(CONFIG_NUMA_EMU)		+= numa_emulation.o
 
 obj-$(CONFIG_X86_INTEL_MPX)	+= mpx.o
 obj-$(CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS) += pkeys.o
+obj-$(CONFIG_RANDOMIZE_MEMORY) += kaslr.o
 
diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
index 99bfb19..9a17250 100644
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -72,9 +72,9 @@ static struct addr_marker address_markers[] = {
 	{ 0, "User Space" },
 #ifdef CONFIG_X86_64
 	{ 0x8000000000000000UL, "Kernel Space" },
-	{ PAGE_OFFSET,		"Low Kernel Mapping" },
-	{ VMALLOC_START,        "vmalloc() Area" },
-	{ VMEMMAP_START,        "Vmemmap" },
+	{ 0/* PAGE_OFFSET */,   "Low Kernel Mapping" },
+	{ 0/* VMALLOC_START */, "vmalloc() Area" },
+	{ 0/* VMEMMAP_START */, "Vmemmap" },
 # ifdef CONFIG_X86_ESPFIX64
 	{ ESPFIX_BASE_ADDR,	"ESPfix Area", 16 },
 # endif
@@ -434,8 +434,16 @@ void ptdump_walk_pgd_level_checkwx(void)
 
 static int __init pt_dump_init(void)
 {
+	/*
+	 * Various markers are not compile-time constants, so assign them
+	 * here.
+	 */
+#ifdef CONFIG_X86_64
+	address_markers[LOW_KERNEL_NR].start_address = PAGE_OFFSET;
+	address_markers[VMALLOC_START_NR].start_address = VMALLOC_START;
+	address_markers[VMEMMAP_START_NR].start_address = VMEMMAP_START;
+#endif
 #ifdef CONFIG_X86_32
-	/* Not a compile-time constant on x86-32 */
 	address_markers[VMALLOC_START_NR].start_address = VMALLOC_START;
 	address_markers[VMALLOC_END_NR].start_address = VMALLOC_END;
 # ifdef CONFIG_HIGHMEM
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 4252acd..cc82830 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -17,6 +17,7 @@
 #include <asm/proto.h>
 #include <asm/dma.h>		/* for MAX_DMA_PFN */
 #include <asm/microcode.h>
+#include <asm/kaslr.h>
 
 /*
  * We need to define the tracepoints somewhere, and tlb.c
diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
new file mode 100644
index 0000000..d5380a4
--- /dev/null
+++ b/arch/x86/mm/kaslr.c
@@ -0,0 +1,152 @@
+/*
+ * This file implements KASLR memory randomization for x86_64. It randomizes
+ * the virtual address space of kernel memory regions (physical memory
+ * mapping, vmalloc & vmemmap) for x86_64. This security feature mitigates
+ * exploits relying on predictable kernel addresses.
+ *
+ * Entropy is generated using the KASLR early boot functions now shared in
+ * the lib directory (originally written by Kees Cook). Randomization is
+ * done at the PGD & PUD page table levels to increase the number of
+ * possible addresses. The physical memory mapping code was adapted to
+ * support PUD-level virtual addresses. In the best configuration, this
+ * implementation provides on average 30,000 possible virtual addresses
+ * for each memory region. An additional low memory page is used to
+ * ensure each CPU can start with a PGD-aligned virtual address (for
+ * real mode).
+ *
+ * The order of each memory region is not changed. The feature looks at
+ * the available space for the regions based on different configuration
+ * options and randomizes the base of each region and the space between
+ * them. The size of the physical memory mapping is the available
+ * physical memory.
+ */
+
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/random.h>
+
+#include <asm/pgalloc.h>
+#include <asm/pgtable.h>
+#include <asm/setup.h>
+#include <asm/kaslr.h>
+
+#include "mm_internal.h"
+
+#define TB_SHIFT 40
+
+/*
+ * Virtual address start and end range for randomization. The end changes
+ * based on the configuration to leave the highest amount of space for
+ * randomization, increasing the possible random positions for each
+ * randomized region.
+ *
+ * You need to add an #ifdef entry if you introduce a new memory region
+ * compatible with KASLR. Your entry must follow the logical order of the
+ * memory layout. For example, ESPFIX comes before EFI because its virtual
+ * address is lower. You also need to add a BUILD_BUG_ON() in
+ * kernel_randomize_memory() to ensure that this order is correct and
+ * won't be changed.
+ */
+static const unsigned long vaddr_start;
+static const unsigned long vaddr_end;
+
+/*
+ * Memory regions randomized by KASLR (except modules that use a separate logic
+ * earlier during boot). The list is ordered based on virtual addresses. This
+ * order is kept after randomization.
+ */
+static __initdata struct kaslr_memory_region {
+	unsigned long *base;
+	unsigned long size_tb;
+} kaslr_regions[] = {
+};
+
+/* Get size in bytes used by the memory region */
+static inline unsigned long get_padding(struct kaslr_memory_region *region)
+{
+	return (region->size_tb << TB_SHIFT);
+}
+
+/*
+ * Apply no randomization if KASLR was disabled at boot or if KASAN
+ * is enabled. KASAN shadow mappings rely on regions being PGD aligned.
+ */
+static inline bool kaslr_memory_enabled(void)
+{
+	return kaslr_enabled() && !config_enabled(CONFIG_KASAN);
+}
+
+/* Initialize base and padding for each memory region randomized with KASLR */
+void __init kernel_randomize_memory(void)
+{
+	size_t i;
+	unsigned long vaddr = vaddr_start;
+	unsigned long rand;
+	struct rnd_state rand_state;
+	unsigned long remain_entropy;
+
+	if (!kaslr_memory_enabled())
+		return;
+
+	/* Calculate entropy available between regions */
+	remain_entropy = vaddr_end - vaddr_start;
+	for (i = 0; i < ARRAY_SIZE(kaslr_regions); i++)
+		remain_entropy -= get_padding(&kaslr_regions[i]);
+
+	prandom_seed_state(&rand_state, kaslr_get_random_long("Memory"));
+
+	for (i = 0; i < ARRAY_SIZE(kaslr_regions); i++) {
+		unsigned long entropy;
+
+		/*
+		 * Select a random virtual address using the extra entropy
+		 * available.
+		 */
+		entropy = remain_entropy / (ARRAY_SIZE(kaslr_regions) - i);
+		prandom_bytes_state(&rand_state, &rand, sizeof(rand));
+		entropy = (rand % (entropy + 1)) & PUD_MASK;
+		vaddr += entropy;
+		*kaslr_regions[i].base = vaddr;
+
+		/*
+		 * Jump the region and add a minimum padding based on
+		 * randomization alignment.
+		 */
+		vaddr += get_padding(&kaslr_regions[i]);
+		vaddr = round_up(vaddr + 1, PUD_SIZE);
+		remain_entropy -= entropy;
+	}
+}
+
+/*
+ * Create PGD aligned trampoline table to allow real mode initialization
+ * of additional CPUs. Consume only 1 low memory page.
+ */
+void __meminit init_trampoline(void)
+{
+	unsigned long paddr, paddr_next;
+	pgd_t *pgd;
+	pud_t *pud_page, *pud_page_tramp;
+	int i;
+
+	if (!kaslr_memory_enabled()) {
+		init_trampoline_default();
+		return;
+	}
+
+	pud_page_tramp = alloc_low_page();
+
+	paddr = 0;
+	pgd = pgd_offset_k((unsigned long)__va(paddr));
+	pud_page = (pud_t *) pgd_page_vaddr(*pgd);
+
+	for (i = pud_index(paddr); i < PTRS_PER_PUD; i++, paddr = paddr_next) {
+		pud_t *pud, *pud_tramp;
+		unsigned long vaddr = (unsigned long)__va(paddr);
+
+		pud_tramp = pud_page_tramp + pud_index(paddr);
+		pud = pud_page + pud_index(vaddr);
+		paddr_next = (paddr & PUD_MASK) + PUD_SIZE;
+
+		*pud_tramp = *pud;
+	}
+
+	set_pgd(&trampoline_pgd_entry,
+		__pgd(_KERNPG_TABLE | __pa(pud_page_tramp)));
+}

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [tip:x86/boot] x86/mm: Enable KASLR for physical mapping memory regions
  2016-06-22  0:47   ` [kernel-hardening] " Kees Cook
@ 2016-07-08 20:35   ` tip-bot for Thomas Garnier
  2016-08-14  4:25     ` Brian Gerst
  -1 siblings, 1 reply; 74+ messages in thread
From: tip-bot for Thomas Garnier @ 2016-07-08 20:35 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: schwidefsky, lv.zheng, dvyukov, tglx, dyoung, boris.ostrovsky,
	peterz, dvlasenk, akpm, JBeulich, linux-kernel, toshi.kani,
	alpopov, torvalds, dave.hansen, mingo, aneesh.kumar,
	guangrong.xiao, matt, corbet, bp, msalter, dan.j.williams,
	keescook, brgerst, bhe, jroedel, kirill.shutemov, sds,
	borntraeger, jpoimboe, kuleshovmail, hpa, luto, jgross, thgarnie,
	yinghai, bp

Commit-ID:  021182e52fe01c1f7b126f97fd6ba048dc4234fd
Gitweb:     http://git.kernel.org/tip/021182e52fe01c1f7b126f97fd6ba048dc4234fd
Author:     Thomas Garnier <thgarnie@google.com>
AuthorDate: Tue, 21 Jun 2016 17:47:03 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 8 Jul 2016 17:35:15 +0200

x86/mm: Enable KASLR for physical mapping memory regions

Add the physical mapping in the list of randomized memory regions.

The physical memory mapping holds most allocations from boot and heap
allocators. Knowing the base address and physical memory size, an attacker
can deduce the PDE virtual address for the vDSO memory page. This attack
was demonstrated at CanSecWest 2016, in the following presentation:

  "Getting Physical: Extreme Abuse of Intel Based Paged Systems":
  https://github.com/n3k/CansecWest2016_Getting_Physical_Extreme_Abuse_of_Intel_Based_Paging_Systems/blob/master/Presentation/CanSec2016_Presentation.pdf

(See second part of the presentation).
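
The property being attacked (a sketch, assuming the pre-KASLR layout):
with a static direct mapping, any physical address P is mapped at

  va = __PAGE_OFFSET_BASE + P	/* 0xffff880000000000 + P */

so an attacker who learns the physical address of a page-table page can
compute where that page is readable and writable through the direct
mapping. Randomizing the direct mapping base breaks this computation.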

The exploits used against Linux worked successfully against 4.6+ but
fail with KASLR memory enabled:

  https://github.com/n3k/CansecWest2016_Getting_Physical_Extreme_Abuse_of_Intel_Based_Paging_Systems/tree/master/Demos/Linux/exploits

Similar research was done at Google leading to this patch proposal.

Variants exist that overwrite /proc or /sys object ACLs, leading to
elevation of privileges. These variants were tested against 4.6+.

The page offset used by the compressed kernel retains the static value
since it is not yet randomized during this boot stage.

Signed-off-by: Thomas Garnier <thgarnie@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
Cc: Alexander Popov <alpopov@ptsecurity.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Lv Zheng <lv.zheng@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-doc@vger.kernel.org
Link: http://lkml.kernel.org/r/1466556426-32664-7-git-send-email-keescook@chromium.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/boot/compressed/pagetable.c |  3 +++
 arch/x86/include/asm/kaslr.h         |  2 ++
 arch/x86/include/asm/page_64_types.h | 11 ++++++++++-
 arch/x86/kernel/head_64.S            |  2 +-
 arch/x86/mm/kaslr.c                  | 18 +++++++++++++++---
 5 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/arch/x86/boot/compressed/pagetable.c b/arch/x86/boot/compressed/pagetable.c
index 6e31a6a..56589d0 100644
--- a/arch/x86/boot/compressed/pagetable.c
+++ b/arch/x86/boot/compressed/pagetable.c
@@ -20,6 +20,9 @@
 /* These actually do the work of building the kernel identity maps. */
 #include <asm/init.h>
 #include <asm/pgtable.h>
+/* Use the static base for this part of the boot process */
+#undef __PAGE_OFFSET
+#define __PAGE_OFFSET __PAGE_OFFSET_BASE
 #include "../../mm/ident_map.c"
 
 /* Used by pgtable.h asm code to force instruction serialization. */
diff --git a/arch/x86/include/asm/kaslr.h b/arch/x86/include/asm/kaslr.h
index 683c9d7..62b1b81 100644
--- a/arch/x86/include/asm/kaslr.h
+++ b/arch/x86/include/asm/kaslr.h
@@ -4,6 +4,8 @@
 unsigned long kaslr_get_random_long(const char *purpose);
 
 #ifdef CONFIG_RANDOMIZE_MEMORY
+extern unsigned long page_offset_base;
+
 void kernel_randomize_memory(void);
 #else
 static inline void kernel_randomize_memory(void) { }
diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index d5c2f8b..9215e05 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -1,6 +1,10 @@
 #ifndef _ASM_X86_PAGE_64_DEFS_H
 #define _ASM_X86_PAGE_64_DEFS_H
 
+#ifndef __ASSEMBLY__
+#include <asm/kaslr.h>
+#endif
+
 #ifdef CONFIG_KASAN
 #define KASAN_STACK_ORDER 1
 #else
@@ -32,7 +36,12 @@
  * hypervisor to fit.  Choosing 16 slots here is arbitrary, but it's
  * what Xen requires.
  */
-#define __PAGE_OFFSET           _AC(0xffff880000000000, UL)
+#define __PAGE_OFFSET_BASE      _AC(0xffff880000000000, UL)
+#ifdef CONFIG_RANDOMIZE_MEMORY
+#define __PAGE_OFFSET           page_offset_base
+#else
+#define __PAGE_OFFSET           __PAGE_OFFSET_BASE
+#endif /* CONFIG_RANDOMIZE_MEMORY */
 
 #define __START_KERNEL_map	_AC(0xffffffff80000000, UL)
 
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index c7920ba..9f8efc9 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -38,7 +38,7 @@
 
 #define pud_index(x)	(((x) >> PUD_SHIFT) & (PTRS_PER_PUD-1))
 
-L4_PAGE_OFFSET = pgd_index(__PAGE_OFFSET)
+L4_PAGE_OFFSET = pgd_index(__PAGE_OFFSET_BASE)
 L4_START_KERNEL = pgd_index(__START_KERNEL_map)
 L3_START_KERNEL = pud_index(__START_KERNEL_map)
 
diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index d5380a4..609ecf2 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -43,8 +43,12 @@
  * before. You also need to add a BUILD_BUG_ON in kernel_randomize_memory to
  * ensure that this order is correct and won't be changed.
  */
-static const unsigned long vaddr_start;
-static const unsigned long vaddr_end;
+static const unsigned long vaddr_start = __PAGE_OFFSET_BASE;
+static const unsigned long vaddr_end = VMALLOC_START;
+
+/* Default values */
+unsigned long page_offset_base = __PAGE_OFFSET_BASE;
+EXPORT_SYMBOL(page_offset_base);
 
 /*
  * Memory regions randomized by KASLR (except modules that use a separate logic
@@ -55,6 +59,7 @@ static __initdata struct kaslr_memory_region {
 	unsigned long *base;
 	unsigned long size_tb;
 } kaslr_regions[] = {
+	{ &page_offset_base, 64/* Maximum */ },
 };
 
 /* Get size in bytes used by the memory region */
@@ -77,13 +82,20 @@ void __init kernel_randomize_memory(void)
 {
 	size_t i;
 	unsigned long vaddr = vaddr_start;
-	unsigned long rand;
+	unsigned long rand, memory_tb;
 	struct rnd_state rand_state;
 	unsigned long remain_entropy;
 
 	if (!kaslr_memory_enabled())
 		return;
 
+	BUG_ON(kaslr_regions[0].base != &page_offset_base);
+	memory_tb = ((max_pfn << PAGE_SHIFT) >> TB_SHIFT);
+
+	/* Adapt physical memory region size based on available memory */
+	if (memory_tb < kaslr_regions[0].size_tb)
+		kaslr_regions[0].size_tb = memory_tb;
+
 	/* Calculate entropy available between regions */
 	remain_entropy = vaddr_end - vaddr_start;
 	for (i = 0; i < ARRAY_SIZE(kaslr_regions); i++)

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [tip:x86/boot] x86/mm: Enable KASLR for vmalloc memory regions
  2016-06-22  0:47   ` [kernel-hardening] " Kees Cook
@ 2016-07-08 20:36   ` tip-bot for Thomas Garnier
  -1 siblings, 0 replies; 74+ messages in thread
From: tip-bot for Thomas Garnier @ 2016-07-08 20:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: lv.zheng, dvlasenk, kuleshovmail, aneesh.kumar, sds, toshi.kani,
	jgross, dyoung, schwidefsky, hpa, alpopov, yinghai, jroedel,
	dan.j.williams, dvyukov, boris.ostrovsky, kirill.shutemov,
	guangrong.xiao, linux-kernel, torvalds, dave.hansen, matt, tglx,
	brgerst, msalter, jpoimboe, keescook, JBeulich, corbet, bp,
	borntraeger, peterz, luto, bp, mingo, bhe, thgarnie, akpm

Commit-ID:  a95ae27c2ee1cba5f4f6b9dea43ffe88252e79b1
Gitweb:     http://git.kernel.org/tip/a95ae27c2ee1cba5f4f6b9dea43ffe88252e79b1
Author:     Thomas Garnier <thgarnie@google.com>
AuthorDate: Tue, 21 Jun 2016 17:47:04 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 8 Jul 2016 17:35:21 +0200

x86/mm: Enable KASLR for vmalloc memory regions

Add vmalloc to the list of randomized memory regions.

The vmalloc memory region contains the allocations made through the
vmalloc() API. The allocations are done sequentially to prevent
fragmentation, and each allocation address can easily be deduced,
especially at boot.
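
A minimal sketch of the problem (a hypothetical demo module, not part of
the patch): with a static VMALLOC_START, sequential allocations land at
predictable offsets from the base, so their addresses are reproducible
across boots.

  #include <linux/module.h>
  #include <linux/mm.h>
  #include <linux/vmalloc.h>

  static int __init vmalloc_demo_init(void)
  {
  	/* Sequential allocations: 'b' tends to land right after 'a'
  	 * plus a guard page. With a fixed VMALLOC_START both addresses
  	 * can be predicted; CONFIG_RANDOMIZE_MEMORY moves the base. */
  	void *a = vmalloc(PAGE_SIZE);
  	void *b = vmalloc(PAGE_SIZE);

  	pr_info("a=%p b=%p\n", a, b);
  	vfree(b);
  	vfree(a);
  	return 0;
  }
  module_init(vmalloc_demo_init);
  MODULE_LICENSE("GPL");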

Signed-off-by: Thomas Garnier <thgarnie@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
Cc: Alexander Popov <alpopov@ptsecurity.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Lv Zheng <lv.zheng@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-doc@vger.kernel.org
Link: http://lkml.kernel.org/r/1466556426-32664-8-git-send-email-keescook@chromium.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/include/asm/kaslr.h            |  1 +
 arch/x86/include/asm/pgtable_64_types.h | 15 +++++++++++----
 arch/x86/mm/kaslr.c                     |  5 ++++-
 3 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/kaslr.h b/arch/x86/include/asm/kaslr.h
index 62b1b81..2674ee3 100644
--- a/arch/x86/include/asm/kaslr.h
+++ b/arch/x86/include/asm/kaslr.h
@@ -5,6 +5,7 @@ unsigned long kaslr_get_random_long(const char *purpose);
 
 #ifdef CONFIG_RANDOMIZE_MEMORY
 extern unsigned long page_offset_base;
+extern unsigned long vmalloc_base;
 
 void kernel_randomize_memory(void);
 #else
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index e6844df..6fdef9e 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -5,6 +5,7 @@
 
 #ifndef __ASSEMBLY__
 #include <linux/types.h>
+#include <asm/kaslr.h>
 
 /*
  * These are used to make use of C type-checking..
@@ -53,10 +54,16 @@ typedef struct { pteval_t pte; } pte_t;
 #define PGDIR_MASK	(~(PGDIR_SIZE - 1))
 
 /* See Documentation/x86/x86_64/mm.txt for a description of the memory map. */
-#define MAXMEM		 _AC(__AC(1, UL) << MAX_PHYSMEM_BITS, UL)
-#define VMALLOC_START    _AC(0xffffc90000000000, UL)
-#define VMALLOC_END      _AC(0xffffe8ffffffffff, UL)
-#define VMEMMAP_START	 _AC(0xffffea0000000000, UL)
+#define MAXMEM		_AC(__AC(1, UL) << MAX_PHYSMEM_BITS, UL)
+#define VMALLOC_SIZE_TB	_AC(32, UL)
+#define __VMALLOC_BASE	_AC(0xffffc90000000000, UL)
+#define VMEMMAP_START	_AC(0xffffea0000000000, UL)
+#ifdef CONFIG_RANDOMIZE_MEMORY
+#define VMALLOC_START	vmalloc_base
+#else
+#define VMALLOC_START	__VMALLOC_BASE
+#endif /* CONFIG_RANDOMIZE_MEMORY */
+#define VMALLOC_END	(VMALLOC_START + _AC((VMALLOC_SIZE_TB << 40) - 1, UL))
 #define MODULES_VADDR    (__START_KERNEL_map + KERNEL_IMAGE_SIZE)
 #define MODULES_END      _AC(0xffffffffff000000, UL)
 #define MODULES_LEN   (MODULES_END - MODULES_VADDR)
diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index 609ecf2..c939cfe 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -44,11 +44,13 @@
  * ensure that this order is correct and won't be changed.
  */
 static const unsigned long vaddr_start = __PAGE_OFFSET_BASE;
-static const unsigned long vaddr_end = VMALLOC_START;
+static const unsigned long vaddr_end = VMEMMAP_START;
 
 /* Default values */
 unsigned long page_offset_base = __PAGE_OFFSET_BASE;
 EXPORT_SYMBOL(page_offset_base);
+unsigned long vmalloc_base = __VMALLOC_BASE;
+EXPORT_SYMBOL(vmalloc_base);
 
 /*
  * Memory regions randomized by KASLR (except modules that use a separate logic
@@ -60,6 +62,7 @@ static __initdata struct kaslr_memory_region {
 	unsigned long size_tb;
 } kaslr_regions[] = {
 	{ &page_offset_base, 64/* Maximum */ },
+	{ &vmalloc_base, VMALLOC_SIZE_TB },
 };
 
 /* Get size in bytes used by the memory region */

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [tip:x86/boot] x86/mm: Add memory hotplug support for KASLR memory randomization
  2016-06-22  0:47   ` [kernel-hardening] " Kees Cook
@ 2016-07-08 20:36   ` tip-bot for Thomas Garnier
  -1 siblings, 0 replies; 74+ messages in thread
From: tip-bot for Thomas Garnier @ 2016-07-08 20:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, kuleshovmail, hpa, boris.ostrovsky, kirill.shutemov,
	toshi.kani, brgerst, dyoung, alpopov, JBeulich, dvyukov,
	schwidefsky, corbet, tglx, bp, guangrong.xiao, akpm, matt,
	aneesh.kumar, lv.zheng, dave.hansen, jpoimboe, dan.j.williams,
	luto, keescook, thgarnie, borntraeger, dvlasenk, bp, bhe, jgross,
	jroedel, linux-kernel, torvalds, peterz, yinghai, sds, msalter

Commit-ID:  90397a41779645d3abba5599f6bb538fdcab9339
Gitweb:     http://git.kernel.org/tip/90397a41779645d3abba5599f6bb538fdcab9339
Author:     Thomas Garnier <thgarnie@google.com>
AuthorDate: Tue, 21 Jun 2016 17:47:06 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 8 Jul 2016 17:35:21 +0200

x86/mm: Add memory hotplug support for KASLR memory randomization

Add a new option (CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING) to define
the padding used for the physical memory mapping section when KASLR
memory is enabled. It ensures there is enough virtual address space when
CONFIG_MEMORY_HOTPLUG is used. The default value is 10 terabytes. If
CONFIG_MEMORY_HOTPLUG is not used, no space is reserved, which increases
the available entropy.
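
For example (illustrative arithmetic, not from the patch): on a machine
with 8 GB of RAM and the default padding of 0xa (10 TB), the integer
shift truncates the sub-terabyte memory size to 0, so the physical
mapping region is sized at 0 + 10 = 10 TB instead of the 64 TB maximum,
leaving room to hot-add memory while the rest of the virtual address
space remains available for randomization.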

Signed-off-by: Thomas Garnier <thgarnie@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Alexander Kuleshov <kuleshovmail@gmail.com>
Cc: Alexander Popov <alpopov@ptsecurity.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Lv Zheng <lv.zheng@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: kernel-hardening@lists.openwall.com
Cc: linux-doc@vger.kernel.org
Link: http://lkml.kernel.org/r/1466556426-32664-10-git-send-email-keescook@chromium.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/Kconfig    | 15 +++++++++++++++
 arch/x86/mm/kaslr.c |  7 ++++++-
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 9719b8e..703413f 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2010,6 +2010,21 @@ config RANDOMIZE_MEMORY
 
 	   If unsure, say N.
 
+config RANDOMIZE_MEMORY_PHYSICAL_PADDING
+	hex "Physical memory mapping padding" if EXPERT
+	depends on RANDOMIZE_MEMORY
+	default "0xa" if MEMORY_HOTPLUG
+	default "0x0"
+	range 0x1 0x40 if MEMORY_HOTPLUG
+	range 0x0 0x40
+	---help---
+	   Define the padding in terabytes added to the existing physical
+	   memory size during kernel memory randomization. It is useful
+	   for memory hotplug support but reduces the entropy available for
+	   address randomization.
+
+	   If unsure, leave at the default value.
+
 config HOTPLUG_CPU
 	bool "Support for hot-pluggable CPUs"
 	depends on SMP
diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index c939cfe..26dccd6 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -92,8 +92,13 @@ void __init kernel_randomize_memory(void)
 	if (!kaslr_memory_enabled())
 		return;
 
+	/*
+	 * Size the physical memory mapping region to the available memory,
+	 * adding padding if needed (especially for memory hotplug support).
+	 */
 	BUG_ON(kaslr_regions[0].base != &page_offset_base);
-	memory_tb = ((max_pfn << PAGE_SHIFT) >> TB_SHIFT);
+	memory_tb = ((max_pfn << PAGE_SHIFT) >> TB_SHIFT) +
+		CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING;
 
 	/* Adapt physical memory region size based on available memory */
 	if (memory_tb < kaslr_regions[0].size_tb)

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [tip:x86/boot] x86/mm: Enable KASLR for physical mapping memory regions
  2016-07-08 20:35   ` [tip:x86/boot] x86/mm: Enable KASLR for physical mapping memory regions tip-bot for Thomas Garnier
@ 2016-08-14  4:25     ` Brian Gerst
  2016-08-14 23:26       ` Baoquan He
  0 siblings, 1 reply; 74+ messages in thread
From: Brian Gerst @ 2016-08-14  4:25 UTC (permalink / raw)
  To: Borislav Petkov, Yinghai Lu, Juergen Gross, thgarnie,
	Andy Lutomirski, H. Peter Anvin, Alexander Kuleshov,
	Josh Poimboeuf, borntraeger, sds, kirill.shutemov, jroedel, bhe,
	Brian Gerst, Kees Cook, Williams, Dan J, Mark Salter,
	Borislav Petkov, corbet, matt, guangrong.xiao, aneesh.kumar,
	Ingo Molnar, Linus Torvalds, Dave Hansen, toshi.kani, alpopov,
	Linux Kernel Mailing List, Jan Beulich, Andrew Morton,
	Boris Ostrovsky, Denys Vlasenko, Peter Zijlstra, dyoung,
	Thomas Gleixner, Dmitry Vyukov, lv.zheng, schwidefsky
  Cc: linux-tip-commits

On Fri, Jul 8, 2016 at 4:35 PM, tip-bot for Thomas Garnier
<tipbot@zytor.com> wrote:
> Commit-ID:  021182e52fe01c1f7b126f97fd6ba048dc4234fd
> Gitweb:     http://git.kernel.org/tip/021182e52fe01c1f7b126f97fd6ba048dc4234fd
> Author:     Thomas Garnier <thgarnie@google.com>
> AuthorDate: Tue, 21 Jun 2016 17:47:03 -0700
> Committer:  Ingo Molnar <mingo@kernel.org>
> CommitDate: Fri, 8 Jul 2016 17:35:15 +0200
>
> x86/mm: Enable KASLR for physical mapping memory regions
>
> Add the physical mapping in the list of randomized memory regions.
>
> The physical memory mapping holds most allocations from boot and heap
> allocators. Knowing the base address and physical memory size, an attacker
> can deduce the PDE virtual address for the vDSO memory page. This attack
> was demonstrated at CanSecWest 2016, in the following presentation:
>
>   "Getting Physical: Extreme Abuse of Intel Based Paged Systems":
>   https://github.com/n3k/CansecWest2016_Getting_Physical_Extreme_Abuse_of_Intel_Based_Paging_Systems/blob/master/Presentation/CanSec2016_Presentation.pdf
>
> (See second part of the presentation).
>
> The exploits used against Linux worked successfully against 4.6+ but
> fail with KASLR memory enabled:
>
>   https://github.com/n3k/CansecWest2016_Getting_Physical_Extreme_Abuse_of_Intel_Based_Paging_Systems/tree/master/Demos/Linux/exploits
>
> Similar research was done at Google leading to this patch proposal.
>
> Variants exists to overwrite /proc or /sys objects ACLs leading to
> elevation of privileges. These variants were tested against 4.6+.
>
> The page offset used by the compressed kernel retains the static value
> since it is not yet randomized during this boot stage.

This patch is causing my system to fail to boot.  The last messages
that are printed before it hangs are:

[    0.195652] smpboot: CPU0: AMD Phenom(tm) II X6 1055T Processor
(family: 0x10, model: 0xa, stepping: 0x0)
[    0.195656] Performance Events: AMD PMU driver.
[    0.195659] ... version:                0
[    0.195660] ... bit width:              48
[    0.195660] ... generic registers:      4
[    0.195661] ... value mask:             0000ffffffffffff
[    0.195662] ... max period:             00007fffffffffff
[    0.195663] ... fixed-purpose events:   0
[    0.195664] ... event mask:             000000000000000f
[    0.196185] NMI watchdog: enabled on all CPUs, permanently consumes
one hw-PMU counter.
[    0.196291] x86: Booting SMP configuration:
[    0.196292] .... node  #0, CPUs:      #1

I'm taking a guess here, but it may be that this is interfering with
the APIC accesses.

--
Brian Gerst

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [tip:x86/boot] x86/mm: Enable KASLR for physical mapping memory regions
  2016-08-14  4:25     ` Brian Gerst
@ 2016-08-14 23:26       ` Baoquan He
  2016-08-16 11:31         ` Brian Gerst
  0 siblings, 1 reply; 74+ messages in thread
From: Baoquan He @ 2016-08-14 23:26 UTC (permalink / raw)
  To: Brian Gerst
  Cc: Borislav Petkov, Yinghai Lu, Juergen Gross, thgarnie,
	Andy Lutomirski, H. Peter Anvin, Alexander Kuleshov,
	Josh Poimboeuf, borntraeger, sds, kirill.shutemov, jroedel,
	Kees Cook, Williams, Dan J, Mark Salter, Borislav Petkov, corbet,
	matt, guangrong.xiao, aneesh.kumar, Ingo Molnar, Linus Torvalds,
	Dave Hansen, toshi.kani, alpopov, Linux Kernel Mailing List,
	Jan Beulich, Andrew Morton, Boris Ostrovsky, Denys Vlasenko,
	Peter Zijlstra, dyoung, Thomas Gleixner, Dmitry Vyukov, lv.zheng,
	schwidefsky, linux-tip-commits

On 08/14/16 at 12:25am, Brian Gerst wrote:
> On Fri, Jul 8, 2016 at 4:35 PM, tip-bot for Thomas Garnier
> <tipbot@zytor.com> wrote:
> > Commit-ID:  021182e52fe01c1f7b126f97fd6ba048dc4234fd
> > Gitweb:     http://git.kernel.org/tip/021182e52fe01c1f7b126f97fd6ba048dc4234fd
> > Author:     Thomas Garnier <thgarnie@google.com>
> > AuthorDate: Tue, 21 Jun 2016 17:47:03 -0700
> > Committer:  Ingo Molnar <mingo@kernel.org>
> > CommitDate: Fri, 8 Jul 2016 17:35:15 +0200
> >
> > x86/mm: Enable KASLR for physical mapping memory regions
> >
> > Add the physical mapping in the list of randomized memory regions.
> >
> > The physical memory mapping holds most allocations from boot and heap
> > allocators. Knowing the base address and physical memory size, an attacker
> > can deduce the PDE virtual address for the vDSO memory page. This attack
> > was demonstrated at CanSecWest 2016, in the following presentation:
> >
> >   "Getting Physical: Extreme Abuse of Intel Based Paged Systems":
> >   https://github.com/n3k/CansecWest2016_Getting_Physical_Extreme_Abuse_of_Intel_Based_Paging_Systems/blob/master/Presentation/CanSec2016_Presentation.pdf
> >
> > (See second part of the presentation).
> >
> > The exploits used against Linux worked successfully against 4.6+ but
> > fail with KASLR memory enabled:
> >
> >   https://github.com/n3k/CansecWest2016_Getting_Physical_Extreme_Abuse_of_Intel_Based_Paging_Systems/tree/master/Demos/Linux/exploits
> >
> > Similar research was done at Google leading to this patch proposal.
> >
> > Variants exists to overwrite /proc or /sys objects ACLs leading to
> > elevation of privileges. These variants were tested against 4.6+.
> >
> > The page offset used by the compressed kernel retains the static value
> > since it is not yet randomized during this boot stage.
> 
> This patch is causing my system to fail to boot.  The last messages
> that are printed before it hangs are:
> 
> [    0.195652] smpboot: CPU0: AMD Phenom(tm) II X6 1055T Processor
> (family: 0x10, model: 0xa, stepping: 0x0)
> [    0.195656] Performance Events: AMD PMU driver.
> [    0.195659] ... version:                0
> [    0.195660] ... bit width:              48
> [    0.195660] ... generic registers:      4
> [    0.195661] ... value mask:             0000ffffffffffff
> [    0.195662] ... max period:             00007fffffffffff
> [    0.195663] ... fixed-purpose events:   0
> [    0.195664] ... event mask:             000000000000000f
> [    0.196185] NMI watchdog: enabled on all CPUs, permanently consumes
> one hw-PMU counter.
> [    0.196291] x86: Booting SMP configuration:
> [    0.196292] .... node  #0, CPUs:      #1
> 
> I'm taking a guess here, but it may be that this is interfering with
> the APIC accesses.

It seems to hang when starting up the 2nd CPU. It may give more
information if you add the line below to the beginning of
arch/x86/kernel/smpboot.c and rebuild the bzImage.

#define DEBUG
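
(Defining DEBUG before the includes turns the pr_debug() calls in
smpboot.c into unconditional printk output, so the AP bring-up steps
should become visible on the console, given a sufficient console
loglevel.)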

> 
> --
> Brian Gerst

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [tip:x86/boot] x86/mm: Enable KASLR for physical mapping memory regions
  2016-08-14 23:26       ` Baoquan He
@ 2016-08-16 11:31         ` Brian Gerst
  2016-08-16 13:42           ` Borislav Petkov
  0 siblings, 1 reply; 74+ messages in thread
From: Brian Gerst @ 2016-08-16 11:31 UTC (permalink / raw)
  To: Baoquan He
  Cc: Borislav Petkov, Yinghai Lu, Juergen Gross, Thomas Garnier,
	Andy Lutomirski, H. Peter Anvin, Alexander Kuleshov,
	Josh Poimboeuf, borntraeger, sds, kirill.shutemov, jroedel,
	Kees Cook, Williams, Dan J, Mark Salter, Borislav Petkov,
	Jonathan Corbet, matt, guangrong.xiao, aneesh.kumar, Ingo Molnar,
	Linus Torvalds, Dave Hansen, toshi.kani, alpopov,
	Linux Kernel Mailing List, Jan Beulich, Andrew Morton,
	Boris Ostrovsky, Denys Vlasenko, Peter Zijlstra, dyoung,
	Thomas Gleixner, Dmitry Vyukov, lv.zheng, schwidefsky,
	linux-tip-commits

On Sun, Aug 14, 2016 at 7:26 PM, Baoquan He <bhe@redhat.com> wrote:
> On 08/14/16 at 12:25am, Brian Gerst wrote:
>> On Fri, Jul 8, 2016 at 4:35 PM, tip-bot for Thomas Garnier
>> <tipbot@zytor.com> wrote:
>> > Commit-ID:  021182e52fe01c1f7b126f97fd6ba048dc4234fd
>> > Gitweb:     http://git.kernel.org/tip/021182e52fe01c1f7b126f97fd6ba048dc4234fd
>> > Author:     Thomas Garnier <thgarnie@google.com>
>> > AuthorDate: Tue, 21 Jun 2016 17:47:03 -0700
>> > Committer:  Ingo Molnar <mingo@kernel.org>
>> > CommitDate: Fri, 8 Jul 2016 17:35:15 +0200
>> >
>> > x86/mm: Enable KASLR for physical mapping memory regions
>> >
>> > Add the physical mapping in the list of randomized memory regions.
>> >
>> > The physical memory mapping holds most allocations from boot and heap
>> > allocators. Knowing the base address and physical memory size, an attacker
>> > can deduce the PDE virtual address for the vDSO memory page. This attack
>> > was demonstrated at CanSecWest 2016, in the following presentation:
>> >
>> >   "Getting Physical: Extreme Abuse of Intel Based Paged Systems":
>> >   https://github.com/n3k/CansecWest2016_Getting_Physical_Extreme_Abuse_of_Intel_Based_Paging_Systems/blob/master/Presentation/CanSec2016_Presentation.pdf
>> >
>> > (See second part of the presentation).
>> >
>> > The exploits used against Linux worked successfully against 4.6+ but
>> > fail with KASLR memory enabled:
>> >
>> >   https://github.com/n3k/CansecWest2016_Getting_Physical_Extreme_Abuse_of_Intel_Based_Paging_Systems/tree/master/Demos/Linux/exploits
>> >
>> > Similar research was done at Google leading to this patch proposal.
>> >
>> > Variants exists to overwrite /proc or /sys objects ACLs leading to
>> > elevation of privileges. These variants were tested against 4.6+.
>> >
>> > The page offset used by the compressed kernel retains the static value
>> > since it is not yet randomized during this boot stage.
>>
>> This patch is causing my system to fail to boot.  The last messages
>> that are printed before it hangs are:
>>
>> [    0.195652] smpboot: CPU0: AMD Phenom(tm) II X6 1055T Processor
>> (family: 0x10, model: 0xa, stepping: 0x0)
>> [    0.195656] Performance Events: AMD PMU driver.
>> [    0.195659] ... version:                0
>> [    0.195660] ... bit width:              48
>> [    0.195660] ... generic registers:      4
>> [    0.195661] ... value mask:             0000ffffffffffff
>> [    0.195662] ... max period:             00007fffffffffff
>> [    0.195663] ... fixed-purpose events:   0
>> [    0.195664] ... event mask:             000000000000000f
>> [    0.196185] NMI watchdog: enabled on all CPUs, permanently consumes
>> one hw-PMU counter.
>> [    0.196291] x86: Booting SMP configuration:
>> [    0.196292] .... node  #0, CPUs:      #1
>>
>> I'm taking a guess here, but it may be that this is interfering with
>> the APIC accesses.
>
> Seems it hang when startup 2nd cpu. It may give more information if
> add below line to the beginning of arch/x86/kernel/smpboot.c and
> rebuild bzImage.
>
> #define DEBUG

That didn't provide any useful information.  However, when I boot with
"nosmp", I do get an oops in load_microcode_amd().  I can't capture
the oops message (no serial console), but it's being called from
save_microcode_in_initrd_amd().

--
Brian Gerst

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [tip:x86/boot] x86/mm: Enable KASLR for physical mapping memory regions
  2016-08-16 11:31         ` Brian Gerst
@ 2016-08-16 13:42           ` Borislav Petkov
  2016-08-16 13:49             ` Borislav Petkov
  0 siblings, 1 reply; 74+ messages in thread
From: Borislav Petkov @ 2016-08-16 13:42 UTC (permalink / raw)
  To: Brian Gerst
  Cc: Baoquan He, Yinghai Lu, Juergen Gross, Thomas Garnier,
	Andy Lutomirski, H. Peter Anvin, Alexander Kuleshov,
	Josh Poimboeuf, borntraeger, sds, kirill.shutemov, jroedel,
	Kees Cook, Williams, Dan J, Mark Salter, Borislav Petkov,
	Jonathan Corbet, matt, guangrong.xiao, aneesh.kumar, Ingo Molnar,
	Linus Torvalds, Dave Hansen, toshi.kani, alpopov,
	Linux Kernel Mailing List, Jan Beulich, Andrew Morton,
	Boris Ostrovsky, Denys Vlasenko, Peter Zijlstra, dyoung,
	Thomas Gleixner, Dmitry Vyukov, lv.zheng, schwidefsky,
	linux-tip-commits

On Tue, Aug 16, 2016 at 07:31:20AM -0400, Brian Gerst wrote:
> That didn't provide any useful information.  However, when I boot with
> "nosmp", I do get an oops in load_microcode_amd().  I can't capture
> the oops message (no serial console), but it's being called from
> save_microcode_in_initrd_amd().

That is possible. KASLR already broke microcode loading on Intel. :-\

I'll try to reproduce and fix this at some point but am away currently
so don't hold your breath. Does "dis_ucode_ldr" on the kernel cmdline
get you any further?

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [tip:x86/boot] x86/mm: Enable KASLR for physical mapping memory regions
  2016-08-16 13:42           ` Borislav Petkov
@ 2016-08-16 13:49             ` Borislav Petkov
  2016-08-16 15:54               ` Borislav Petkov
  0 siblings, 1 reply; 74+ messages in thread
From: Borislav Petkov @ 2016-08-16 13:49 UTC (permalink / raw)
  To: Brian Gerst
  Cc: Baoquan He, Yinghai Lu, Juergen Gross, Thomas Garnier,
	Andy Lutomirski, H. Peter Anvin, Alexander Kuleshov,
	Josh Poimboeuf, borntraeger, sds, kirill.shutemov, jroedel,
	Kees Cook, Williams, Dan J, Mark Salter, Borislav Petkov,
	Jonathan Corbet, matt, guangrong.xiao, aneesh.kumar, Ingo Molnar,
	Linus Torvalds, Dave Hansen, toshi.kani, alpopov,
	Linux Kernel Mailing List, Jan Beulich, Andrew Morton,
	Boris Ostrovsky, Denys Vlasenko, Peter Zijlstra, dyoung,
	Thomas Gleixner, Dmitry Vyukov, lv.zheng, schwidefsky,
	linux-tip-commits

On Tue, Aug 16, 2016 at 03:42:05PM +0200, Borislav Petkov wrote:
> I'll try to reproduce and fix this at some point but am away currently
> so don't hold your breath. Does "dis_ucode_ldr" on the kernel cmdline
> get you any further?

Just a stab in the dark: does something like that help?

---
diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
index 27a0228c9cae..2debaf119baf 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -434,6 +434,10 @@ int __init save_microcode_in_initrd_amd(void)
 	else
 		container = cont_va;
 
+#ifdef CONFIG_RANDOMIZE_MEMORY
+	container += PAGE_OFFSET - __PAGE_OFFSET_BASE;
+#endif
+
 	eax   = cpuid_eax(0x00000001);
 	eax   = ((eax >> 8) & 0xf) + ((eax >> 20) & 0xff);
 
---
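
The idea, as a sketch of the arithmetic: container still holds a virtual
address computed against the static direct-map base, so it has to be
rebased into the randomized mapping:

  va_new = va_old - __PAGE_OFFSET_BASE + page_offset_base

which is what the PAGE_OFFSET - __PAGE_OFFSET_BASE adjustment does,
since CONFIG_RANDOMIZE_MEMORY defines __PAGE_OFFSET as page_offset_base.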

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [tip:x86/boot] x86/mm: Enable KASLR for physical mapping memory regions
  2016-08-16 13:49             ` Borislav Petkov
@ 2016-08-16 15:54               ` Borislav Petkov
  2016-08-16 17:50                 ` Borislav Petkov
  0 siblings, 1 reply; 74+ messages in thread
From: Borislav Petkov @ 2016-08-16 15:54 UTC (permalink / raw)
  To: Brian Gerst
  Cc: Baoquan He, Yinghai Lu, Juergen Gross, Thomas Garnier,
	Andy Lutomirski, H. Peter Anvin, Alexander Kuleshov,
	Josh Poimboeuf, borntraeger, sds, kirill.shutemov, jroedel,
	Kees Cook, Williams, Dan J, Mark Salter, Borislav Petkov,
	Jonathan Corbet, matt, guangrong.xiao, aneesh.kumar, Ingo Molnar,
	Linus Torvalds, Dave Hansen, toshi.kani, alpopov,
	Linux Kernel Mailing List, Jan Beulich, Andrew Morton,
	Boris Ostrovsky, Denys Vlasenko, Peter Zijlstra, dyoung,
	Thomas Gleixner, Dmitry Vyukov, lv.zheng, schwidefsky,
	linux-tip-commits

On Tue, Aug 16, 2016 at 03:49:28PM +0200, Borislav Petkov wrote:
> Just a stab in the dark: does something like that help?
> 
> ---
> diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
> index 27a0228c9cae..2debaf119baf 100644
> --- a/arch/x86/kernel/cpu/microcode/amd.c
> +++ b/arch/x86/kernel/cpu/microcode/amd.c
> @@ -434,6 +434,10 @@ int __init save_microcode_in_initrd_amd(void)
>  	else
>  		container = cont_va;
>  
> +#ifdef CONFIG_RANDOMIZE_MEMORY
> +	container += PAGE_OFFSET - __PAGE_OFFSET_BASE;
> +#endif
> +
>  	eax   = cpuid_eax(0x00000001);
>  	eax   = ((eax >> 8) & 0xf) + ((eax >> 20) & 0xff);
>  
> ---

Ok, I ran this in a guest and it finds the microcode patches properly.

My .config has:

CONFIG_ARCH_HAS_ELF_RANDOMIZE=y
CONFIG_RANDOMIZE_BASE=y
CONFIG_RANDOMIZE_MEMORY=y
CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING=0x0

When you run this, please check whether it really applies the microcode
on every core.

Thanks.

Without the above, I get:

[    0.432103] BUG: unable to handle kernel paging request at ffff88007fa5540c
[    0.436000] IP: [<ffffffffbc03ea8b>] load_microcode_amd+0x2b/0x3b0
[    0.436000] PGD 0 
[    0.436000] Oops: 0000 [#1] PREEMPT SMP
[    0.436000] Modules linked in:
[    0.436000] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.8.0-rc1+ #16
[    0.436000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
[    0.436000] task: ffff9ebf7b460000 task.stack: ffff9ebf7bffc000
[    0.436000] RIP: 0010:[<ffffffffbc03ea8b>]  [<ffffffffbc03ea8b>] load_microcode_amd+0x2b/0x3b0
[    0.436000] RSP: 0018:ffff9ebf7bfffe08  EFLAGS: 00010246
[    0.436000] RAX: 0000000080000000 RBX: 0000000000000006 RCX: 0000000000001ec4
[    0.436000] RDX: ffff88007fa55408 RSI: 0000000000000015 RDI: 0000000000000000
[    0.436000] RBP: ffff9ebf7bfffe48 R08: ffff9ebf7b805960 R09: 0000000000000000
[    0.436000] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000001ec4
[    0.436000] R13: ffff88007fa55408 R14: 0000000000000015 R15: 0000000000001ec4
[    0.436000] FS:  0000000000000000(0000) GS:ffff9ebf7e800000(0000) knlGS:0000000000000000
[    0.436000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.436000] CR2: ffff88007fa5540c CR3: 0000000067c06000 CR4: 00000000000406f0
[    0.436000] Stack:
[    0.436000]  ffff9ebf7b805900 0000000000000246 0000000000000006 0000000000000006
[    0.436000]  0000000000000000 000000000000000f ffff88007fa55408 0000000000001ec4
[    0.436000]  ffff9ebf7bfffe80 ffffffffbcd49762 0000000000000000 00000000000000ee
[    0.436000] Call Trace:
[    0.436000]  [<ffffffffbcd49762>] save_microcode_in_initrd_amd+0xac/0xe0
[    0.436000]  [<ffffffffbcd49037>] ? microcode_init+0x1b1/0x1b1
[    0.436000]  [<ffffffffbcd49073>] save_microcode_in_initrd+0x3c/0x45
[    0.436000]  [<ffffffffbc000459>] do_one_initcall+0x59/0x190
[    0.436000]  [<ffffffffbc08ccc1>] ? parse_args+0x271/0x400
[    0.436000]  [<ffffffffbcd3d089>] kernel_init_freeable+0x118/0x19e
[    0.436000]  [<ffffffffbc76aa3e>] kernel_init+0xe/0x100
[    0.436000]  [<ffffffffbc7749ef>] ret_from_fork+0x1f/0x40
[    0.436000]  [<ffffffffbc76aa30>] ? rest_init+0x140/0x140
[    0.436000] Code: 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 89 f6 41 55 49 89 d5 41 54 49 89 cc 53 48 83 ec 18 48 8b 3d d2 d9 e2 00 e8 e5 15 18 00 <41> 8b 4d 04 48 c7 05 be d9 e2 00 00 00 00 00 41 8b 5d 08 85 c9 
[    0.436000] RIP  [<ffffffffbc03ea8b>] load_microcode_amd+0x2b/0x3b0
[    0.436000]  RSP <ffff9ebf7bfffe08>
[    0.436000] CR2: ffff88007fa5540c
[    0.436000] ---[ end trace 21a612b6619d1c00 ]---
[    0.436019] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
[    0.436019] 
[    0.438843] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
[    0.438843]

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [tip:x86/boot] x86/mm: Enable KASLR for physical mapping memory regions
  2016-08-16 15:54               ` Borislav Petkov
@ 2016-08-16 17:50                 ` Borislav Petkov
  2016-08-16 19:49                   ` Kees Cook
  0 siblings, 1 reply; 74+ messages in thread
From: Borislav Petkov @ 2016-08-16 17:50 UTC (permalink / raw)
  To: Brian Gerst
  Cc: Baoquan He, Yinghai Lu, Juergen Gross, Thomas Garnier,
	Andy Lutomirski, H. Peter Anvin, Alexander Kuleshov,
	Josh Poimboeuf, borntraeger, sds, kirill.shutemov, jroedel,
	Kees Cook, Williams, Dan J, Mark Salter, Borislav Petkov,
	Jonathan Corbet, matt, guangrong.xiao, aneesh.kumar, Ingo Molnar,
	Linus Torvalds, Dave Hansen, toshi.kani, alpopov,
	Linux Kernel Mailing List, Jan Beulich, Andrew Morton,
	Boris Ostrovsky, Denys Vlasenko, Peter Zijlstra, dyoung,
	Thomas Gleixner, Dmitry Vyukov, lv.zheng, schwidefsky,
	linux-tip-commits

On Tue, Aug 16, 2016 at 05:54:12PM +0200, Borislav Petkov wrote:
> Ok, I ran this in a guest and it finds the microcode patches properly.

Here's a better version to take care of the APs too:

---
diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
index 27a0228c9cae..05242322e324 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -355,6 +355,7 @@ void load_ucode_amd_ap(void)
 	unsigned int cpu = smp_processor_id();
 	struct equiv_cpu_entry *eq;
 	struct microcode_amd *mc;
+	u8 *cont;
 	u32 rev, eax;
 	u16 eq_id;
 
@@ -371,8 +372,12 @@ void load_ucode_amd_ap(void)
 	if (check_current_patch_level(&rev, false))
 		return;
 
+#ifdef CONFIG_RANDOMIZE_MEMORY
+	cont = container + PAGE_OFFSET - __PAGE_OFFSET_BASE;
+#endif
+
 	eax = cpuid_eax(0x00000001);
-	eq  = (struct equiv_cpu_entry *)(container + CONTAINER_HDR_SZ);
+	eq  = (struct equiv_cpu_entry *)(cont + CONTAINER_HDR_SZ);
 
 	eq_id = find_equiv_id(eq, eax);
 	if (!eq_id)
@@ -434,6 +439,10 @@ int __init save_microcode_in_initrd_amd(void)
 	else
 		container = cont_va;
 
+#ifdef CONFIG_RANDOMIZE_MEMORY
+	container += PAGE_OFFSET - __PAGE_OFFSET_BASE;
+#endif
+
 	eax   = cpuid_eax(0x00000001);
 	eax   = ((eax >> 8) & 0xf) + ((eax >> 20) & 0xff);
 

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--


* Re: [tip:x86/boot] x86/mm: Enable KASLR for physical mapping memory regions
  2016-08-16 17:50                 ` Borislav Petkov
@ 2016-08-16 19:49                   ` Kees Cook
  2016-08-16 21:01                     ` Borislav Petkov
  0 siblings, 1 reply; 74+ messages in thread
From: Kees Cook @ 2016-08-16 19:49 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Brian Gerst, Baoquan He, Yinghai Lu, Juergen Gross,
	Thomas Garnier, Andy Lutomirski, H. Peter Anvin,
	Alexander Kuleshov, Josh Poimboeuf, Christian Borntraeger,
	Stephen Smalley, Kirill A. Shutemov, Joerg Roedel, Williams,
	Dan J, Mark Salter, Borislav Petkov, Jonathan Corbet,
	Matt Fleming, Xiao Guangrong, Aneesh Kumar K.V, Ingo Molnar,
	Linus Torvalds, Dave Hansen, Toshi Kani, Alexander Popov,
	Linux Kernel Mailing List, Jan Beulich, Andrew Morton,
	Boris Ostrovsky, Denys Vlasenko, Peter Zijlstra, Dave Young,
	Thomas Gleixner, Dmitry Vyukov, Lv Zheng, Martin Schwidefsky,
	linux-tip-commits

On Tue, Aug 16, 2016 at 10:50 AM, Borislav Petkov <bp@alien8.de> wrote:
> On Tue, Aug 16, 2016 at 05:54:12PM +0200, Borislav Petkov wrote:
>> Ok, I ran this in a guest and it finds the microcode patches properly.
>
> Here's a better version to take care of the APs too:
>
> ---
> diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
> index 27a0228c9cae..05242322e324 100644
> --- a/arch/x86/kernel/cpu/microcode/amd.c
> +++ b/arch/x86/kernel/cpu/microcode/amd.c
> @@ -355,6 +355,7 @@ void load_ucode_amd_ap(void)
>         unsigned int cpu = smp_processor_id();
>         struct equiv_cpu_entry *eq;
>         struct microcode_amd *mc;
> +       u8 *cont;
>         u32 rev, eax;
>         u16 eq_id;
>
> @@ -371,8 +372,12 @@ void load_ucode_amd_ap(void)
>         if (check_current_patch_level(&rev, false))
>                 return;
>
> +#ifdef CONFIG_RANDOMIZE_MEMORY
> +       cont = container + PAGE_OFFSET - __PAGE_OFFSET_BASE;
> +#endif
> +
>         eax = cpuid_eax(0x00000001);
> -       eq  = (struct equiv_cpu_entry *)(container + CONTAINER_HDR_SZ);
> +       eq  = (struct equiv_cpu_entry *)(cont + CONTAINER_HDR_SZ);

Am I misreading this? Shouldn't it be:

       cont = container;
#ifdef CONFIG_RANDOMIZE_MEMORY
       cont += PAGE_OFFSET - __PAGE_OFFSET_BASE;
#endif

(otherwise cont is undefined in the RANDOMIZE_MEMORY=n case?)
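
Reading an uninitialized automatic pointer is undefined behavior even before
any offset is applied, so the assignment has to happen in both
configurations. A reduced standalone sketch of the corrected shape (the
config macro and the delta are stand-ins, not the kernel's values):

#include <assert.h>

static unsigned char container[32];	/* stands in for the initrd blob */

int main(void)
{
	unsigned char *cont = container;	/* defined with or without KASLR */

#ifdef CONFIG_RANDOMIZE_MEMORY
	cont += 16;				/* assumed randomization delta */
#else
	assert(cont == container);		/* no slide: pointer unchanged */
#endif
	(void)cont;
	return 0;
}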

>
>         eq_id = find_equiv_id(eq, eax);
>         if (!eq_id)
> @@ -434,6 +439,10 @@ int __init save_microcode_in_initrd_amd(void)
>         else
>                 container = cont_va;
>
> +#ifdef CONFIG_RANDOMIZE_MEMORY
> +       container += PAGE_OFFSET - __PAGE_OFFSET_BASE;
> +#endif
> +
>         eax   = cpuid_eax(0x00000001);
>         eax   = ((eax >> 8) & 0xf) + ((eax >> 20) & 0xff);
>
>
> --
> Regards/Gruss,
>     Boris.
>
> ECO tip #101: Trim your mails when you reply.
> --


-Kees

-- 
Kees Cook
Nexus Security


* Re: [tip:x86/boot] x86/mm: Enable KASLR for physical mapping memory regions
  2016-08-16 19:49                   ` Kees Cook
@ 2016-08-16 21:01                     ` Borislav Petkov
  2016-08-17  0:31                       ` Brian Gerst
  0 siblings, 1 reply; 74+ messages in thread
From: Borislav Petkov @ 2016-08-16 21:01 UTC (permalink / raw)
  To: Kees Cook
  Cc: Brian Gerst, Baoquan He, Yinghai Lu, Juergen Gross,
	Thomas Garnier, Andy Lutomirski, H. Peter Anvin,
	Alexander Kuleshov, Josh Poimboeuf, Christian Borntraeger,
	Stephen Smalley, Kirill A. Shutemov, Joerg Roedel, Williams,
	Dan J, Mark Salter, Borislav Petkov, Jonathan Corbet,
	Matt Fleming, Xiao Guangrong, Aneesh Kumar K.V, Ingo Molnar,
	Linus Torvalds, Dave Hansen, Toshi Kani, Alexander Popov,
	Linux Kernel Mailing List, Jan Beulich, Andrew Morton,
	Boris Ostrovsky, Denys Vlasenko, Peter Zijlstra, Dave Young,
	Thomas Gleixner, Dmitry Vyukov, Lv Zheng, Martin Schwidefsky,
	linux-tip-commits

On Tue, Aug 16, 2016 at 12:49:52PM -0700, Kees Cook wrote:
> Am I misreading this?

No, you're not.

> Shouldn't it be:
> 
>        cont = container;
> #ifdef CONFIG_RANDOMIZE_MEMORY
>        cont += PAGE_OFFSET - __PAGE_OFFSET_BASE;
> #endif
> 
> (otherwise cont is undefined in the RANDOMIZE_MEMORY=n case?)

Thanks for catching this, Kees. I'll send a new version in the morning.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--


* Re: [tip:x86/boot] x86/mm: Enable KASLR for physical mapping memory regions
  2016-08-16 21:01                     ` Borislav Petkov
@ 2016-08-17  0:31                       ` Brian Gerst
  2016-08-17  9:11                         ` Borislav Petkov
  0 siblings, 1 reply; 74+ messages in thread
From: Brian Gerst @ 2016-08-17  0:31 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Kees Cook, Baoquan He, Yinghai Lu, Juergen Gross, Thomas Garnier,
	Andy Lutomirski, H. Peter Anvin, Alexander Kuleshov,
	Josh Poimboeuf, Christian Borntraeger, Stephen Smalley,
	Kirill A. Shutemov, Joerg Roedel, Williams, Dan J, Mark Salter,
	Borislav Petkov, Jonathan Corbet, Matt Fleming, Xiao Guangrong,
	Aneesh Kumar K.V, Ingo Molnar, Linus Torvalds, Dave Hansen,
	Toshi Kani, Alexander Popov, Linux Kernel Mailing List,
	Jan Beulich, Andrew Morton, Boris Ostrovsky, Denys Vlasenko,
	Peter Zijlstra, Dave Young, Thomas Gleixner, Dmitry Vyukov,
	Lv Zheng, Martin Schwidefsky, linux-tip-commits

On Tue, Aug 16, 2016 at 5:01 PM, Borislav Petkov <bp@alien8.de> wrote:
> On Tue, Aug 16, 2016 at 12:49:52PM -0700, Kees Cook wrote:
>> Am I misreading this?
>
> No you're not.
>
>> Shouldn't it be:
>>
>>        cont = container;
>> #ifdef CONFIG_RANDOMIZE_MEMORY
>>        cont += PAGE_OFFSET - __PAGE_OFFSET_BASE;
>> #endif
>>
>> (otherwise cont is undefined in the RANDOMIZE_MEMORY=n case?)
>
> Thanks for catching this Kees, I'll send a new version in the morning.

These fixes work for my system.

--
Brian Gerst


* Re: [tip:x86/boot] x86/mm: Enable KASLR for physical mapping memory regions
  2016-08-17  0:31                       ` Brian Gerst
@ 2016-08-17  9:11                         ` Borislav Petkov
  2016-08-17 10:19                           ` Ingo Molnar
  0 siblings, 1 reply; 74+ messages in thread
From: Borislav Petkov @ 2016-08-17  9:11 UTC (permalink / raw)
  To: Brian Gerst, Ingo Molnar
  Cc: Kees Cook, Baoquan He, Yinghai Lu, Juergen Gross, Thomas Garnier,
	Andy Lutomirski, H. Peter Anvin, Alexander Kuleshov,
	Josh Poimboeuf, Christian Borntraeger, Stephen Smalley,
	Kirill A. Shutemov, Joerg Roedel, Williams, Dan J, Mark Salter,
	Borislav Petkov, Jonathan Corbet, Matt Fleming, Xiao Guangrong,
	Aneesh Kumar K.V, Linus Torvalds, Dave Hansen, Toshi Kani,
	Alexander Popov, Linux Kernel Mailing List, Jan Beulich,
	Andrew Morton, Boris Ostrovsky, Denys Vlasenko, Peter Zijlstra,
	Dave Young, Thomas Gleixner, Dmitry Vyukov, Lv Zheng,
	Martin Schwidefsky, linux-tip-commits

On Tue, Aug 16, 2016 at 08:31:28PM -0400, Brian Gerst wrote:
> These fixes work for my system.

Thanks Brian.

Ingo, please queue this into x86/urgent.

Thanks!

---
From: Borislav Petkov <bp@suse.de>
Date: Wed, 17 Aug 2016 08:23:29 +0200
Subject: [PATCH] x86/microcode/AMD: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=y

Similar to

  efaad554b4ff ("x86/microcode/intel: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=y")

fix microcode loading from the initrd on AMD by adding the randomization
offset to the microcode patch container within the initrd.

Reported-and-tested-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Kees Cook <keescook@chromium.org>
---
 arch/x86/kernel/cpu/microcode/amd.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
index 27a0228c9cae..f43b774d0684 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -355,6 +355,7 @@ void load_ucode_amd_ap(void)
 	unsigned int cpu = smp_processor_id();
 	struct equiv_cpu_entry *eq;
 	struct microcode_amd *mc;
+	u8 *cont;
 	u32 rev, eax;
 	u16 eq_id;
 
@@ -371,8 +372,14 @@ void load_ucode_amd_ap(void)
 	if (check_current_patch_level(&rev, false))
 		return;
 
+	cont = container;
+
+#ifdef CONFIG_RANDOMIZE_MEMORY
+	cont += PAGE_OFFSET - __PAGE_OFFSET_BASE;
+#endif
+
 	eax = cpuid_eax(0x00000001);
-	eq  = (struct equiv_cpu_entry *)(container + CONTAINER_HDR_SZ);
+	eq  = (struct equiv_cpu_entry *)(cont + CONTAINER_HDR_SZ);
 
 	eq_id = find_equiv_id(eq, eax);
 	if (!eq_id)
@@ -434,6 +441,10 @@ int __init save_microcode_in_initrd_amd(void)
 	else
 		container = cont_va;
 
+#ifdef CONFIG_RANDOMIZE_MEMORY
+	container += PAGE_OFFSET - __PAGE_OFFSET_BASE;
+#endif
+
 	eax   = cpuid_eax(0x00000001);
 	eax   = ((eax >> 8) & 0xf) + ((eax >> 20) & 0xff);
 
-- 
2.8.4

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--


* Re: [tip:x86/boot] x86/mm: Enable KASLR for physical mapping memory regions
  2016-08-17  9:11                         ` Borislav Petkov
@ 2016-08-17 10:19                           ` Ingo Molnar
  2016-08-17 11:33                             ` Borislav Petkov
  0 siblings, 1 reply; 74+ messages in thread
From: Ingo Molnar @ 2016-08-17 10:19 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Brian Gerst, Kees Cook, Baoquan He, Yinghai Lu, Juergen Gross,
	Thomas Garnier, Andy Lutomirski, H. Peter Anvin,
	Alexander Kuleshov, Josh Poimboeuf, Christian Borntraeger,
	Stephen Smalley, Kirill A. Shutemov, Joerg Roedel, Williams,
	Dan J, Mark Salter, Borislav Petkov, Jonathan Corbet,
	Matt Fleming, Xiao Guangrong, Aneesh Kumar K.V, Linus Torvalds,
	Dave Hansen, Toshi Kani, Alexander Popov,
	Linux Kernel Mailing List, Jan Beulich, Andrew Morton,
	Boris Ostrovsky, Denys Vlasenko, Peter Zijlstra, Dave Young,
	Thomas Gleixner, Dmitry Vyukov, Lv Zheng, Martin Schwidefsky,
	linux-tip-commits


* Borislav Petkov <bp@alien8.de> wrote:

> On Tue, Aug 16, 2016 at 08:31:28PM -0400, Brian Gerst wrote:
> > These fixes work for my system.
> 
> Thanks Brian.
> 
> Ingo, please queue this into x86/urgent.
> 
> Thanks!
> 
> ---
> From: Borislav Petkov <bp@suse.de>
> Date: Wed, 17 Aug 2016 08:23:29 +0200
> Subject: [PATCH] x86/microcode/AMD: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=y
> 
> Similar to
> 
>   efaad554b4ff ("x86/microcode/intel: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=y")
> 
> fix microcode loading from the initrd on AMD by adding the randomization
> offset to the microcode patch container within the initrd.
> 
> Reported-and-tested-by: Brian Gerst <brgerst@gmail.com>
> Signed-off-by: Borislav Petkov <bp@suse.de>
> Cc: Kees Cook <keescook@chromium.org>
> ---
>  arch/x86/kernel/cpu/microcode/amd.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
> index 27a0228c9cae..f43b774d0684 100644
> --- a/arch/x86/kernel/cpu/microcode/amd.c
> +++ b/arch/x86/kernel/cpu/microcode/amd.c
> @@ -355,6 +355,7 @@ void load_ucode_amd_ap(void)
>  	unsigned int cpu = smp_processor_id();
>  	struct equiv_cpu_entry *eq;
>  	struct microcode_amd *mc;
> +	u8 *cont;
>  	u32 rev, eax;
>  	u16 eq_id;
>  
> @@ -371,8 +372,14 @@ void load_ucode_amd_ap(void)
>  	if (check_current_patch_level(&rev, false))
>  		return;
>  
> +	cont = container;
> +
> +#ifdef CONFIG_RANDOMIZE_MEMORY
> +	cont += PAGE_OFFSET - __PAGE_OFFSET_BASE;
> +#endif
> +
>  	eax = cpuid_eax(0x00000001);
> -	eq  = (struct equiv_cpu_entry *)(container + CONTAINER_HDR_SZ);
> +	eq  = (struct equiv_cpu_entry *)(cont + CONTAINER_HDR_SZ);
>  
>  	eq_id = find_equiv_id(eq, eax);
>  	if (!eq_id)
> @@ -434,6 +441,10 @@ int __init save_microcode_in_initrd_amd(void)
>  	else
>  		container = cont_va;
>  
> +#ifdef CONFIG_RANDOMIZE_MEMORY
> +	container += PAGE_OFFSET - __PAGE_OFFSET_BASE;
> +#endif
> +
>  	eax   = cpuid_eax(0x00000001);
>  	eax   = ((eax >> 8) & 0xf) + ((eax >> 20) & 0xff);

So I really hate this pattern, and we already have it in 
arch/x86/kernel/cpu/microcode/intel.c as well:

                start += PAGE_OFFSET - __PAGE_OFFSET_BASE;

and note that it's not #ifdefed there - I think it's safe to leave out the #ifdef?
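
(With randomization off, PAGE_OFFSET expands to __PAGE_OFFSET_BASE itself,
so the delta is a compile-time zero and the unconditional form should be a
no-op.) A standalone sketch of that macro plumbing, with illustrative
stand-ins for the kernel's exact layout:

#include <assert.h>
#include <stdint.h>

#define __PAGE_OFFSET_BASE	0xffff880000000000ULL

#ifdef CONFIG_RANDOMIZE_MEMORY
uint64_t page_offset_base = __PAGE_OFFSET_BASE;	/* slid at boot */
#define PAGE_OFFSET	page_offset_base
#else
#define PAGE_OFFSET	__PAGE_OFFSET_BASE	/* identical: delta is zero */
#endif

int main(void)
{
	uint8_t blob[8];
	uint8_t *p = blob + (PAGE_OFFSET - __PAGE_OFFSET_BASE);

#ifndef CONFIG_RANDOMIZE_MEMORY
	assert(p == blob);	/* the adjustment is a no-op when off */
#endif
	(void)p;
	return 0;
}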

Thanks,

	Ingo


* Re: [tip:x86/boot] x86/mm: Enable KASLR for physical mapping memory regions
  2016-08-17 10:19                           ` Ingo Molnar
@ 2016-08-17 11:33                             ` Borislav Petkov
  2016-08-18 10:49                               ` [tip:x86/urgent] x86/microcode/AMD: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=y tip-bot for Borislav Petkov
  0 siblings, 1 reply; 74+ messages in thread
From: Borislav Petkov @ 2016-08-17 11:33 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Brian Gerst, Kees Cook, Baoquan He, Yinghai Lu, Juergen Gross,
	Thomas Garnier, Andy Lutomirski, H. Peter Anvin,
	Alexander Kuleshov, Josh Poimboeuf, Christian Borntraeger,
	Stephen Smalley, Kirill A. Shutemov, Joerg Roedel, Williams,
	Dan J, Mark Salter, Borislav Petkov, Jonathan Corbet,
	Matt Fleming, Xiao Guangrong, Aneesh Kumar K.V, Linus Torvalds,
	Dave Hansen, Toshi Kani, Alexander Popov,
	Linux Kernel Mailing List, Jan Beulich, Andrew Morton,
	Boris Ostrovsky, Denys Vlasenko, Peter Zijlstra, Dave Young,
	Thomas Gleixner, Dmitry Vyukov, Lv Zheng, Martin Schwidefsky,
	linux-tip-commits

On Wed, Aug 17, 2016 at 12:19:48PM +0200, Ingo Molnar wrote:
> So I really hate this pattern, and we already have it in 
> arch/x86/kernel/cpu/microcode/intel.c as well:
> 
>                 start += PAGE_OFFSET - __PAGE_OFFSET_BASE;
> 
> and note that it's not #ifdefed there - I think it's safe to leave out the #ifdef?

Ah, yes, we got rid of the ifdeffery in:

  4a1a8e1b8f9f ("x86/asm, x86/microcode: Add __PAGE_OFFSET_BASE define on 32-bit")

So here's v2:

---
From: Borislav Petkov <bp@suse.de>
Date: Wed, 17 Aug 2016 08:23:29 +0200
Subject: [PATCH] x86/microcode/AMD: Fix initrd loading with
 CONFIG_RANDOMIZE_MEMORY=y

Similar to

  efaad554b4ff ("x86/microcode/intel: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=y")

fix microcode loading from the initrd on AMD by adding the randomization
offset to the microcode patch container within the initrd.

Reported-and-tested-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Kees Cook <keescook@chromium.org>
---
 arch/x86/kernel/cpu/microcode/amd.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
index 27a0228c9cae..b816971f5da4 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -355,6 +355,7 @@ void load_ucode_amd_ap(void)
 	unsigned int cpu = smp_processor_id();
 	struct equiv_cpu_entry *eq;
 	struct microcode_amd *mc;
+	u8 *cont = container;
 	u32 rev, eax;
 	u16 eq_id;
 
@@ -371,8 +372,11 @@ void load_ucode_amd_ap(void)
 	if (check_current_patch_level(&rev, false))
 		return;
 
+	/* Add CONFIG_RANDOMIZE_MEMORY offset. */
+	cont += PAGE_OFFSET - __PAGE_OFFSET_BASE;
+
 	eax = cpuid_eax(0x00000001);
-	eq  = (struct equiv_cpu_entry *)(container + CONTAINER_HDR_SZ);
+	eq  = (struct equiv_cpu_entry *)(cont + CONTAINER_HDR_SZ);
 
 	eq_id = find_equiv_id(eq, eax);
 	if (!eq_id)
@@ -434,6 +438,9 @@ int __init save_microcode_in_initrd_amd(void)
 	else
 		container = cont_va;
 
+	/* Add CONFIG_RANDOMIZE_MEMORY offset. */
+	container += PAGE_OFFSET - __PAGE_OFFSET_BASE;
+
 	eax   = cpuid_eax(0x00000001);
 	eax   = ((eax >> 8) & 0xf) + ((eax >> 20) & 0xff);
 
-- 
2.8.4

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--


* [tip:x86/urgent] x86/microcode/AMD: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=y
  2016-08-17 11:33                             ` Borislav Petkov
@ 2016-08-18 10:49                               ` tip-bot for Borislav Petkov
  0 siblings, 0 replies; 74+ messages in thread
From: tip-bot for Borislav Petkov @ 2016-08-18 10:49 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: brgerst, bp, linux-kernel, bp, tglx, keescook, jpoimboe,
	dvlasenk, hpa, mingo, peterz, torvalds, luto

Commit-ID:  88b2f634028f1f38dcc3d412e10ff1f224976daa
Gitweb:     http://git.kernel.org/tip/88b2f634028f1f38dcc3d412e10ff1f224976daa
Author:     Borislav Petkov <bp@alien8.de>
AuthorDate: Wed, 17 Aug 2016 13:33:14 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 18 Aug 2016 10:06:49 +0200

x86/microcode/AMD: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=y

Similar to:

  efaad554b4ff ("x86/microcode/intel: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=y")

... fix microcode loading from the initrd on AMD by adding the
randomization offset to the microcode patch container within the initrd.

Reported-and-tested-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-tip-commits@vger.kernel.org
Link: http://lkml.kernel.org/r/20160817113314.GA19221@nazgul.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/kernel/cpu/microcode/amd.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
index 27a0228..b816971 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -355,6 +355,7 @@ void load_ucode_amd_ap(void)
 	unsigned int cpu = smp_processor_id();
 	struct equiv_cpu_entry *eq;
 	struct microcode_amd *mc;
+	u8 *cont = container;
 	u32 rev, eax;
 	u16 eq_id;
 
@@ -371,8 +372,11 @@ void load_ucode_amd_ap(void)
 	if (check_current_patch_level(&rev, false))
 		return;
 
+	/* Add CONFIG_RANDOMIZE_MEMORY offset. */
+	cont += PAGE_OFFSET - __PAGE_OFFSET_BASE;
+
 	eax = cpuid_eax(0x00000001);
-	eq  = (struct equiv_cpu_entry *)(container + CONTAINER_HDR_SZ);
+	eq  = (struct equiv_cpu_entry *)(cont + CONTAINER_HDR_SZ);
 
 	eq_id = find_equiv_id(eq, eax);
 	if (!eq_id)
@@ -434,6 +438,9 @@ int __init save_microcode_in_initrd_amd(void)
 	else
 		container = cont_va;
 
+	/* Add CONFIG_RANDOMIZE_MEMORY offset. */
+	container += PAGE_OFFSET - __PAGE_OFFSET_BASE;
+
 	eax   = cpuid_eax(0x00000001);
 	eax   = ((eax >> 8) & 0xf) + ((eax >> 20) & 0xff);
 



Thread overview: 74+ messages
2016-06-22  0:46 [PATCH v7 0/9] x86/mm: memory area address KASLR Kees Cook
2016-06-22  0:46 ` [kernel-hardening] " Kees Cook
2016-06-22  0:46 ` [PATCH v7 1/9] x86/mm: Refactor KASLR entropy functions Kees Cook
2016-06-22  0:46   ` [kernel-hardening] " Kees Cook
2016-07-08 20:33   ` [tip:x86/boot] " tip-bot for Thomas Garnier
2016-06-22  0:46 ` [PATCH v7 2/9] x86/mm: Update physical mapping variable names (x86_64) Kees Cook
2016-06-22  0:46   ` [kernel-hardening] " Kees Cook
2016-07-08 20:34   ` [tip:x86/boot] x86/mm: Update physical mapping variable names tip-bot for Thomas Garnier
2016-06-22  0:47 ` [PATCH v7 3/9] x86/mm: PUD VA support for physical mapping (x86_64) Kees Cook
2016-06-22  0:47   ` [kernel-hardening] " Kees Cook
2016-07-08 20:34   ` [tip:x86/boot] x86/mm: Add PUD VA support for physical mapping tip-bot for Thomas Garnier
2016-06-22  0:47 ` [PATCH v7 4/9] x86/mm: Separate variable for trampoline PGD (x86_64) Kees Cook
2016-06-22  0:47   ` [kernel-hardening] " Kees Cook
2016-07-08 20:35   ` [tip:x86/boot] x86/mm: Separate variable for trampoline PGD tip-bot for Thomas Garnier
2016-06-22  0:47 ` [PATCH v7 5/9] x86/mm: Implement ASLR for kernel memory regions (x86_64) Kees Cook
2016-06-22  0:47   ` [kernel-hardening] " Kees Cook
2016-07-08 20:35   ` [tip:x86/boot] x86/mm: Implement ASLR for kernel memory regions tip-bot for Thomas Garnier
2016-06-22  0:47 ` [PATCH v7 6/9] x86/mm: Enable KASLR for physical mapping memory region (x86_64) Kees Cook
2016-06-22  0:47   ` [kernel-hardening] " Kees Cook
2016-07-08 20:35   ` [tip:x86/boot] x86/mm: Enable KASLR for physical mapping memory regions tip-bot for Thomas Garnier
2016-08-14  4:25     ` Brian Gerst
2016-08-14 23:26       ` Baoquan He
2016-08-16 11:31         ` Brian Gerst
2016-08-16 13:42           ` Borislav Petkov
2016-08-16 13:49             ` Borislav Petkov
2016-08-16 15:54               ` Borislav Petkov
2016-08-16 17:50                 ` Borislav Petkov
2016-08-16 19:49                   ` Kees Cook
2016-08-16 21:01                     ` Borislav Petkov
2016-08-17  0:31                       ` Brian Gerst
2016-08-17  9:11                         ` Borislav Petkov
2016-08-17 10:19                           ` Ingo Molnar
2016-08-17 11:33                             ` Borislav Petkov
2016-08-18 10:49                               ` [tip:x86/urgent] x86/microcode/AMD: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=y tip-bot for Borislav Petkov
2016-06-22  0:47 ` [PATCH v7 7/9] x86/mm: Enable KASLR for vmalloc memory region (x86_64) Kees Cook
2016-06-22  0:47   ` [kernel-hardening] " Kees Cook
2016-07-08 20:36   ` [tip:x86/boot] x86/mm: Enable KASLR for vmalloc memory regions tip-bot for Thomas Garnier
2016-06-22  0:47 ` [PATCH v7 8/9] x86/mm: Enable KASLR for vmemmap memory region (x86_64) Kees Cook
2016-06-22  0:47   ` [kernel-hardening] " Kees Cook
2016-06-22  0:47 ` [PATCH v7 9/9] x86/mm: Memory hotplug support for KASLR memory randomization (x86_64) Kees Cook
2016-06-22  0:47   ` [kernel-hardening] " Kees Cook
2016-07-08 20:36   ` [tip:x86/boot] x86/mm: Add memory hotplug support for KASLR memory randomization tip-bot for Thomas Garnier
2016-06-22 12:47 ` [kernel-hardening] [PATCH v7 0/9] x86/mm: memory area address KASLR Jason Cooper
2016-06-22 15:59   ` Thomas Garnier
2016-06-22 17:05     ` Kees Cook
2016-06-22 17:05       ` Kees Cook
2016-06-23 19:33       ` Jason Cooper
2016-06-23 19:33         ` Jason Cooper
2016-06-23 19:45         ` Sandy Harris
2016-06-23 19:59           ` Kees Cook
2016-06-23 19:59             ` Kees Cook
2016-06-23 20:19             ` Jason Cooper
2016-06-23 20:16           ` Jason Cooper
2016-06-23 19:58         ` Kees Cook
2016-06-23 19:58           ` Kees Cook
2016-06-23 20:05           ` Ard Biesheuvel
2016-06-23 20:05             ` Ard Biesheuvel
2016-06-24  1:11             ` Jason Cooper
2016-06-24  1:11               ` Jason Cooper
2016-06-24 10:54               ` Ard Biesheuvel
2016-06-24 10:54                 ` Ard Biesheuvel
2016-06-24 16:02                 ` devicetree random-seed properties, was: "Re: [PATCH v7 0/9] x86/mm: memory area address KASLR" Jason Cooper
2016-06-24 16:02                   ` [kernel-hardening] " Jason Cooper
2016-06-24 19:04                   ` Kees Cook
2016-06-24 19:04                     ` [kernel-hardening] " Kees Cook
2016-06-24 20:40                     ` Andy Lutomirski
2016-06-24 20:40                       ` [kernel-hardening] " Andy Lutomirski
2016-06-30 21:48                       ` Jason Cooper
2016-06-30 21:48                         ` [kernel-hardening] " Jason Cooper
2016-06-30 21:56                         ` Thomas Garnier
2016-06-30 21:48                     ` Jason Cooper
2016-06-30 21:48                       ` [kernel-hardening] " Jason Cooper
2016-07-07 22:24 ` [PATCH v7 0/9] x86/mm: memory area address KASLR Kees Cook
2016-07-07 22:24   ` [kernel-hardening] " Kees Cook
