* [PATCH v9 0/5] x86/KASLR: Randomize virtual address separately
@ 2016-05-25 22:45 Kees Cook
  2016-05-25 22:45 ` [PATCH v9 1/5] x86/boot: Refuse to build with data relocations Kees Cook
                   ` (4 more replies)
  0 siblings, 5 replies; 21+ messages in thread
From: Kees Cook @ 2016-05-25 22:45 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Kees Cook, Borislav Petkov, Baoquan He, Yinghai Lu,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86, Andrew Morton,
	Josh Poimboeuf, Andrey Ryabinin, H.J. Lu, Dmitry Vyukov, LKML

This is v9 of the remaining patches needed to support separate phys/virt
KASLR of the text base address. The rest of the series has landed in -tip.

The patches are:
- 1: Best-effort data relocation detection for the build.
- 2: Further cleanup of pagetable.c.
- 3: Last part of Baoquan's work decoupling the physical and virtual
     address randomization of the kernel text.
- 4: Remove upper bound on physical address range.
- 5: Remove lower bound on physical address range.

Thanks!

-Kees

v9:
- added data relocation detection

v8:
- extracted the identity map initialization function into the exposed
  interface, renaming it appropriately to initialize_identity_maps().
- added a copyright notice to pagetable.c for clarity.
- shuffled initialization of mapping_info around again for good measure.
- refactored remaining patches to include call to initialize_identity_maps().

* [PATCH v9 1/5] x86/boot: Refuse to build with data relocations
  2016-05-25 22:45 [PATCH v9 0/5] x86/KASLR: Randomize virtual address separately Kees Cook
@ 2016-05-25 22:45 ` Kees Cook
  2016-06-17 12:22   ` [tip:x86/boot] " tip-bot for Kees Cook
  2016-06-26 11:01   ` tip-bot for Kees Cook
  2016-05-25 22:45 ` [PATCH v9 2/5] x86/KASLR: Clarify identity map interface Kees Cook
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 21+ messages in thread
From: Kees Cook @ 2016-05-25 22:45 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Kees Cook, Borislav Petkov, Baoquan He, Yinghai Lu,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86, Andrew Morton,
	Josh Poimboeuf, Andrey Ryabinin, H.J. Lu, Dmitry Vyukov, LKML

The compressed kernel is built with -fPIC/-fPIE so that it can run in any
location a bootloader happens to put it. However, since ELF relocation
processing is not happening (and all the relocation information has
already been stripped at link time), none of the code can use data
relocations (e.g. static assignments of pointers). This is already noted
in a warning comment at the top of misc.c, but this adds an explicit
check for the condition during the linking stage to block any such bugs
from appearing.
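
To illustrate, here is a minimal hypothetical snippet (not from the
tree) of the kind of construct this check rejects, next to its
runtime-safe form:

  static int value;

  /* BAD: the static initializer emits a data relocation entry that is
   * never processed, so the pointer would be garbage at run time. */
  static int *bad_ptr = &value;

  /* OK: the address is computed at run time, relative to wherever the
   * code is actually executing. */
  static int *good_ptr;

  void init_ptrs(void)
  {
          good_ptr = &value;
  }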

If this was in place with the earlier bug in pagetable.c, the build
would fail like this:

  ...
    CC      arch/x86/boot/compressed/pagetable.o
    DATAREL arch/x86/boot/compressed/vmlinux
  error: arch/x86/boot/compressed/pagetable.o has data relocations!
  make[2]: *** [arch/x86/boot/compressed/vmlinux] Error 1
  ...

A clean build shows:

  ...
    CC      arch/x86/boot/compressed/pagetable.o
    DATAREL arch/x86/boot/compressed/vmlinux
    LD      arch/x86/boot/compressed/vmlinux
  ...

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 arch/x86/boot/compressed/Makefile | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index cfdd8c3f8af2..e69464792beb 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -85,7 +85,25 @@ vmlinux-objs-$(CONFIG_EFI_STUB) += $(obj)/eboot.o $(obj)/efi_stub_$(BITS).o \
 	$(objtree)/drivers/firmware/efi/libstub/lib.a
 vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_thunk_$(BITS).o
 
+# The compressed kernel is built with -fPIC/-fPIE so that a boot loader
+# can place it anywhere in memory and it will still run. However, since
+# it is executed as-is without any ELF relocation processing performed
+# (and has already had all relocation sections stripped from the binary),
+# none of the code can use data relocations (e.g. static assignments of
+# pointer values), since they will be meaningless at runtime. This check
+# will refuse to link the vmlinux if any of these relocations are found.
+quiet_cmd_check_data_rel = DATAREL $@
+define cmd_check_data_rel
+	for obj in $(filter %.o,$^); do \
+		readelf -S $$obj | grep -qF .rel.local && { \
+			echo "error: $$obj has data relocations!" >&2; \
+			exit 1; \
+		} || true; \
+	done
+endef
+
 $(obj)/vmlinux: $(vmlinux-objs-y) FORCE
+	$(call if_changed,check_data_rel)
 	$(call if_changed,ld)
 	@:
 
-- 
2.6.3

* [PATCH v9 2/5] x86/KASLR: Clarify identity map interface
  2016-05-25 22:45 [PATCH v9 0/5] x86/KASLR: Randomize virtual address separately Kees Cook
  2016-05-25 22:45 ` [PATCH v9 1/5] x86/boot: Refuse to build with data relocations Kees Cook
@ 2016-05-25 22:45 ` Kees Cook
  2016-06-17 12:22   ` [tip:x86/boot] " tip-bot for Kees Cook
  2016-06-26 11:02   ` tip-bot for Kees Cook
  2016-05-25 22:45 ` [PATCH v9 3/5] x86/KASLR: Randomize virtual address separately Kees Cook
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 21+ messages in thread
From: Kees Cook @ 2016-05-25 22:45 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Kees Cook, Borislav Petkov, Baoquan He, Yinghai Lu,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86, Andrew Morton,
	Josh Poimboeuf, Andrey Ryabinin, H.J. Lu, Dmitry Vyukov, LKML

This extracts the call to prepare_level4() into a top-level function
that the user of the pagetable.c interface must call to initialize
the new page tables. For clarity and to match the "finalize" function,
it has been renamed to initialize_identity_maps(). This function also
gains the initialization of mapping_info so we don't have to do it each
time in add_identity_map().

Additionally, add a copyright notice to the top, to make it clear that the
bulk of the pagetable.c code was written by Yinghai, and that I just
added bugs later. :)
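
For reference, a condensed sketch of the resulting pagetable.c calling
convention (the real call sites are in the diffs below):

  initialize_identity_maps();     /* init mapping_info, clear a top level */
  add_identity_map(start, size);  /* map regions on demand */
  finalize_identity_maps();       /* actually load the pagetable on x86_64 */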

Signed-off-by: Kees Cook <keescook@chromium.org>
---
 arch/x86/boot/compressed/kaslr.c     |  3 +++
 arch/x86/boot/compressed/misc.h      |  3 +++
 arch/x86/boot/compressed/pagetable.c | 26 ++++++++++++++++----------
 3 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index cfeb0259ed81..03a6f5d85a6b 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -485,6 +485,9 @@ unsigned char *choose_random_location(unsigned long input,
 
 	boot_params->hdr.loadflags |= KASLR_FLAG;
 
+	/* Prepare to add new identity pagetables on demand. */
+	initialize_identity_maps();
+
 	/* Record the various known unsafe memory ranges. */
 	mem_avoid_init(input, input_size, output);
 
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index b6fec1ff10e4..09c4ddd02ac6 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -85,10 +85,13 @@ unsigned char *choose_random_location(unsigned long input_ptr,
 #endif
 
 #ifdef CONFIG_X86_64
+void initialize_identity_maps(void);
 void add_identity_map(unsigned long start, unsigned long size);
 void finalize_identity_maps(void);
 extern unsigned char _pgtable[];
 #else
+static inline void initialize_identity_maps(void)
+{ }
 static inline void add_identity_map(unsigned long start, unsigned long size)
 { }
 static inline void finalize_identity_maps(void)
diff --git a/arch/x86/boot/compressed/pagetable.c b/arch/x86/boot/compressed/pagetable.c
index 34b95df14e69..6e31a6aac4d3 100644
--- a/arch/x86/boot/compressed/pagetable.c
+++ b/arch/x86/boot/compressed/pagetable.c
@@ -2,6 +2,9 @@
  * This code is used on x86_64 to create page table identity mappings on
  * demand by building up a new set of page tables (or appending to the
  * existing ones), and then switching over to them when ready.
+ *
+ * Copyright (C) 2015-2016  Yinghai Lu
+ * Copyright (C)      2016  Kees Cook
  */
 
 /*
@@ -59,9 +62,21 @@ static struct alloc_pgt_data pgt_data;
 /* The top level page table entry pointer. */
 static unsigned long level4p;
 
+/*
+ * Mapping information structure passed to kernel_ident_mapping_init().
+ * Due to relocation, pointers must be assigned at run time not build time.
+ */
+static struct x86_mapping_info mapping_info = {
+	.pmd_flag       = __PAGE_KERNEL_LARGE_EXEC,
+};
+
 /* Locates and clears a region for a new top level page table. */
-static void prepare_level4(void)
+void initialize_identity_maps(void)
 {
+	/* Init mapping_info with run-time function/buffer pointers. */
+	mapping_info.alloc_pgt_page = alloc_pgt_page;
+	mapping_info.context = &pgt_data;
+
 	/*
 	 * It should be impossible for this not to already be true,
 	 * but since calling this a second time would rewind the other
@@ -96,17 +111,8 @@ static void prepare_level4(void)
  */
 void add_identity_map(unsigned long start, unsigned long size)
 {
-	struct x86_mapping_info mapping_info = {
-		.alloc_pgt_page	= alloc_pgt_page,
-		.context	= &pgt_data,
-		.pmd_flag	= __PAGE_KERNEL_LARGE_EXEC,
-	};
 	unsigned long end = start + size;
 
-	/* Make sure we have a top level page table ready to use. */
-	if (!level4p)
-		prepare_level4();
-
 	/* Align boundary to 2M. */
 	start = round_down(start, PMD_SIZE);
 	end = round_up(end, PMD_SIZE);
-- 
2.6.3

* [PATCH v9 3/5] x86/KASLR: Randomize virtual address separately
  2016-05-25 22:45 [PATCH v9 0/5] x86/KASLR: Randomize virtual address separately Kees Cook
  2016-05-25 22:45 ` [PATCH v9 1/5] x86/boot: Refuse to build with data relocations Kees Cook
  2016-05-25 22:45 ` [PATCH v9 2/5] x86/KASLR: Clarify identity map interface Kees Cook
@ 2016-05-25 22:45 ` Kees Cook
  2016-06-17  8:20   ` Ingo Molnar
                     ` (2 more replies)
  2016-05-25 22:45 ` [PATCH v9 4/5] x86/KASLR: Add physical address randomization >4G Kees Cook
  2016-05-25 22:45 ` [PATCH v9 5/5] x86/KASLR: Allow randomization below load address Kees Cook
  4 siblings, 3 replies; 21+ messages in thread
From: Kees Cook @ 2016-05-25 22:45 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Kees Cook, Baoquan He, Borislav Petkov, Yinghai Lu,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86, Andrew Morton,
	Josh Poimboeuf, Andrey Ryabinin, H.J. Lu, Dmitry Vyukov, LKML

From: Baoquan He <bhe@redhat.com>

The current KASLR implementation randomizes the physical and virtual
addresses of the kernel together (both are offset by the same amount). It
calculates the delta between the physical address where vmlinux was linked
to load and where it is finally loaded. If the delta is not equal to 0
(i.e. the kernel was relocated), relocation handling needs to be done.

On 64-bit, this patch randomizes both the physical address where the
kernel is decompressed and the virtual address where the kernel text is
mapped and will execute from. Two values are now being chosen, so the
function arguments are reorganized to be passed by pointer, allowing them
to be updated directly. Since relocation handling depends only on the
virtual address, we must check the virtual delta, not the physical delta,
when processing kernel relocations. This also populates the page table
for the new
virtual address range. 32-bit does not support a separate virtual address,
so it continues to use the physical offset for its virtual offset.

Additionally, the sanity checks done on the resulting kernel addresses
are updated, since the two addresses are potentially separate now.
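
The resulting relocation decision, condensed from the misc.c hunk below:

  delta = min_addr - LOAD_PHYSICAL_ADDR;          /* physical offset */
  if (IS_ENABLED(CONFIG_X86_64))
          delta = virt_addr - LOAD_PHYSICAL_ADDR; /* virtual offset instead */
  if (!delta)
          return;                                 /* no relocation needed */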

Signed-off-by: Baoquan He <bhe@redhat.com>
[kees: rewrote changelog, limited virtual split to 64-bit only, update checks]
[kees: fix CONFIG_RANDOMIZE_BASE=n boot failure]
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 arch/x86/boot/compressed/kaslr.c | 44 ++++++++++++++++++++----------------
 arch/x86/boot/compressed/misc.c  | 49 ++++++++++++++++++++++++----------------
 arch/x86/boot/compressed/misc.h  | 22 ++++++++++--------
 3 files changed, 66 insertions(+), 49 deletions(-)

diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index 03a6f5d85a6b..af92ea581b8e 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -463,23 +463,26 @@ static unsigned long find_random_virt_addr(unsigned long minimum,
  * Since this function examines addresses much more numerically,
  * it takes the input and output pointers as 'unsigned long'.
  */
-unsigned char *choose_random_location(unsigned long input,
-				      unsigned long input_size,
-				      unsigned long output,
-				      unsigned long output_size)
+void choose_random_location(unsigned long input,
+			    unsigned long input_size,
+			    unsigned long *output,
+			    unsigned long output_size,
+			    unsigned long *virt_addr)
 {
-	unsigned long choice = output;
 	unsigned long random_addr;
 
+	/* By default, keep output position unchanged. */
+	*virt_addr = *output;
+
 #ifdef CONFIG_HIBERNATION
 	if (!cmdline_find_option_bool("kaslr")) {
 		warn("KASLR disabled: 'kaslr' not on cmdline (hibernation selected).");
-		goto out;
+		return;
 	}
 #else
 	if (cmdline_find_option_bool("nokaslr")) {
 		warn("KASLR disabled: 'nokaslr' on cmdline.");
-		goto out;
+		return;
 	}
 #endif
 
@@ -489,25 +492,26 @@ unsigned char *choose_random_location(unsigned long input,
 	initialize_identity_maps();
 
 	/* Record the various known unsafe memory ranges. */
-	mem_avoid_init(input, input_size, output);
+	mem_avoid_init(input, input_size, *output);
 
 	/* Walk e820 and find a random address. */
-	random_addr = find_random_phys_addr(output, output_size);
+	random_addr = find_random_phys_addr(*output, output_size);
 	if (!random_addr) {
 		warn("KASLR disabled: could not find suitable E820 region!");
-		goto out;
+	} else {
+		/* Update the new physical address location. */
+		if (*output != random_addr) {
+			add_identity_map(random_addr, output_size);
+			*output = random_addr;
+		}
 	}
 
-	/* Always enforce the minimum. */
-	if (random_addr < choice)
-		goto out;
-
-	choice = random_addr;
-
-	add_identity_map(choice, output_size);
-
 	/* This actually loads the identity pagetable on x86_64. */
 	finalize_identity_maps();
-out:
-	return (unsigned char *)choice;
+
+	/* Pick random virtual address starting from LOAD_PHYSICAL_ADDR. */
+	if (IS_ENABLED(CONFIG_X86_64))
+		random_addr = find_random_virt_addr(LOAD_PHYSICAL_ADDR,
+						 output_size);
+	*virt_addr = random_addr;
 }
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index f14db4e21654..b3c5a5f030ce 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -170,7 +170,8 @@ void __puthex(unsigned long value)
 }
 
 #if CONFIG_X86_NEED_RELOCS
-static void handle_relocations(void *output, unsigned long output_len)
+static void handle_relocations(void *output, unsigned long output_len,
+			       unsigned long virt_addr)
 {
 	int *reloc;
 	unsigned long delta, map, ptr;
@@ -182,11 +183,6 @@ static void handle_relocations(void *output, unsigned long output_len)
 	 * and where it was actually loaded.
 	 */
 	delta = min_addr - LOAD_PHYSICAL_ADDR;
-	if (!delta) {
-		debug_putstr("No relocation needed... ");
-		return;
-	}
-	debug_putstr("Performing relocations... ");
 
 	/*
 	 * The kernel contains a table of relocation addresses. Those
@@ -198,6 +194,20 @@ static void handle_relocations(void *output, unsigned long output_len)
 	map = delta - __START_KERNEL_map;
 
 	/*
+	 * 32-bit always performs relocations. 64-bit relocations are only
+	 * needed if KASLR has chosen a different starting address offset
+	 * from __START_KERNEL_map.
+	 */
+	if (IS_ENABLED(CONFIG_X86_64))
+		delta = virt_addr - LOAD_PHYSICAL_ADDR;
+
+	if (!delta) {
+		debug_putstr("No relocation needed... ");
+		return;
+	}
+	debug_putstr("Performing relocations... ");
+
+	/*
 	 * Process relocations: 32 bit relocations first then 64 bit after.
 	 * Three sets of binary relocations are added to the end of the kernel
 	 * before compression. Each relocation table entry is the kernel
@@ -250,7 +260,8 @@ static void handle_relocations(void *output, unsigned long output_len)
 #endif
 }
 #else
-static inline void handle_relocations(void *output, unsigned long output_len)
+static inline void handle_relocations(void *output, unsigned long output_len,
+				      unsigned long virt_addr)
 { }
 #endif
 
@@ -327,7 +338,7 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
 				  unsigned long output_len)
 {
 	const unsigned long kernel_total_size = VO__end - VO__text;
-	unsigned char *output_orig = output;
+	unsigned long virt_addr = (unsigned long)output;
 
 	/* Retain x86 boot parameters pointer passed from startup_32/64. */
 	boot_params = rmode;
@@ -366,13 +377,16 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
 	 * the entire decompressed kernel plus relocation table, or the
 	 * entire decompressed kernel plus .bss and .brk sections.
 	 */
-	output = choose_random_location((unsigned long)input_data, input_len,
-					(unsigned long)output,
-					max(output_len, kernel_total_size));
+	choose_random_location((unsigned long)input_data, input_len,
+				(unsigned long *)&output,
+				max(output_len, kernel_total_size),
+				&virt_addr);
 
 	/* Validate memory location choices. */
 	if ((unsigned long)output & (MIN_KERNEL_ALIGN - 1))
-		error("Destination address inappropriately aligned");
+		error("Destination physical address inappropriately aligned");
+	if (virt_addr & (MIN_KERNEL_ALIGN - 1))
+		error("Destination virtual address inappropriately aligned");
 #ifdef CONFIG_X86_64
 	if (heap > 0x3fffffffffffUL)
 		error("Destination address too large");
@@ -382,19 +396,16 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
 #endif
 #ifndef CONFIG_RELOCATABLE
 	if ((unsigned long)output != LOAD_PHYSICAL_ADDR)
-		error("Wrong destination address");
+		error("Destination address does not match LOAD_PHYSICAL_ADDR");
+	if ((unsigned long)output != virt_addr)
+		error("Destination virtual address changed when not relocatable");
 #endif
 
 	debug_putstr("\nDecompressing Linux... ");
 	__decompress(input_data, input_len, NULL, NULL, output, output_len,
 			NULL, error);
 	parse_elf(output);
-	/*
-	 * 32-bit always performs relocations. 64-bit relocations are only
-	 * needed if kASLR has chosen a different load address.
-	 */
-	if (!IS_ENABLED(CONFIG_X86_64) || output != output_orig)
-		handle_relocations(output, output_len);
+	handle_relocations(output, output_len, virt_addr);
 	debug_putstr("done.\nBooting the kernel.\n");
 	return output;
 }
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 09c4ddd02ac6..1c8355eadbd1 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -67,20 +67,22 @@ int cmdline_find_option_bool(const char *option);
 
 #if CONFIG_RANDOMIZE_BASE
 /* kaslr.c */
-unsigned char *choose_random_location(unsigned long input_ptr,
-				      unsigned long input_size,
-				      unsigned long output_ptr,
-				      unsigned long output_size);
+void choose_random_location(unsigned long input,
+			    unsigned long input_size,
+			    unsigned long *output,
+			    unsigned long output_size,
+			    unsigned long *virt_addr);
 /* cpuflags.c */
 bool has_cpuflag(int flag);
 #else
-static inline
-unsigned char *choose_random_location(unsigned long input_ptr,
-				      unsigned long input_size,
-				      unsigned long output_ptr,
-				      unsigned long output_size)
+static inline void choose_random_location(unsigned long input,
+					  unsigned long input_size,
+					  unsigned long *output,
+					  unsigned long output_size,
+					  unsigned long *virt_addr)
 {
-	return (unsigned char *)output_ptr;
+	/* No change from existing output location. */
+	*virt_addr = *output;
 }
 #endif
 
-- 
2.6.3

* [PATCH v9 4/5] x86/KASLR: Add physical address randomization >4G
  2016-05-25 22:45 [PATCH v9 0/5] x86/KASLR: Randomize virtual address separately Kees Cook
                   ` (2 preceding siblings ...)
  2016-05-25 22:45 ` [PATCH v9 3/5] x86/KASLR: Randomize virtual address separately Kees Cook
@ 2016-05-25 22:45 ` Kees Cook
  2016-06-17 12:23   ` [tip:x86/boot] x86/KASLR: Extend kernel image physical address randomization to addresses larger than 4G tip-bot for Kees Cook
  2016-06-26 11:02   ` tip-bot for Kees Cook
  2016-05-25 22:45 ` [PATCH v9 5/5] x86/KASLR: Allow randomization below load address Kees Cook
  4 siblings, 2 replies; 21+ messages in thread
From: Kees Cook @ 2016-05-25 22:45 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Kees Cook, Baoquan He, Borislav Petkov, Yinghai Lu,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86, Andrew Morton,
	Josh Poimboeuf, Andrey Ryabinin, H.J. Lu, Dmitry Vyukov, LKML

This patch exchanges the prior slots[] array for the new slot_areas[]
array, and lifts the limitation of KERNEL_IMAGE_SIZE on the physical
address offset for 64-bit. As before, process_e820_entry() walks
memory and populates slot_areas[], splitting on any detected mem_avoid
collisions.

Finally, since the slots[] array and its associated functions are no
longer needed, they are removed.
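
The core of the new selection, condensed from the slots_fetch_random()
change below: one random index is drawn across the total slot count and
then walked back to the slot_area that contains it:

  slot = get_random_long("Physical") % slot_max;
  for (i = 0; i < slot_area_index; i++) {
          if (slot >= slot_areas[i].num) {
                  slot -= slot_areas[i].num;      /* skip this whole area */
                  continue;
          }
          return slot_areas[i].addr + slot * CONFIG_PHYSICAL_ALIGN;
  }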

Based on earlier patches by Baoquan He.

Cc: Baoquan He <bhe@redhat.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
---
This patch is pretty noisy due to the indentation change in the
e820 walker. I couldn't find a cleaner way to do this that didn't
make the final code LESS readable, unfortunately. So, the diff is
ugly, but I think the results are clean.
---
 arch/x86/Kconfig                 |  27 +++++----
 arch/x86/boot/compressed/kaslr.c | 115 +++++++++++++++++++++++----------------
 2 files changed, 85 insertions(+), 57 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 0a7b885964ba..770ae5259dff 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1934,21 +1934,26 @@ config RANDOMIZE_BASE
 	  attempts relying on knowledge of the location of kernel
 	  code internals.
 
-	  The kernel physical and virtual address can be randomized
-	  from 16MB up to 1GB on 64-bit and 512MB on 32-bit. (Note that
-	  using RANDOMIZE_BASE reduces the memory space available to
-	  kernel modules from 1.5GB to 1GB.)
+	  On 64-bit, the kernel physical and virtual addresses are
+	  randomized separately. The physical address will be anywhere
+	  between 16MB and the top of physical memory (up to 64TB). The
+	  virtual address will be randomized from 16MB up to 1GB (9 bits
+	  of entropy). Note that this also reduces the memory space
+	  available to kernel modules from 1.5GB to 1GB.
+
+	  On 32-bit, the kernel physical and virtual addresses are
+	  randomized together. They will be randomized from 16MB up to
+	  512MB (8 bits of entropy).
 
 	  Entropy is generated using the RDRAND instruction if it is
 	  supported. If RDTSC is supported, its value is mixed into
 	  the entropy pool as well. If neither RDRAND nor RDTSC are
-	  supported, then entropy is read from the i8254 timer.
-
-	  Since the kernel is built using 2GB addressing, and
-	  PHYSICAL_ALIGN must be at a minimum of 2MB, only 10 bits of
-	  entropy is theoretically possible. Currently, with the
-	  default value for PHYSICAL_ALIGN and due to page table
-	  layouts, 64-bit uses 9 bits of entropy and 32-bit uses 8 bits.
+	  supported, then entropy is read from the i8254 timer. The
+	  usable entropy is limited by the kernel being built using
+	  2GB addressing, and that PHYSICAL_ALIGN must be at a
+	  minimum of 2MB. As a result, only 10 bits of entropy are
+	  theoretically possible, but the implementations are further
+	  limited due to memory layouts.
 
 	  If CONFIG_HIBERNATE is also enabled, KASLR is disabled at boot
 	  time. To enable it, boot with "kaslr" on the kernel command
diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index af92ea581b8e..d0a823df183b 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -132,17 +132,6 @@ enum mem_avoid_index {
 
 static struct mem_vector mem_avoid[MEM_AVOID_MAX];
 
-static bool mem_contains(struct mem_vector *region, struct mem_vector *item)
-{
-	/* Item at least partially before region. */
-	if (item->start < region->start)
-		return false;
-	/* Item at least partially after region. */
-	if (item->start + item->size > region->start + region->size)
-		return false;
-	return true;
-}
-
 static bool mem_overlaps(struct mem_vector *one, struct mem_vector *two)
 {
 	/* Item one is entirely before item two. */
@@ -319,8 +308,6 @@ static bool mem_avoid_overlap(struct mem_vector *img,
 	return is_overlapping;
 }
 
-static unsigned long slots[KERNEL_IMAGE_SIZE / CONFIG_PHYSICAL_ALIGN];
-
 struct slot_area {
 	unsigned long addr;
 	int num;
@@ -351,36 +338,44 @@ static void store_slot_info(struct mem_vector *region, unsigned long image_size)
 	}
 }
 
-static void slots_append(unsigned long addr)
-{
-	/* Overflowing the slots list should be impossible. */
-	if (slot_max >= KERNEL_IMAGE_SIZE / CONFIG_PHYSICAL_ALIGN)
-		return;
-
-	slots[slot_max++] = addr;
-}
-
 static unsigned long slots_fetch_random(void)
 {
+	unsigned long slot;
+	int i;
+
 	/* Handle case of no slots stored. */
 	if (slot_max == 0)
 		return 0;
 
-	return slots[get_random_long("Physical") % slot_max];
+	slot = get_random_long("Physical") % slot_max;
+
+	for (i = 0; i < slot_area_index; i++) {
+		if (slot >= slot_areas[i].num) {
+			slot -= slot_areas[i].num;
+			continue;
+		}
+		return slot_areas[i].addr + slot * CONFIG_PHYSICAL_ALIGN;
+	}
+
+	if (i == slot_area_index)
+		debug_putstr("slots_fetch_random() failed!?\n");
+	return 0;
 }
 
 static void process_e820_entry(struct e820entry *entry,
 			       unsigned long minimum,
 			       unsigned long image_size)
 {
-	struct mem_vector region, img, overlap;
+	struct mem_vector region, overlap;
+	struct slot_area slot_area;
+	unsigned long start_orig;
 
 	/* Skip non-RAM entries. */
 	if (entry->type != E820_RAM)
 		return;
 
-	/* Ignore entries entirely above our maximum. */
-	if (entry->addr >= KERNEL_IMAGE_SIZE)
+	/* On 32-bit, ignore entries entirely above our maximum. */
+	if (IS_ENABLED(CONFIG_X86_32) && entry->addr >= KERNEL_IMAGE_SIZE)
 		return;
 
 	/* Ignore entries entirely below our minimum. */
@@ -390,31 +385,55 @@ static void process_e820_entry(struct e820entry *entry,
 	region.start = entry->addr;
 	region.size = entry->size;
 
-	/* Potentially raise address to minimum location. */
-	if (region.start < minimum)
-		region.start = minimum;
+	/* Give up if slot area array is full. */
+	while (slot_area_index < MAX_SLOT_AREA) {
+		start_orig = region.start;
 
-	/* Potentially raise address to meet alignment requirements. */
-	region.start = ALIGN(region.start, CONFIG_PHYSICAL_ALIGN);
+		/* Potentially raise address to minimum location. */
+		if (region.start < minimum)
+			region.start = minimum;
 
-	/* Did we raise the address above the bounds of this e820 region? */
-	if (region.start > entry->addr + entry->size)
-		return;
+		/* Potentially raise address to meet alignment needs. */
+		region.start = ALIGN(region.start, CONFIG_PHYSICAL_ALIGN);
 
-	/* Reduce size by any delta from the original address. */
-	region.size -= region.start - entry->addr;
+		/* Did we raise the address above this e820 region? */
+		if (region.start > entry->addr + entry->size)
+			return;
 
-	/* Reduce maximum size to fit end of image within maximum limit. */
-	if (region.start + region.size > KERNEL_IMAGE_SIZE)
-		region.size = KERNEL_IMAGE_SIZE - region.start;
+		/* Reduce size by any delta from the original address. */
+		region.size -= region.start - start_orig;
 
-	/* Walk each aligned slot and check for avoided areas. */
-	for (img.start = region.start, img.size = image_size ;
-	     mem_contains(&region, &img) ;
-	     img.start += CONFIG_PHYSICAL_ALIGN) {
-		if (mem_avoid_overlap(&img, &overlap))
-			continue;
-		slots_append(img.start);
+		/* On 32-bit, reduce region size to fit within max size. */
+		if (IS_ENABLED(CONFIG_X86_32) &&
+		    region.start + region.size > KERNEL_IMAGE_SIZE)
+			region.size = KERNEL_IMAGE_SIZE - region.start;
+
+		/* Return if region can't contain decompressed kernel */
+		if (region.size < image_size)
+			return;
+
+		/* If nothing overlaps, store the region and return. */
+		if (!mem_avoid_overlap(&region, &overlap)) {
+			store_slot_info(&region, image_size);
+			return;
+		}
+
+		/* Store beginning of region if holds at least image_size. */
+		if (overlap.start > region.start + image_size) {
+			struct mem_vector beginning;
+
+			beginning.start = region.start;
+			beginning.size = overlap.start - region.start;
+			store_slot_info(&beginning, image_size);
+		}
+
+		/* Return if overlap extends to or past end of region. */
+		if (overlap.start + overlap.size >= region.start + region.size)
+			return;
+
+		/* Clip off the overlapping region and start over. */
+		region.size -= overlap.start - region.start + overlap.size;
+		region.start = overlap.start + overlap.size;
 	}
 }
 
@@ -431,6 +450,10 @@ static unsigned long find_random_phys_addr(unsigned long minimum,
 	for (i = 0; i < boot_params->e820_entries; i++) {
 		process_e820_entry(&boot_params->e820_map[i], minimum,
 				   image_size);
+		if (slot_area_index == MAX_SLOT_AREA) {
+			debug_putstr("Aborted e820 scan (slot_areas full)!\n");
+			break;
+		}
 	}
 
 	return slots_fetch_random();
-- 
2.6.3

* [PATCH v9 5/5] x86/KASLR: Allow randomization below load address
  2016-05-25 22:45 [PATCH v9 0/5] x86/KASLR: Randomize virtual address separately Kees Cook
                   ` (3 preceding siblings ...)
  2016-05-25 22:45 ` [PATCH v9 4/5] x86/KASLR: Add physical address randomization >4G Kees Cook
@ 2016-05-25 22:45 ` Kees Cook
  2016-06-17  8:47   ` Ingo Molnar
                     ` (2 more replies)
  4 siblings, 3 replies; 21+ messages in thread
From: Kees Cook @ 2016-05-25 22:45 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Kees Cook, Yinghai Lu, Borislav Petkov, Baoquan He,
	H. Peter Anvin, Thomas Gleixner, Ingo Molnar, x86, Andrew Morton,
	Josh Poimboeuf, Andrey Ryabinin, H.J. Lu, Dmitry Vyukov, LKML

From: Yinghai Lu <yinghai@kernel.org>

Currently the physical randomization's lower boundary is the original
kernel load address. For bootloaders that load kernels into very high
memory (e.g. kexec), this means randomization takes place in a very small
window at the top of memory, ignoring the large region of physical memory
below the load address.

Since mem_avoid is already correctly tracking the regions that must be
avoided, this patch changes the minimum address to whichever is lower:
512M (to conservatively avoid unknown things in lower memory) or the
load address. Now, for example, if the kernel is loaded at 8G, [512M,
8G) will be added to the possible physical memory positions.
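
For example (hypothetical load addresses):

  min_addr = min(*output, 512UL << 20);
  /* loaded at 8G:   min(8G, 512M)   == 512M -> large new window below */
  /* loaded at 256M: min(256M, 512M) == 256M -> behavior unchanged     */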

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
[kees: rewrote changelog, refactor to use min()]
Signed-off-by: Kees Cook <keescook@chromium.org>
---
 arch/x86/boot/compressed/kaslr.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index d0a823df183b..304c5c369aff 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -492,7 +492,7 @@ void choose_random_location(unsigned long input,
 			    unsigned long output_size,
 			    unsigned long *virt_addr)
 {
-	unsigned long random_addr;
+	unsigned long random_addr, min_addr;
 
 	/* By default, keep output position unchanged. */
 	*virt_addr = *output;
@@ -517,8 +517,11 @@ void choose_random_location(unsigned long input,
 	/* Record the various known unsafe memory ranges. */
 	mem_avoid_init(input, input_size, *output);
 
+	/* Low end should be the smaller of 512M or initial location. */
+	min_addr = min(*output, 512UL << 20);
+
 	/* Walk e820 and find a random address. */
-	random_addr = find_random_phys_addr(*output, output_size);
+	random_addr = find_random_phys_addr(min_addr, output_size);
 	if (!random_addr) {
 		warn("KASLR disabled: could not find suitable E820 region!");
 	} else {
-- 
2.6.3

* Re: [PATCH v9 3/5] x86/KASLR: Randomize virtual address separately
  2016-05-25 22:45 ` [PATCH v9 3/5] x86/KASLR: Randomize virtual address separately Kees Cook
@ 2016-06-17  8:20   ` Ingo Molnar
  2016-06-17  8:35     ` Ingo Molnar
  2016-06-17 12:22   ` [tip:x86/boot] " tip-bot for Baoquan He
  2016-06-26 11:02   ` tip-bot for Baoquan He
  2 siblings, 1 reply; 21+ messages in thread
From: Ingo Molnar @ 2016-06-17  8:20 UTC (permalink / raw)
  To: Kees Cook
  Cc: Baoquan He, Borislav Petkov, Yinghai Lu, H. Peter Anvin,
	Thomas Gleixner, Ingo Molnar, x86, Andrew Morton, Josh Poimboeuf,
	Andrey Ryabinin, H.J. Lu, Dmitry Vyukov, LKML


* Kees Cook <keescook@chromium.org> wrote:

> -unsigned char *choose_random_location(unsigned long input,
> -				      unsigned long input_size,
> -				      unsigned long output,
> -				      unsigned long output_size)
> +void choose_random_location(unsigned long input,
> +			    unsigned long input_size,
> +			    unsigned long *output,
> +			    unsigned long output_size,
> +			    unsigned long *virt_addr)
>  {
> -	unsigned long choice = output;
>  	unsigned long random_addr;
>  
> +	/* By default, keep output position unchanged. */
> +	*virt_addr = *output;

So I applied this, after fixing a conflict with a recent hibernation related 
change, but it would be nice to further clean up the types in this file, in 
particular could we please propagate 'const' for all input-only pointers?

For example in the above function it would be obvious at a glance if it said 
something like:

 void choose_random_location(unsigned long input,
			    unsigned long input_size,
			    const unsigned long *output,
			    unsigned long output_size,
			    unsigned long *virt_addr)

when reading such a function prototype I can immediately tell: 'yeah, while it's 
named "output", it's in fact a read-only input parameter - the _real_ output of 
the function is 'virt_addr'.

In addition to that it would also be useful to eliminate the 'virt_addr' parameter 
altogether, and use an 'unsigned long' return value to set virt_addr in misc.c.

Ok?

Thanks,

	Ingo

* Re: [PATCH v9 3/5] x86/KASLR: Randomize virtual address separately
  2016-06-17  8:20   ` Ingo Molnar
@ 2016-06-17  8:35     ` Ingo Molnar
  0 siblings, 0 replies; 21+ messages in thread
From: Ingo Molnar @ 2016-06-17  8:35 UTC (permalink / raw)
  To: Kees Cook
  Cc: Baoquan He, Borislav Petkov, Yinghai Lu, H. Peter Anvin,
	Thomas Gleixner, Ingo Molnar, x86, Andrew Morton, Josh Poimboeuf,
	Andrey Ryabinin, H.J. Lu, Dmitry Vyukov, LKML


* Ingo Molnar <mingo@kernel.org> wrote:

> 
> * Kees Cook <keescook@chromium.org> wrote:
> 
> > -unsigned char *choose_random_location(unsigned long input,
> > -				      unsigned long input_size,
> > -				      unsigned long output,
> > -				      unsigned long output_size)
> > +void choose_random_location(unsigned long input,
> > +			    unsigned long input_size,
> > +			    unsigned long *output,
> > +			    unsigned long output_size,
> > +			    unsigned long *virt_addr)
> >  {
> > -	unsigned long choice = output;
> >  	unsigned long random_addr;
> >  
> > +	/* By default, keep output position unchanged. */
> > +	*virt_addr = *output;
> 
> So I applied this, after fixing a conflict with a recent hibernation related 
> change, but it would be nice to further clean up the types in this file, in 
> particular could we please propagate 'const' for all input-only pointers?
> 
> For example in the above function it would be obvious at a glance if it said 
> something like:
> 
>  void choose_random_location(unsigned long input,
> 			    unsigned long input_size,
> 			    const unsigned long *output,
> 			    unsigned long output_size,
> 			    unsigned long *virt_addr)
> 
> when reading such a function prototype I can immediately tell: 'yeah, while it's 
> named "output", it's in fact a read-only input parameter - the _real_ output of 
> the function is 'virt_addr'.

Doh, so I managed to confuse myself by looking at the unpatched function only. 
This patch in fact starts writing to 'output':

+               /* Update the new physical address location. */
+               if (*output != random_addr) {
+                       add_identity_map(random_addr, output_size);
+                       *output = random_addr;
+               }

At which point 'output' cannot be const, and in fact it might be beneficial that 
'virt_addr' is passed in by a pointer as well.

The comment of the function definitely needs to be updated:

  * it takes the input and output pointers as 'unsigned long'.

... which is not true anymore.

I also find the type flow and naming for the 'output' pointer very confusing. We 
have:

asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
                                  unsigned char *input_data,
                                  unsigned long input_len,
                                  unsigned char *output,
                                  unsigned long output_len)

...

        choose_random_location((unsigned long)input_data, input_len,
                                (unsigned long *)&output,
                                max(output_len, kernel_total_size),
                                &virt_addr);


void choose_random_location(unsigned long input,
                            unsigned long input_size,
                            unsigned long *output,
                            unsigned long output_size,
                            unsigned long *virt_addr)

...

                        *output = random_addr;


it is very easy to confuse 'unsigned long *output' with the 'char *output' pointer 
to the output stream! But in reality this is a double pointer and we want to use 
it to change the pointer.

So at minimum we should rename 'output' in choose_random_location() to something 
like 'output_ptr' - but even better would be to just preserve its natural type and 
use 'char * const *' and do a single type cast when setting it.

Same goes for 'virt_addr'.
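
(For illustration only, a hypothetical prototype along those lines:

  void choose_random_location(unsigned long input,
                              unsigned long input_size,
                              unsigned char **output_ptr,
                              unsigned long output_size,
                              unsigned long *virt_addr);

which makes the double indirection visible in the signature and lets the
caller pass &output without the cast.)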

Agreed?

Thanks,

	Ingo

* Re: [PATCH v9 5/5] x86/KASLR: Allow randomization below load address
  2016-05-25 22:45 ` [PATCH v9 5/5] x86/KASLR: Allow randomization below load address Kees Cook
@ 2016-06-17  8:47   ` Ingo Molnar
  2016-06-17 15:44     ` Kees Cook
  2016-06-17 12:23   ` [tip:x86/boot] x86/KASLR: Allow randomization below the " tip-bot for Yinghai Lu
  2016-06-26 11:03   ` tip-bot for Yinghai Lu
  2 siblings, 1 reply; 21+ messages in thread
From: Ingo Molnar @ 2016-06-17  8:47 UTC (permalink / raw)
  To: Kees Cook
  Cc: Yinghai Lu, Borislav Petkov, Baoquan He, H. Peter Anvin,
	Thomas Gleixner, Ingo Molnar, x86, Andrew Morton, Josh Poimboeuf,
	Andrey Ryabinin, H.J. Lu, Dmitry Vyukov, LKML


* Kees Cook <keescook@chromium.org> wrote:

> From: Yinghai Lu <yinghai@kernel.org>
> 
> Currently the physical randomization's lower boundary is the original
> kernel load address. For bootloaders that load kernels into very high
> memory (e.g. kexec), this means randomization takes place in a very small
> window at the top of memory, ignoring the large region of physical memory
> below the load address.
> 
> Since mem_avoid is already correctly tracking the regions that must be
> avoided, this patch changes the minimum address to whichever is lower:
> 512M (to conservatively avoid unknown things in lower memory) or the
> load address. Now, for example, if the kernel is loaded at 8G, [512M,
> 8G) will be added to the possible physical memory positions.
> 
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> [kees: rewrote changelog, refactor to use min()]
> Signed-off-by: Kees Cook <keescook@chromium.org>
> ---
>  arch/x86/boot/compressed/kaslr.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
> index d0a823df183b..304c5c369aff 100644
> --- a/arch/x86/boot/compressed/kaslr.c
> +++ b/arch/x86/boot/compressed/kaslr.c
> @@ -492,7 +492,7 @@ void choose_random_location(unsigned long input,
>  			    unsigned long output_size,
>  			    unsigned long *virt_addr)
>  {
> -	unsigned long random_addr;
> +	unsigned long random_addr, min_addr;
>  
>  	/* By default, keep output position unchanged. */
>  	*virt_addr = *output;
> @@ -517,8 +517,11 @@ void choose_random_location(unsigned long input,
>  	/* Record the various known unsafe memory ranges. */
>  	mem_avoid_init(input, input_size, *output);
>  
> +	/* Low end should be the smaller of 512M or initial location. */
> +	min_addr = min(*output, 512UL << 20);
> +
>  	/* Walk e820 and find a random address. */
> -	random_addr = find_random_phys_addr(*output, output_size);
> +	random_addr = find_random_phys_addr(min_addr, output_size);
>  	if (!random_addr) {
>  		warn("KASLR disabled: could not find suitable E820 region!");
>  	} else {

There's no explanation in the code or in the changelog of why 512M was picked as 
the lower limit.

Thanks,

	Ingo

* [tip:x86/boot] x86/boot: Refuse to build with data relocations
  2016-05-25 22:45 ` [PATCH v9 1/5] x86/boot: Refuse to build with data relocations Kees Cook
@ 2016-06-17 12:22   ` tip-bot for Kees Cook
  2016-06-26 11:01   ` tip-bot for Kees Cook
  1 sibling, 0 replies; 21+ messages in thread
From: tip-bot for Kees Cook @ 2016-06-17 12:22 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: keescook, peterz, akpm, hjl.tools, aryabinin, brgerst, hpa,
	jpoimboe, luto, torvalds, linux-kernel, tglx, yinghai, dvlasenk,
	mingo, bp, bhe, dvyukov

Commit-ID:  f26973633bfe01b67dbfc351aff8aec355811583
Gitweb:     http://git.kernel.org/tip/f26973633bfe01b67dbfc351aff8aec355811583
Author:     Kees Cook <keescook@chromium.org>
AuthorDate: Wed, 25 May 2016 15:45:30 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 17 Jun 2016 11:03:45 +0200

x86/boot: Refuse to build with data relocations

The compressed kernel is built with -fPIC/-fPIE so that it can run in any
location a bootloader happens to put it. However, since ELF relocation
processing is not happening (and all the relocation information has
already been stripped at link time), none of the code can use data
relocations (e.g. static assignments of pointers). This is already noted
in a warning comment at the top of misc.c, but this adds an explicit
check for the condition during the linking stage to block any such bugs
from appearing.

If this was in place with the earlier bug in pagetable.c, the build
would fail like this:

  ...
    CC      arch/x86/boot/compressed/pagetable.o
    DATAREL arch/x86/boot/compressed/vmlinux
  error: arch/x86/boot/compressed/pagetable.o has data relocations!
  make[2]: *** [arch/x86/boot/compressed/vmlinux] Error 1
  ...

A clean build shows:

  ...
    CC      arch/x86/boot/compressed/pagetable.o
    DATAREL arch/x86/boot/compressed/vmlinux
    LD      arch/x86/boot/compressed/vmlinux
  ...

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: H.J. Lu <hjl.tools@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1464216334-17200-2-git-send-email-keescook@chromium.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/boot/compressed/Makefile | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index f135688..536ccfc 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -85,7 +85,25 @@ vmlinux-objs-$(CONFIG_EFI_STUB) += $(obj)/eboot.o $(obj)/efi_stub_$(BITS).o \
 	$(objtree)/drivers/firmware/efi/libstub/lib.a
 vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_thunk_$(BITS).o
 
+# The compressed kernel is built with -fPIC/-fPIE so that a boot loader
+# can place it anywhere in memory and it will still run. However, since
+# it is executed as-is without any ELF relocation processing performed
+# (and has already had all relocation sections stripped from the binary),
+# none of the code can use data relocations (e.g. static assignments of
+# pointer values), since they will be meaningless at runtime. This check
+# will refuse to link the vmlinux if any of these relocations are found.
+quiet_cmd_check_data_rel = DATAREL $@
+define cmd_check_data_rel
+	for obj in $(filter %.o,$^); do \
+		readelf -S $$obj | grep -qF .rel.local && { \
+			echo "error: $$obj has data relocations!" >&2; \
+			exit 1; \
+		} || true; \
+	done
+endef
+
 $(obj)/vmlinux: $(vmlinux-objs-y) FORCE
+	$(call if_changed,check_data_rel)
 	$(call if_changed,ld)
 
 OBJCOPYFLAGS_vmlinux.bin :=  -R .comment -S

* [tip:x86/boot] x86/KASLR: Clarify identity map interface
  2016-05-25 22:45 ` [PATCH v9 2/5] x86/KASLR: Clarify identity map interface Kees Cook
@ 2016-06-17 12:22   ` tip-bot for Kees Cook
  2016-06-26 11:02   ` tip-bot for Kees Cook
  1 sibling, 0 replies; 21+ messages in thread
From: tip-bot for Kees Cook @ 2016-06-17 12:22 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: aryabinin, hpa, tglx, yinghai, dvlasenk, jpoimboe, bhe,
	hjl.tools, luto, linux-kernel, mingo, brgerst, akpm, torvalds,
	dvyukov, bp, keescook, peterz

Commit-ID:  00860f47c2ac6706060656f423b119af1f454bcd
Gitweb:     http://git.kernel.org/tip/00860f47c2ac6706060656f423b119af1f454bcd
Author:     Kees Cook <keescook@chromium.org>
AuthorDate: Wed, 25 May 2016 15:45:31 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 17 Jun 2016 11:03:46 +0200

x86/KASLR: Clarify identity map interface

This extracts the call to prepare_level4() into a top-level function
that the user of the pagetable.c interface must call to initialize
the new page tables. For clarity and to match the "finalize" function,
it has been renamed to initialize_identity_maps(). This function also
gains the initialization of mapping_info so we don't have to do it each
time in add_identity_map().

Additionally, add a copyright notice to the top, to make it clear that the
bulk of the pagetable.c code was written by Yinghai, and that I just
added bugs later. :)

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: H.J. Lu <hjl.tools@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1464216334-17200-3-git-send-email-keescook@chromium.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/boot/compressed/kaslr.c     |  3 +++
 arch/x86/boot/compressed/misc.h      |  3 +++
 arch/x86/boot/compressed/pagetable.c | 26 ++++++++++++++++----------
 3 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index dff4217..54037c9 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -478,6 +478,9 @@ unsigned char *choose_random_location(unsigned long input,
 
 	boot_params->hdr.loadflags |= KASLR_FLAG;
 
+	/* Prepare to add new identity pagetables on demand. */
+	initialize_identity_maps();
+
 	/* Record the various known unsafe memory ranges. */
 	mem_avoid_init(input, input_size, output);
 
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index b6fec1f..09c4ddd 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -85,10 +85,13 @@ unsigned char *choose_random_location(unsigned long input_ptr,
 #endif
 
 #ifdef CONFIG_X86_64
+void initialize_identity_maps(void);
 void add_identity_map(unsigned long start, unsigned long size);
 void finalize_identity_maps(void);
 extern unsigned char _pgtable[];
 #else
+static inline void initialize_identity_maps(void)
+{ }
 static inline void add_identity_map(unsigned long start, unsigned long size)
 { }
 static inline void finalize_identity_maps(void)
diff --git a/arch/x86/boot/compressed/pagetable.c b/arch/x86/boot/compressed/pagetable.c
index 34b95df..6e31a6a 100644
--- a/arch/x86/boot/compressed/pagetable.c
+++ b/arch/x86/boot/compressed/pagetable.c
@@ -2,6 +2,9 @@
  * This code is used on x86_64 to create page table identity mappings on
  * demand by building up a new set of page tables (or appending to the
  * existing ones), and then switching over to them when ready.
+ *
+ * Copyright (C) 2015-2016  Yinghai Lu
+ * Copyright (C)      2016  Kees Cook
  */
 
 /*
@@ -59,9 +62,21 @@ static struct alloc_pgt_data pgt_data;
 /* The top level page table entry pointer. */
 static unsigned long level4p;
 
+/*
+ * Mapping information structure passed to kernel_ident_mapping_init().
+ * Due to relocation, pointers must be assigned at run time not build time.
+ */
+static struct x86_mapping_info mapping_info = {
+	.pmd_flag       = __PAGE_KERNEL_LARGE_EXEC,
+};
+
 /* Locates and clears a region for a new top level page table. */
-static void prepare_level4(void)
+void initialize_identity_maps(void)
 {
+	/* Init mapping_info with run-time function/buffer pointers. */
+	mapping_info.alloc_pgt_page = alloc_pgt_page;
+	mapping_info.context = &pgt_data;
+
 	/*
 	 * It should be impossible for this not to already be true,
 	 * but since calling this a second time would rewind the other
@@ -96,17 +111,8 @@ static void prepare_level4(void)
  */
 void add_identity_map(unsigned long start, unsigned long size)
 {
-	struct x86_mapping_info mapping_info = {
-		.alloc_pgt_page	= alloc_pgt_page,
-		.context	= &pgt_data,
-		.pmd_flag	= __PAGE_KERNEL_LARGE_EXEC,
-	};
 	unsigned long end = start + size;
 
-	/* Make sure we have a top level page table ready to use. */
-	if (!level4p)
-		prepare_level4();
-
 	/* Align boundary to 2M. */
 	start = round_down(start, PMD_SIZE);
 	end = round_up(end, PMD_SIZE);

* [tip:x86/boot] x86/KASLR: Randomize virtual address separately
  2016-05-25 22:45 ` [PATCH v9 3/5] x86/KASLR: Randomize virtual address separately Kees Cook
  2016-06-17  8:20   ` Ingo Molnar
@ 2016-06-17 12:22   ` tip-bot for Baoquan He
  2016-06-26 11:02   ` tip-bot for Baoquan He
  2 siblings, 0 replies; 21+ messages in thread
From: tip-bot for Baoquan He @ 2016-06-17 12:22 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: bhe, bp, hpa, mingo, yinghai, tglx, aryabinin, peterz, dvyukov,
	torvalds, dvlasenk, keescook, brgerst, akpm, jpoimboe, hjl.tools,
	linux-kernel, luto

Commit-ID:  ad908dc080e2d8ab26391d0013d2c8157ca0e2da
Gitweb:     http://git.kernel.org/tip/ad908dc080e2d8ab26391d0013d2c8157ca0e2da
Author:     Baoquan He <bhe@redhat.com>
AuthorDate: Wed, 25 May 2016 15:45:32 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 17 Jun 2016 11:03:47 +0200

x86/KASLR: Randomize virtual address separately

The current KASLR implementation randomizes the physical and virtual
addresses of the kernel together (both are offset by the same amount). It
calculates the delta between the physical address where vmlinux was linked
to load and where it is finally loaded. If the delta is not equal to 0
(i.e. the kernel was relocated), relocation handling needs to be done.

On 64-bit, this patch randomizes both the physical address where the
kernel is decompressed and the virtual address where the kernel text is
mapped and will execute from. Two values are now being chosen, so the
function arguments are reorganized to be passed by pointer, allowing them
to be updated directly. Since relocation handling depends only on the
virtual address, we must check the virtual delta, not the physical delta,
when processing kernel relocations. This also populates the page table
for the new
virtual address range. 32-bit does not support a separate virtual address,
so it continues to use the physical offset for its virtual offset.

Additionally, the sanity checks done on the resulting kernel addresses
are updated, since the two addresses are potentially separate now.

[kees: rewrote changelog, limited virtual split to 64-bit only, update checks]
[kees: fix CONFIG_RANDOMIZE_BASE=n boot failure]
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: H.J. Lu <hjl.tools@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1464216334-17200-4-git-send-email-keescook@chromium.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/boot/compressed/kaslr.c | 41 +++++++++++++++++----------------
 arch/x86/boot/compressed/misc.c  | 49 ++++++++++++++++++++++++----------------
 arch/x86/boot/compressed/misc.h  | 22 ++++++++++--------
 3 files changed, 64 insertions(+), 48 deletions(-)

diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index 54037c9..5550546 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -463,17 +463,20 @@ static unsigned long find_random_virt_addr(unsigned long minimum,
  * Since this function examines addresses much more numerically,
  * it takes the input and output pointers as 'unsigned long'.
  */
-unsigned char *choose_random_location(unsigned long input,
-				      unsigned long input_size,
-				      unsigned long output,
-				      unsigned long output_size)
+void choose_random_location(unsigned long input,
+			    unsigned long input_size,
+			    unsigned long *output,
+			    unsigned long output_size,
+			    unsigned long *virt_addr)
 {
-	unsigned long choice = output;
 	unsigned long random_addr;
 
+	/* By default, keep output position unchanged. */
+	*virt_addr = *output;
+
 	if (cmdline_find_option_bool("nokaslr")) {
 		warn("KASLR disabled: 'nokaslr' on cmdline.");
-		goto out;
+		return;
 	}
 
 	boot_params->hdr.loadflags |= KASLR_FLAG;
@@ -482,25 +485,25 @@ unsigned char *choose_random_location(unsigned long input,
 	initialize_identity_maps();
 
 	/* Record the various known unsafe memory ranges. */
-	mem_avoid_init(input, input_size, output);
+	mem_avoid_init(input, input_size, *output);
 
 	/* Walk e820 and find a random address. */
-	random_addr = find_random_phys_addr(output, output_size);
+	random_addr = find_random_phys_addr(*output, output_size);
 	if (!random_addr) {
 		warn("KASLR disabled: could not find suitable E820 region!");
-		goto out;
+	} else {
+		/* Update the new physical address location. */
+		if (*output != random_addr) {
+			add_identity_map(random_addr, output_size);
+			*output = random_addr;
+		}
 	}
 
-	/* Always enforce the minimum. */
-	if (random_addr < choice)
-		goto out;
-
-	choice = random_addr;
-
-	add_identity_map(choice, output_size);
-
 	/* This actually loads the identity pagetable on x86_64. */
 	finalize_identity_maps();
-out:
-	return (unsigned char *)choice;
+
+	/* Pick random virtual address starting from LOAD_PHYSICAL_ADDR. */
+	if (IS_ENABLED(CONFIG_X86_64))
+		random_addr = find_random_virt_addr(LOAD_PHYSICAL_ADDR, output_size);
+	*virt_addr = random_addr;
 }
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index f14db4e..b3c5a5f0 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -170,7 +170,8 @@ void __puthex(unsigned long value)
 }
 
 #if CONFIG_X86_NEED_RELOCS
-static void handle_relocations(void *output, unsigned long output_len)
+static void handle_relocations(void *output, unsigned long output_len,
+			       unsigned long virt_addr)
 {
 	int *reloc;
 	unsigned long delta, map, ptr;
@@ -182,11 +183,6 @@ static void handle_relocations(void *output, unsigned long output_len)
 	 * and where it was actually loaded.
 	 */
 	delta = min_addr - LOAD_PHYSICAL_ADDR;
-	if (!delta) {
-		debug_putstr("No relocation needed... ");
-		return;
-	}
-	debug_putstr("Performing relocations... ");
 
 	/*
 	 * The kernel contains a table of relocation addresses. Those
@@ -198,6 +194,20 @@ static void handle_relocations(void *output, unsigned long output_len)
 	map = delta - __START_KERNEL_map;
 
 	/*
+	 * 32-bit always performs relocations. 64-bit relocations are only
+	 * needed if KASLR has chosen a different starting address offset
+	 * from __START_KERNEL_map.
+	 */
+	if (IS_ENABLED(CONFIG_X86_64))
+		delta = virt_addr - LOAD_PHYSICAL_ADDR;
+
+	if (!delta) {
+		debug_putstr("No relocation needed... ");
+		return;
+	}
+	debug_putstr("Performing relocations... ");
+
+	/*
 	 * Process relocations: 32 bit relocations first then 64 bit after.
 	 * Three sets of binary relocations are added to the end of the kernel
 	 * before compression. Each relocation table entry is the kernel
@@ -250,7 +260,8 @@ static void handle_relocations(void *output, unsigned long output_len)
 #endif
 }
 #else
-static inline void handle_relocations(void *output, unsigned long output_len)
+static inline void handle_relocations(void *output, unsigned long output_len,
+				      unsigned long virt_addr)
 { }
 #endif
 
@@ -327,7 +338,7 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
 				  unsigned long output_len)
 {
 	const unsigned long kernel_total_size = VO__end - VO__text;
-	unsigned char *output_orig = output;
+	unsigned long virt_addr = (unsigned long)output;
 
 	/* Retain x86 boot parameters pointer passed from startup_32/64. */
 	boot_params = rmode;
@@ -366,13 +377,16 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
 	 * the entire decompressed kernel plus relocation table, or the
 	 * entire decompressed kernel plus .bss and .brk sections.
 	 */
-	output = choose_random_location((unsigned long)input_data, input_len,
-					(unsigned long)output,
-					max(output_len, kernel_total_size));
+	choose_random_location((unsigned long)input_data, input_len,
+				(unsigned long *)&output,
+				max(output_len, kernel_total_size),
+				&virt_addr);
 
 	/* Validate memory location choices. */
 	if ((unsigned long)output & (MIN_KERNEL_ALIGN - 1))
-		error("Destination address inappropriately aligned");
+		error("Destination physical address inappropriately aligned");
+	if (virt_addr & (MIN_KERNEL_ALIGN - 1))
+		error("Destination virtual address inappropriately aligned");
 #ifdef CONFIG_X86_64
 	if (heap > 0x3fffffffffffUL)
 		error("Destination address too large");
@@ -382,19 +396,16 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
 #endif
 #ifndef CONFIG_RELOCATABLE
 	if ((unsigned long)output != LOAD_PHYSICAL_ADDR)
-		error("Wrong destination address");
+		error("Destination address does not match LOAD_PHYSICAL_ADDR");
+	if ((unsigned long)output != virt_addr)
+		error("Destination virtual address changed when not relocatable");
 #endif
 
 	debug_putstr("\nDecompressing Linux... ");
 	__decompress(input_data, input_len, NULL, NULL, output, output_len,
 			NULL, error);
 	parse_elf(output);
-	/*
-	 * 32-bit always performs relocations. 64-bit relocations are only
-	 * needed if kASLR has chosen a different load address.
-	 */
-	if (!IS_ENABLED(CONFIG_X86_64) || output != output_orig)
-		handle_relocations(output, output_len);
+	handle_relocations(output, output_len, virt_addr);
 	debug_putstr("done.\nBooting the kernel.\n");
 	return output;
 }
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 09c4ddd..1c8355e 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -67,20 +67,22 @@ int cmdline_find_option_bool(const char *option);
 
 #if CONFIG_RANDOMIZE_BASE
 /* kaslr.c */
-unsigned char *choose_random_location(unsigned long input_ptr,
-				      unsigned long input_size,
-				      unsigned long output_ptr,
-				      unsigned long output_size);
+void choose_random_location(unsigned long input,
+			    unsigned long input_size,
+			    unsigned long *output,
+			    unsigned long output_size,
+			    unsigned long *virt_addr);
 /* cpuflags.c */
 bool has_cpuflag(int flag);
 #else
-static inline
-unsigned char *choose_random_location(unsigned long input_ptr,
-				      unsigned long input_size,
-				      unsigned long output_ptr,
-				      unsigned long output_size)
+static inline void choose_random_location(unsigned long input,
+					  unsigned long input_size,
+					  unsigned long *output,
+					  unsigned long output_size,
+					  unsigned long *virt_addr)
 {
-	return (unsigned char *)output_ptr;
+	/* No change from existing output location. */
+	*virt_addr = *output;
 }
 #endif
 

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [tip:x86/boot] x86/KASLR: Extend kernel image physical address randomization to addresses larger than 4G
  2016-05-25 22:45 ` [PATCH v9 4/5] x86/KASLR: Add physical address randomization >4G Kees Cook
@ 2016-06-17 12:23   ` tip-bot for Kees Cook
  2016-06-26 11:02   ` tip-bot for Kees Cook
  1 sibling, 0 replies; 21+ messages in thread
From: tip-bot for Kees Cook @ 2016-06-17 12:23 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hjl.tools, jpoimboe, hpa, yinghai, aryabinin, akpm, linux-kernel,
	tglx, dvyukov, peterz, brgerst, torvalds, bp, mingo, dvlasenk,
	keescook, luto, bhe

Commit-ID:  9099fab617cd0518cc537849623fb440315b3c91
Gitweb:     http://git.kernel.org/tip/9099fab617cd0518cc537849623fb440315b3c91
Author:     Kees Cook <keescook@chromium.org>
AuthorDate: Wed, 25 May 2016 15:45:33 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 17 Jun 2016 11:03:48 +0200

x86/KASLR: Extend kernel image physical address randomization to addresses larger than 4G

We want the physical address to be randomized anywhere between
16MB and the top of physical memory (up to 64TB).

This patch exchanges the prior slots[] array for the new slot_areas[]
array, and lifts the limitation of KERNEL_IMAGE_SIZE on the physical
address offset for 64-bit. As before, process_e820_entry() walks
memory and populates slot_areas[], splitting on any detected mem_avoid
collisions.

Finally, since the slots[] array and its associated functions are no
longer needed, they are removed.

Based on earlier patches by Baoquan He.
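
The per-region slot math is simple; a minimal sketch (the helper name
is illustrative, not from the patch):

  /*
   * Illustrative helper (not from this patch): a region yields one
   * slot per CONFIG_PHYSICAL_ALIGN-aligned start address at which the
   * decompressed image still fits.
   */
  static unsigned long slots_in_region(unsigned long region_size,
  				     unsigned long image_size)
  {
  	if (region_size < image_size)
  		return 0;
  	return (region_size - image_size) / CONFIG_PHYSICAL_ALIGN + 1;
  }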

Originally-from: Baoquan He <bhe@redhat.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: H.J. Lu <hjl.tools@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1464216334-17200-5-git-send-email-keescook@chromium.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/Kconfig                 |  27 +++++----
 arch/x86/boot/compressed/kaslr.c | 115 +++++++++++++++++++++++----------------
 2 files changed, 85 insertions(+), 57 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 0a7b885..770ae52 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1934,21 +1934,26 @@ config RANDOMIZE_BASE
 	  attempts relying on knowledge of the location of kernel
 	  code internals.
 
-	  The kernel physical and virtual address can be randomized
-	  from 16MB up to 1GB on 64-bit and 512MB on 32-bit. (Note that
-	  using RANDOMIZE_BASE reduces the memory space available to
-	  kernel modules from 1.5GB to 1GB.)
+	  On 64-bit, the kernel physical and virtual addresses are
+	  randomized separately. The physical address will be anywhere
+	  between 16MB and the top of physical memory (up to 64TB). The
+	  virtual address will be randomized from 16MB up to 1GB (9 bits
+	  of entropy). Note that this also reduces the memory space
+	  available to kernel modules from 1.5GB to 1GB.
+
+	  On 32-bit, the kernel physical and virtual addresses are
+	  randomized together. They will be randomized from 16MB up to
+	  512MB (8 bits of entropy).
 
 	  Entropy is generated using the RDRAND instruction if it is
 	  supported. If RDTSC is supported, its value is mixed into
 	  the entropy pool as well. If neither RDRAND nor RDTSC are
-	  supported, then entropy is read from the i8254 timer.
-
-	  Since the kernel is built using 2GB addressing, and
-	  PHYSICAL_ALIGN must be at a minimum of 2MB, only 10 bits of
-	  entropy is theoretically possible. Currently, with the
-	  default value for PHYSICAL_ALIGN and due to page table
-	  layouts, 64-bit uses 9 bits of entropy and 32-bit uses 8 bits.
+	  supported, then entropy is read from the i8254 timer. The
+	  usable entropy is limited by the kernel being built using
+	  2GB addressing, and that PHYSICAL_ALIGN must be at a
+	  minimum of 2MB. As a result, only 10 bits of entropy are
+	  theoretically possible, but the implementations are further
+	  limited due to memory layouts.
 
 	  If CONFIG_HIBERNATE is also enabled, KASLR is disabled at boot
 	  time. To enable it, boot with "kaslr" on the kernel command
diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index 5550546..36e2811 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -132,17 +132,6 @@ enum mem_avoid_index {
 
 static struct mem_vector mem_avoid[MEM_AVOID_MAX];
 
-static bool mem_contains(struct mem_vector *region, struct mem_vector *item)
-{
-	/* Item at least partially before region. */
-	if (item->start < region->start)
-		return false;
-	/* Item at least partially after region. */
-	if (item->start + item->size > region->start + region->size)
-		return false;
-	return true;
-}
-
 static bool mem_overlaps(struct mem_vector *one, struct mem_vector *two)
 {
 	/* Item one is entirely before item two. */
@@ -319,8 +308,6 @@ static bool mem_avoid_overlap(struct mem_vector *img,
 	return is_overlapping;
 }
 
-static unsigned long slots[KERNEL_IMAGE_SIZE / CONFIG_PHYSICAL_ALIGN];
-
 struct slot_area {
 	unsigned long addr;
 	int num;
@@ -351,36 +338,44 @@ static void store_slot_info(struct mem_vector *region, unsigned long image_size)
 	}
 }
 
-static void slots_append(unsigned long addr)
-{
-	/* Overflowing the slots list should be impossible. */
-	if (slot_max >= KERNEL_IMAGE_SIZE / CONFIG_PHYSICAL_ALIGN)
-		return;
-
-	slots[slot_max++] = addr;
-}
-
 static unsigned long slots_fetch_random(void)
 {
+	unsigned long slot;
+	int i;
+
 	/* Handle case of no slots stored. */
 	if (slot_max == 0)
 		return 0;
 
-	return slots[get_random_long("Physical") % slot_max];
+	slot = get_random_long("Physical") % slot_max;
+
+	for (i = 0; i < slot_area_index; i++) {
+		if (slot >= slot_areas[i].num) {
+			slot -= slot_areas[i].num;
+			continue;
+		}
+		return slot_areas[i].addr + slot * CONFIG_PHYSICAL_ALIGN;
+	}
+
+	if (i == slot_area_index)
+		debug_putstr("slots_fetch_random() failed!?\n");
+	return 0;
 }
 
 static void process_e820_entry(struct e820entry *entry,
 			       unsigned long minimum,
 			       unsigned long image_size)
 {
-	struct mem_vector region, img, overlap;
+	struct mem_vector region, overlap;
+	struct slot_area slot_area;
+	unsigned long start_orig;
 
 	/* Skip non-RAM entries. */
 	if (entry->type != E820_RAM)
 		return;
 
-	/* Ignore entries entirely above our maximum. */
-	if (entry->addr >= KERNEL_IMAGE_SIZE)
+	/* On 32-bit, ignore entries entirely above our maximum. */
+	if (IS_ENABLED(CONFIG_X86_32) && entry->addr >= KERNEL_IMAGE_SIZE)
 		return;
 
 	/* Ignore entries entirely below our minimum. */
@@ -390,31 +385,55 @@ static void process_e820_entry(struct e820entry *entry,
 	region.start = entry->addr;
 	region.size = entry->size;
 
-	/* Potentially raise address to minimum location. */
-	if (region.start < minimum)
-		region.start = minimum;
+	/* Give up if slot area array is full. */
+	while (slot_area_index < MAX_SLOT_AREA) {
+		start_orig = region.start;
 
-	/* Potentially raise address to meet alignment requirements. */
-	region.start = ALIGN(region.start, CONFIG_PHYSICAL_ALIGN);
+		/* Potentially raise address to minimum location. */
+		if (region.start < minimum)
+			region.start = minimum;
 
-	/* Did we raise the address above the bounds of this e820 region? */
-	if (region.start > entry->addr + entry->size)
-		return;
+		/* Potentially raise address to meet alignment needs. */
+		region.start = ALIGN(region.start, CONFIG_PHYSICAL_ALIGN);
 
-	/* Reduce size by any delta from the original address. */
-	region.size -= region.start - entry->addr;
+		/* Did we raise the address above this e820 region? */
+		if (region.start > entry->addr + entry->size)
+			return;
 
-	/* Reduce maximum size to fit end of image within maximum limit. */
-	if (region.start + region.size > KERNEL_IMAGE_SIZE)
-		region.size = KERNEL_IMAGE_SIZE - region.start;
+		/* Reduce size by any delta from the original address. */
+		region.size -= region.start - start_orig;
 
-	/* Walk each aligned slot and check for avoided areas. */
-	for (img.start = region.start, img.size = image_size ;
-	     mem_contains(&region, &img) ;
-	     img.start += CONFIG_PHYSICAL_ALIGN) {
-		if (mem_avoid_overlap(&img, &overlap))
-			continue;
-		slots_append(img.start);
+		/* On 32-bit, reduce region size to fit within max size. */
+		if (IS_ENABLED(CONFIG_X86_32) &&
+		    region.start + region.size > KERNEL_IMAGE_SIZE)
+			region.size = KERNEL_IMAGE_SIZE - region.start;
+
+		/* Return if region can't contain decompressed kernel */
+		if (region.size < image_size)
+			return;
+
+		/* If nothing overlaps, store the region and return. */
+		if (!mem_avoid_overlap(&region, &overlap)) {
+			store_slot_info(&region, image_size);
+			return;
+		}
+
+		/* Store beginning of region if holds at least image_size. */
+		if (overlap.start > region.start + image_size) {
+			struct mem_vector beginning;
+
+			beginning.start = region.start;
+			beginning.size = overlap.start - region.start;
+			store_slot_info(&beginning, image_size);
+		}
+
+		/* Return if overlap extends to or past end of region. */
+		if (overlap.start + overlap.size >= region.start + region.size)
+			return;
+
+		/* Clip off the overlapping region and start over. */
+		region.size -= overlap.start - region.start + overlap.size;
+		region.start = overlap.start + overlap.size;
 	}
 }
 
@@ -431,6 +450,10 @@ static unsigned long find_random_phys_addr(unsigned long minimum,
 	for (i = 0; i < boot_params->e820_entries; i++) {
 		process_e820_entry(&boot_params->e820_map[i], minimum,
 				   image_size);
+		if (slot_area_index == MAX_SLOT_AREA) {
+			debug_putstr("Aborted e820 scan (slot_areas full)!\n");
+			break;
+		}
 	}
 
 	return slots_fetch_random();

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [tip:x86/boot] x86/KASLR: Allow randomization below the load address
  2016-05-25 22:45 ` [PATCH v9 5/5] x86/KASLR: Allow randomization below load address Kees Cook
  2016-06-17  8:47   ` Ingo Molnar
@ 2016-06-17 12:23   ` tip-bot for Yinghai Lu
  2016-06-26 11:03   ` tip-bot for Yinghai Lu
  2 siblings, 0 replies; 21+ messages in thread
From: tip-bot for Yinghai Lu @ 2016-06-17 12:23 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: keescook, dvyukov, tglx, jpoimboe, hpa, torvalds, dvlasenk, luto,
	yinghai, peterz, hjl.tools, mingo, akpm, linux-kernel, bhe,
	brgerst, aryabinin, bp

Commit-ID:  00bdbb0a0d6e5c7235cb8faa298c9f494e088499
Gitweb:     http://git.kernel.org/tip/00bdbb0a0d6e5c7235cb8faa298c9f494e088499
Author:     Yinghai Lu <yinghai@kernel.org>
AuthorDate: Wed, 25 May 2016 15:45:34 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Fri, 17 Jun 2016 11:03:49 +0200

x86/KASLR: Allow randomization below the load address

Currently the kernel image physical address randomization's lower
boundary is the original kernel load address.

For bootloaders that load kernels into very high memory (e.g. kexec),
this means randomization takes place in a very small window at the
top of memory, ignoring the large region of physical memory below
the load address.

Since mem_avoid[] is already correctly tracking the regions that must be
avoided, this patch changes the minimum address to whatever is less:
512M (to conservatively avoid unknown things in lower memory) or the
load address. Now, for example, if the kernel is loaded at 8G, [512M,
8G) will be added to the list of possible physical memory positions.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
[ Rewrote the changelog, refactored the code to use min(). ]
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: H.J. Lu <hjl.tools@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1464216334-17200-6-git-send-email-keescook@chromium.org
[ Edited the changelog some more, plus the code comment as well. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/boot/compressed/kaslr.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index 36e2811..749c9e0 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -492,7 +492,7 @@ void choose_random_location(unsigned long input,
 			    unsigned long output_size,
 			    unsigned long *virt_addr)
 {
-	unsigned long random_addr;
+	unsigned long random_addr, min_addr;
 
 	/* By default, keep output position unchanged. */
 	*virt_addr = *output;
@@ -510,8 +510,15 @@ void choose_random_location(unsigned long input,
 	/* Record the various known unsafe memory ranges. */
 	mem_avoid_init(input, input_size, *output);
 
+	/*
+	 * Low end of the randomization range should be the
+	 * smaller of 512M or the initial kernel image
+	 * location:
+	 */
+	min_addr = min(*output, 512UL << 20);
+
 	/* Walk e820 and find a random address. */
-	random_addr = find_random_phys_addr(*output, output_size);
+	random_addr = find_random_phys_addr(min_addr, output_size);
 	if (!random_addr) {
 		warn("KASLR disabled: could not find suitable E820 region!");
 	} else {

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH v9 5/5] x86/KASLR: Allow randomization below load address
  2016-06-17  8:47   ` Ingo Molnar
@ 2016-06-17 15:44     ` Kees Cook
  2016-06-17 18:44       ` Yinghai Lu
  0 siblings, 1 reply; 21+ messages in thread
From: Kees Cook @ 2016-06-17 15:44 UTC (permalink / raw)
  To: Yinghai Lu, Baoquan He
  Cc: Ingo Molnar, Borislav Petkov, H. Peter Anvin, Thomas Gleixner,
	Ingo Molnar, x86, Andrew Morton, Josh Poimboeuf, Andrey Ryabinin,
	H.J. Lu, Dmitry Vyukov, LKML

On Fri, Jun 17, 2016 at 1:47 AM, Ingo Molnar <mingo@kernel.org> wrote:
>
> * Kees Cook <keescook@chromium.org> wrote:
>
>> From: Yinghai Lu <yinghai@kernel.org>
>>
>> Currently the physical randomization's lower boundary is the original
>> kernel load address. For bootloaders that load kernels into very high
>> memory (e.g. kexec), this means randomization takes place in a very small
>> window at the top of memory, ignoring the large region of physical memory
>> below the load address.
>>
>> Since mem_avoid is already correctly tracking the regions that must be
>> avoided, this patch changes the minimum address to whatever is less:
>> 512M (to conservatively avoid unknown things in lower memory) or the
>> load address. Now, for example, if the kernel is loaded at 8G, [512M,
>> 8G) will be added into possible physical memory positions.
>>
>> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
>> [kees: rewrote changelog, refactor to use min()]
>> Signed-off-by: Kees Cook <keescook@chromium.org>
>> ---
>>  arch/x86/boot/compressed/kaslr.c | 7 +++++--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
>> index d0a823df183b..304c5c369aff 100644
>> --- a/arch/x86/boot/compressed/kaslr.c
>> +++ b/arch/x86/boot/compressed/kaslr.c
>> @@ -492,7 +492,7 @@ void choose_random_location(unsigned long input,
>>                           unsigned long output_size,
>>                           unsigned long *virt_addr)
>>  {
>> -     unsigned long random_addr;
>> +     unsigned long random_addr, min_addr;
>>
>>       /* By default, keep output position unchanged. */
>>       *virt_addr = *output;
>> @@ -517,8 +517,11 @@ void choose_random_location(unsigned long input,
>>       /* Record the various known unsafe memory ranges. */
>>       mem_avoid_init(input, input_size, *output);
>>
>> +     /* Low end should be the smaller of 512M or initial location. */
>> +     min_addr = min(*output, 512UL << 20);
>> +
>>       /* Walk e820 and find a random address. */
>> -     random_addr = find_random_phys_addr(*output, output_size);
>> +     random_addr = find_random_phys_addr(min_addr, output_size);
>>       if (!random_addr) {
>>               warn("KASLR disabled: could not find suitable E820 region!");
>>       } else {
>
> There's no explanation in the code or in the changelog of why 512M was picked as
> the lower limit.

Yinghai, do you have a rationale for this selection? I understood it
to just be a very conservative target to avoid anything in low
physical memory, but perhaps there is a better reason?

-Kees

-- 
Kees Cook
Chrome OS & Brillo Security

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v9 5/5] x86/KASLR: Allow randomization below load address
  2016-06-17 15:44     ` Kees Cook
@ 2016-06-17 18:44       ` Yinghai Lu
  0 siblings, 0 replies; 21+ messages in thread
From: Yinghai Lu @ 2016-06-17 18:44 UTC (permalink / raw)
  To: Kees Cook
  Cc: Baoquan He, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	Thomas Gleixner, Ingo Molnar, x86, Andrew Morton, Josh Poimboeuf,
	Andrey Ryabinin, H.J. Lu, Dmitry Vyukov, LKML

On Fri, Jun 17, 2016 at 8:44 AM, Kees Cook <keescook@chromium.org> wrote:
>>
>> There's no explanation in the code or in the changelog of why 512M was picked as
>> the lower limit.
>
> Yinghai, do you have a rationale for this selection? I understood it
> to just be a very conservative target to avoid anything in low
> physical memory, but perhaps there is a better reason?

When the kernel is not loaded high initially, *output should be around 16M,
so nothing changes.

When the kernel is loaded high to keep the low address space free, we don't
want KASLR to pull it back down to a low address again. If the floor were
4G, then on a 4G+512M configuration with the kernel loaded high, KASLR
might not be able to choose any range starting from 4G. So I chose 512M,
to simply stay clear of the KERNEL_IMAGE_SIZE range.
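
To put rough numbers on this (an illustration, not a measured system):
on a machine with 4G+512M of RAM and the kernel loaded near the top,

  floor = 4G:    candidate window is only [4G, 4.5G)
  floor = 512M:  candidate window is [512M, 4.5G), while still avoiding
                 the lowest 512M of physical memory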

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [tip:x86/boot] x86/boot: Refuse to build with data relocations
  2016-05-25 22:45 ` [PATCH v9 1/5] x86/boot: Refuse to build with data relocations Kees Cook
  2016-06-17 12:22   ` [tip:x86/boot] " tip-bot for Kees Cook
@ 2016-06-26 11:01   ` tip-bot for Kees Cook
  1 sibling, 0 replies; 21+ messages in thread
From: tip-bot for Kees Cook @ 2016-06-26 11:01 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: torvalds, tglx, mingo, linux-kernel, keescook, aryabinin,
	hjl.tools, dvyukov, hpa, brgerst, bp, akpm, bhe, luto, yinghai,
	peterz, jpoimboe, dvlasenk

Commit-ID:  98f78525371b55ccd1c480207ce10296c72fa340
Gitweb:     http://git.kernel.org/tip/98f78525371b55ccd1c480207ce10296c72fa340
Author:     Kees Cook <keescook@chromium.org>
AuthorDate: Wed, 25 May 2016 15:45:30 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sun, 26 Jun 2016 12:32:03 +0200

x86/boot: Refuse to build with data relocations

The compressed kernel is built with -fPIC/-fPIE so that it can run in any
location a bootloader happens to put it. However, since ELF relocation
processing is not happening (and all the relocation information has
already been stripped at link time), none of the code can use data
relocations (e.g. static assignments of pointers). This is already noted
in a warning comment at the top of misc.c, but this adds an explicit
check for the condition during the linking stage to block any such bugs
from appearing.

If this was in place with the earlier bug in pagetable.c, the build
would fail like this:

  ...
    CC      arch/x86/boot/compressed/pagetable.o
    DATAREL arch/x86/boot/compressed/vmlinux
  error: arch/x86/boot/compressed/pagetable.o has data relocations!
  make[2]: *** [arch/x86/boot/compressed/vmlinux] Error 1
  ...

A clean build shows:

  ...
    CC      arch/x86/boot/compressed/pagetable.o
    DATAREL arch/x86/boot/compressed/vmlinux
    LD      arch/x86/boot/compressed/vmlinux
  ...
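
For illustration (a made-up fragment, not code from the tree), this is
the kind of construct the check rejects, along with the runtime
assignment pattern that avoids it:

  static int val;

  /*
   * Data relocation: the initializer bakes the address of 'val' into
   * the object (it lands in .data.rel.local, which the DATAREL grep
   * matches), and that address would be wrong at any other load
   * address.
   */
  static int *bad_ptr = &val;

  /*
   * Safe alternative: leave the pointer zero and assign at run time,
   * where -fPIE code computes the address RIP-relative.
   */
  static int *good_ptr;

  static void init_ptrs(void)
  {
  	good_ptr = &val;
  }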

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: H.J. Lu <hjl.tools@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1464216334-17200-2-git-send-email-keescook@chromium.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/boot/compressed/Makefile | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile
index f135688..536ccfc 100644
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -85,7 +85,25 @@ vmlinux-objs-$(CONFIG_EFI_STUB) += $(obj)/eboot.o $(obj)/efi_stub_$(BITS).o \
 	$(objtree)/drivers/firmware/efi/libstub/lib.a
 vmlinux-objs-$(CONFIG_EFI_MIXED) += $(obj)/efi_thunk_$(BITS).o
 
+# The compressed kernel is built with -fPIC/-fPIE so that a boot loader
+# can place it anywhere in memory and it will still run. However, since
+# it is executed as-is without any ELF relocation processing performed
+# (and has already had all relocation sections stripped from the binary),
+# none of the code can use data relocations (e.g. static assignments of
+# pointer values), since they will be meaningless at runtime. This check
+# will refuse to link the vmlinux if any of these relocations are found.
+quiet_cmd_check_data_rel = DATAREL $@
+define cmd_check_data_rel
+	for obj in $(filter %.o,$^); do \
+		readelf -S $$obj | grep -qF .rel.local && { \
+			echo "error: $$obj has data relocations!" >&2; \
+			exit 1; \
+		} || true; \
+	done
+endef
+
 $(obj)/vmlinux: $(vmlinux-objs-y) FORCE
+	$(call if_changed,check_data_rel)
 	$(call if_changed,ld)
 
 OBJCOPYFLAGS_vmlinux.bin :=  -R .comment -S

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [tip:x86/boot] x86/KASLR: Clarify identity map interface
  2016-05-25 22:45 ` [PATCH v9 2/5] x86/KASLR: Clarify identity map interface Kees Cook
  2016-06-17 12:22   ` [tip:x86/boot] " tip-bot for Kees Cook
@ 2016-06-26 11:02   ` tip-bot for Kees Cook
  1 sibling, 0 replies; 21+ messages in thread
From: tip-bot for Kees Cook @ 2016-06-26 11:02 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hjl.tools, jpoimboe, torvalds, bp, brgerst, yinghai, dvlasenk,
	luto, peterz, dvyukov, bhe, hpa, aryabinin, linux-kernel, mingo,
	keescook, akpm, tglx

Commit-ID:  11fdf97a3cd1a5a27625f820ceb74e1caba4fd26
Gitweb:     http://git.kernel.org/tip/11fdf97a3cd1a5a27625f820ceb74e1caba4fd26
Author:     Kees Cook <keescook@chromium.org>
AuthorDate: Wed, 25 May 2016 15:45:31 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sun, 26 Jun 2016 12:32:04 +0200

x86/KASLR: Clarify identity map interface

This extracts the call to prepare_level4() into a top-level function
that the user of the pagetable.c interface must call to initialize
the new page tables. For clarity and to match the "finalize" function,
it has been renamed to initialize_identity_maps(). This function also
gains the initialization of mapping_info so we don't have to do it each
time in add_identity_map().

Additionally, add a copyright notice at the top, to make it clear that the
bulk of the pagetable.c code was written by Yinghai, and that I just
added bugs later. :)
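
In kaslr.c the interface is now used as a simple three-step sequence
(sketched from the diff below):

  initialize_identity_maps();	/* init mapping_info, prepare top level */
  ...
  add_identity_map(addr, size);	/* called for each region to map */
  ...
  finalize_identity_maps();	/* load the pagetable (x86_64 only) */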

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: H.J. Lu <hjl.tools@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1464216334-17200-3-git-send-email-keescook@chromium.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/boot/compressed/kaslr.c     |  3 +++
 arch/x86/boot/compressed/misc.h      |  3 +++
 arch/x86/boot/compressed/pagetable.c | 26 ++++++++++++++++----------
 3 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index dff4217..54037c9 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -478,6 +478,9 @@ unsigned char *choose_random_location(unsigned long input,
 
 	boot_params->hdr.loadflags |= KASLR_FLAG;
 
+	/* Prepare to add new identity pagetables on demand. */
+	initialize_identity_maps();
+
 	/* Record the various known unsafe memory ranges. */
 	mem_avoid_init(input, input_size, output);
 
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index b6fec1f..09c4ddd 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -85,10 +85,13 @@ unsigned char *choose_random_location(unsigned long input_ptr,
 #endif
 
 #ifdef CONFIG_X86_64
+void initialize_identity_maps(void);
 void add_identity_map(unsigned long start, unsigned long size);
 void finalize_identity_maps(void);
 extern unsigned char _pgtable[];
 #else
+static inline void initialize_identity_maps(void)
+{ }
 static inline void add_identity_map(unsigned long start, unsigned long size)
 { }
 static inline void finalize_identity_maps(void)
diff --git a/arch/x86/boot/compressed/pagetable.c b/arch/x86/boot/compressed/pagetable.c
index 34b95df..6e31a6a 100644
--- a/arch/x86/boot/compressed/pagetable.c
+++ b/arch/x86/boot/compressed/pagetable.c
@@ -2,6 +2,9 @@
  * This code is used on x86_64 to create page table identity mappings on
  * demand by building up a new set of page tables (or appending to the
  * existing ones), and then switching over to them when ready.
+ *
+ * Copyright (C) 2015-2016  Yinghai Lu
+ * Copyright (C)      2016  Kees Cook
  */
 
 /*
@@ -59,9 +62,21 @@ static struct alloc_pgt_data pgt_data;
 /* The top level page table entry pointer. */
 static unsigned long level4p;
 
+/*
+ * Mapping information structure passed to kernel_ident_mapping_init().
+ * Due to relocation, pointers must be assigned at run time not build time.
+ */
+static struct x86_mapping_info mapping_info = {
+	.pmd_flag       = __PAGE_KERNEL_LARGE_EXEC,
+};
+
 /* Locates and clears a region for a new top level page table. */
-static void prepare_level4(void)
+void initialize_identity_maps(void)
 {
+	/* Init mapping_info with run-time function/buffer pointers. */
+	mapping_info.alloc_pgt_page = alloc_pgt_page;
+	mapping_info.context = &pgt_data;
+
 	/*
 	 * It should be impossible for this not to already be true,
 	 * but since calling this a second time would rewind the other
@@ -96,17 +111,8 @@ static void prepare_level4(void)
  */
 void add_identity_map(unsigned long start, unsigned long size)
 {
-	struct x86_mapping_info mapping_info = {
-		.alloc_pgt_page	= alloc_pgt_page,
-		.context	= &pgt_data,
-		.pmd_flag	= __PAGE_KERNEL_LARGE_EXEC,
-	};
 	unsigned long end = start + size;
 
-	/* Make sure we have a top level page table ready to use. */
-	if (!level4p)
-		prepare_level4();
-
 	/* Align boundary to 2M. */
 	start = round_down(start, PMD_SIZE);
 	end = round_up(end, PMD_SIZE);

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [tip:x86/boot] x86/KASLR: Randomize virtual address separately
  2016-05-25 22:45 ` [PATCH v9 3/5] x86/KASLR: Randomize virtual address separately Kees Cook
  2016-06-17  8:20   ` Ingo Molnar
  2016-06-17 12:22   ` [tip:x86/boot] " tip-bot for Baoquan He
@ 2016-06-26 11:02   ` tip-bot for Baoquan He
  2 siblings, 0 replies; 21+ messages in thread
From: tip-bot for Baoquan He @ 2016-06-26 11:02 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: dvyukov, bp, aryabinin, luto, brgerst, hjl.tools, mingo,
	dvlasenk, linux-kernel, hpa, jpoimboe, keescook, peterz, yinghai,
	tglx, akpm, torvalds, bhe

Commit-ID:  8391c73c96f28d4e8c40fd401fd0c9c04391b44a
Gitweb:     http://git.kernel.org/tip/8391c73c96f28d4e8c40fd401fd0c9c04391b44a
Author:     Baoquan He <bhe@redhat.com>
AuthorDate: Wed, 25 May 2016 15:45:32 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sun, 26 Jun 2016 12:32:04 +0200

x86/KASLR: Randomize virtual address separately

The current KASLR implementation randomizes the physical and virtual
addresses of the kernel together (both are offset by the same amount). It
calculates the delta of the physical address where vmlinux was linked
to load and where it is finally loaded. If the delta is not equal to 0
(i.e. the kernel was relocated), relocation handling needs to be done.

On 64-bit, this patch randomizes both the physical address where kernel
is decompressed and the virtual address where kernel text is mapped and
will execute from. We now have two values being chosen, so the function
arguments are reorganized to pass by pointer so they can be directly
updated. Since relocation handling only depends on the virtual address,
we must check the virtual delta, not the physical delta for processing
kernel relocations. This also populates the page table for the new
virtual address range. 32-bit does not support a separate virtual address,
so it continues to use the physical offset for its virtual offset.

This additionally updates the sanity checks done on the resulting kernel
addresses, since they are now potentially separate.

[kees: rewrote changelog, limited virtual split to 64-bit only, update checks]
[kees: fix CONFIG_RANDOMIZE_BASE=n boot failure]
Signed-off-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: H.J. Lu <hjl.tools@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1464216334-17200-4-git-send-email-keescook@chromium.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/boot/compressed/kaslr.c | 41 +++++++++++++++++----------------
 arch/x86/boot/compressed/misc.c  | 49 ++++++++++++++++++++++++----------------
 arch/x86/boot/compressed/misc.h  | 22 ++++++++++--------
 3 files changed, 64 insertions(+), 48 deletions(-)

diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index 54037c9..5550546 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -463,17 +463,20 @@ static unsigned long find_random_virt_addr(unsigned long minimum,
  * Since this function examines addresses much more numerically,
  * it takes the input and output pointers as 'unsigned long'.
  */
-unsigned char *choose_random_location(unsigned long input,
-				      unsigned long input_size,
-				      unsigned long output,
-				      unsigned long output_size)
+void choose_random_location(unsigned long input,
+			    unsigned long input_size,
+			    unsigned long *output,
+			    unsigned long output_size,
+			    unsigned long *virt_addr)
 {
-	unsigned long choice = output;
 	unsigned long random_addr;
 
+	/* By default, keep output position unchanged. */
+	*virt_addr = *output;
+
 	if (cmdline_find_option_bool("nokaslr")) {
 		warn("KASLR disabled: 'nokaslr' on cmdline.");
-		goto out;
+		return;
 	}
 
 	boot_params->hdr.loadflags |= KASLR_FLAG;
@@ -482,25 +485,25 @@ unsigned char *choose_random_location(unsigned long input,
 	initialize_identity_maps();
 
 	/* Record the various known unsafe memory ranges. */
-	mem_avoid_init(input, input_size, output);
+	mem_avoid_init(input, input_size, *output);
 
 	/* Walk e820 and find a random address. */
-	random_addr = find_random_phys_addr(output, output_size);
+	random_addr = find_random_phys_addr(*output, output_size);
 	if (!random_addr) {
 		warn("KASLR disabled: could not find suitable E820 region!");
-		goto out;
+	} else {
+		/* Update the new physical address location. */
+		if (*output != random_addr) {
+			add_identity_map(random_addr, output_size);
+			*output = random_addr;
+		}
 	}
 
-	/* Always enforce the minimum. */
-	if (random_addr < choice)
-		goto out;
-
-	choice = random_addr;
-
-	add_identity_map(choice, output_size);
-
 	/* This actually loads the identity pagetable on x86_64. */
 	finalize_identity_maps();
-out:
-	return (unsigned char *)choice;
+
+	/* Pick random virtual address starting from LOAD_PHYSICAL_ADDR. */
+	if (IS_ENABLED(CONFIG_X86_64))
+		random_addr = find_random_virt_addr(LOAD_PHYSICAL_ADDR, output_size);
+	*virt_addr = random_addr;
 }
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index f14db4e..b3c5a5f0 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -170,7 +170,8 @@ void __puthex(unsigned long value)
 }
 
 #if CONFIG_X86_NEED_RELOCS
-static void handle_relocations(void *output, unsigned long output_len)
+static void handle_relocations(void *output, unsigned long output_len,
+			       unsigned long virt_addr)
 {
 	int *reloc;
 	unsigned long delta, map, ptr;
@@ -182,11 +183,6 @@ static void handle_relocations(void *output, unsigned long output_len)
 	 * and where it was actually loaded.
 	 */
 	delta = min_addr - LOAD_PHYSICAL_ADDR;
-	if (!delta) {
-		debug_putstr("No relocation needed... ");
-		return;
-	}
-	debug_putstr("Performing relocations... ");
 
 	/*
 	 * The kernel contains a table of relocation addresses. Those
@@ -198,6 +194,20 @@ static void handle_relocations(void *output, unsigned long output_len)
 	map = delta - __START_KERNEL_map;
 
 	/*
+	 * 32-bit always performs relocations. 64-bit relocations are only
+	 * needed if KASLR has chosen a different starting address offset
+	 * from __START_KERNEL_map.
+	 */
+	if (IS_ENABLED(CONFIG_X86_64))
+		delta = virt_addr - LOAD_PHYSICAL_ADDR;
+
+	if (!delta) {
+		debug_putstr("No relocation needed... ");
+		return;
+	}
+	debug_putstr("Performing relocations... ");
+
+	/*
 	 * Process relocations: 32 bit relocations first then 64 bit after.
 	 * Three sets of binary relocations are added to the end of the kernel
 	 * before compression. Each relocation table entry is the kernel
@@ -250,7 +260,8 @@ static void handle_relocations(void *output, unsigned long output_len)
 #endif
 }
 #else
-static inline void handle_relocations(void *output, unsigned long output_len)
+static inline void handle_relocations(void *output, unsigned long output_len,
+				      unsigned long virt_addr)
 { }
 #endif
 
@@ -327,7 +338,7 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
 				  unsigned long output_len)
 {
 	const unsigned long kernel_total_size = VO__end - VO__text;
-	unsigned char *output_orig = output;
+	unsigned long virt_addr = (unsigned long)output;
 
 	/* Retain x86 boot parameters pointer passed from startup_32/64. */
 	boot_params = rmode;
@@ -366,13 +377,16 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
 	 * the entire decompressed kernel plus relocation table, or the
 	 * entire decompressed kernel plus .bss and .brk sections.
 	 */
-	output = choose_random_location((unsigned long)input_data, input_len,
-					(unsigned long)output,
-					max(output_len, kernel_total_size));
+	choose_random_location((unsigned long)input_data, input_len,
+				(unsigned long *)&output,
+				max(output_len, kernel_total_size),
+				&virt_addr);
 
 	/* Validate memory location choices. */
 	if ((unsigned long)output & (MIN_KERNEL_ALIGN - 1))
-		error("Destination address inappropriately aligned");
+		error("Destination physical address inappropriately aligned");
+	if (virt_addr & (MIN_KERNEL_ALIGN - 1))
+		error("Destination virtual address inappropriately aligned");
 #ifdef CONFIG_X86_64
 	if (heap > 0x3fffffffffffUL)
 		error("Destination address too large");
@@ -382,19 +396,16 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
 #endif
 #ifndef CONFIG_RELOCATABLE
 	if ((unsigned long)output != LOAD_PHYSICAL_ADDR)
-		error("Wrong destination address");
+		error("Destination address does not match LOAD_PHYSICAL_ADDR");
+	if ((unsigned long)output != virt_addr)
+		error("Destination virtual address changed when not relocatable");
 #endif
 
 	debug_putstr("\nDecompressing Linux... ");
 	__decompress(input_data, input_len, NULL, NULL, output, output_len,
 			NULL, error);
 	parse_elf(output);
-	/*
-	 * 32-bit always performs relocations. 64-bit relocations are only
-	 * needed if kASLR has chosen a different load address.
-	 */
-	if (!IS_ENABLED(CONFIG_X86_64) || output != output_orig)
-		handle_relocations(output, output_len);
+	handle_relocations(output, output_len, virt_addr);
 	debug_putstr("done.\nBooting the kernel.\n");
 	return output;
 }
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index 09c4ddd..1c8355e 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -67,20 +67,22 @@ int cmdline_find_option_bool(const char *option);
 
 #if CONFIG_RANDOMIZE_BASE
 /* kaslr.c */
-unsigned char *choose_random_location(unsigned long input_ptr,
-				      unsigned long input_size,
-				      unsigned long output_ptr,
-				      unsigned long output_size);
+void choose_random_location(unsigned long input,
+			    unsigned long input_size,
+			    unsigned long *output,
+			    unsigned long output_size,
+			    unsigned long *virt_addr);
 /* cpuflags.c */
 bool has_cpuflag(int flag);
 #else
-static inline
-unsigned char *choose_random_location(unsigned long input_ptr,
-				      unsigned long input_size,
-				      unsigned long output_ptr,
-				      unsigned long output_size)
+static inline void choose_random_location(unsigned long input,
+					  unsigned long input_size,
+					  unsigned long *output,
+					  unsigned long output_size,
+					  unsigned long *virt_addr)
 {
-	return (unsigned char *)output_ptr;
+	/* No change from existing output location. */
+	*virt_addr = *output;
 }
 #endif
 

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [tip:x86/boot] x86/KASLR: Extend kernel image physical address randomization to addresses larger than 4G
  2016-05-25 22:45 ` [PATCH v9 4/5] x86/KASLR: Add physical address randomization >4G Kees Cook
  2016-06-17 12:23   ` [tip:x86/boot] x86/KASLR: Extend kernel image physical address randomization to addresses larger than 4G tip-bot for Kees Cook
@ 2016-06-26 11:02   ` tip-bot for Kees Cook
  1 sibling, 0 replies; 21+ messages in thread
From: tip-bot for Kees Cook @ 2016-06-26 11:02 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, mingo, luto, hjl.tools, dvlasenk, dvyukov, peterz, brgerst,
	aryabinin, keescook, yinghai, tglx, bp, torvalds, akpm, bhe,
	jpoimboe, linux-kernel

Commit-ID:  ed9f007ee68478f6a50ec9971ade25a0129a5c0e
Gitweb:     http://git.kernel.org/tip/ed9f007ee68478f6a50ec9971ade25a0129a5c0e
Author:     Kees Cook <keescook@chromium.org>
AuthorDate: Wed, 25 May 2016 15:45:33 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sun, 26 Jun 2016 12:32:05 +0200

x86/KASLR: Extend kernel image physical address randomization to addresses larger than 4G

We want the physical address to be randomized anywhere between
16MB and the top of physical memory (up to 64TB).

This patch exchanges the prior slots[] array for the new slot_areas[]
array, and lifts the limitation of KERNEL_IMAGE_SIZE on the physical
address offset for 64-bit. As before, process_e820_entry() walks
memory and populates slot_areas[], splitting on any detected mem_avoid
collisions.

Finally, since the slots[] array and its associated functions are no
longer needed, they are removed.

Based on earlier patches by Baoquan He.
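
The entropy numbers in the Kconfig text below can be sanity-checked
with simple division (my arithmetic, assuming the default 2MB
CONFIG_PHYSICAL_ALIGN):

  2GB addressing  / 2MB align = 1024 slots -> 10 bits (theoretical maximum)
  1GB virtual max / 2MB align =  512 slots ->  9 bits (64-bit virtual)
  512MB max       / 2MB align =  256 slots ->  8 bits (32-bit)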

Originally-from: Baoquan He <bhe@redhat.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: H.J. Lu <hjl.tools@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1464216334-17200-5-git-send-email-keescook@chromium.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/Kconfig                 |  27 +++++----
 arch/x86/boot/compressed/kaslr.c | 115 +++++++++++++++++++++++----------------
 2 files changed, 85 insertions(+), 57 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 0a7b885..770ae52 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1934,21 +1934,26 @@ config RANDOMIZE_BASE
 	  attempts relying on knowledge of the location of kernel
 	  code internals.
 
-	  The kernel physical and virtual address can be randomized
-	  from 16MB up to 1GB on 64-bit and 512MB on 32-bit. (Note that
-	  using RANDOMIZE_BASE reduces the memory space available to
-	  kernel modules from 1.5GB to 1GB.)
+	  On 64-bit, the kernel physical and virtual addresses are
+	  randomized separately. The physical address will be anywhere
+	  between 16MB and the top of physical memory (up to 64TB). The
+	  virtual address will be randomized from 16MB up to 1GB (9 bits
+	  of entropy). Note that this also reduces the memory space
+	  available to kernel modules from 1.5GB to 1GB.
+
+	  On 32-bit, the kernel physical and virtual addresses are
+	  randomized together. They will be randomized from 16MB up to
+	  512MB (8 bits of entropy).
 
 	  Entropy is generated using the RDRAND instruction if it is
 	  supported. If RDTSC is supported, its value is mixed into
 	  the entropy pool as well. If neither RDRAND nor RDTSC are
-	  supported, then entropy is read from the i8254 timer.
-
-	  Since the kernel is built using 2GB addressing, and
-	  PHYSICAL_ALIGN must be at a minimum of 2MB, only 10 bits of
-	  entropy is theoretically possible. Currently, with the
-	  default value for PHYSICAL_ALIGN and due to page table
-	  layouts, 64-bit uses 9 bits of entropy and 32-bit uses 8 bits.
+	  supported, then entropy is read from the i8254 timer. The
+	  usable entropy is limited by the kernel being built using
+	  2GB addressing, and that PHYSICAL_ALIGN must be at a
+	  minimum of 2MB. As a result, only 10 bits of entropy are
+	  theoretically possible, but the implementations are further
+	  limited due to memory layouts.
 
 	  If CONFIG_HIBERNATE is also enabled, KASLR is disabled at boot
 	  time. To enable it, boot with "kaslr" on the kernel command
diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index 5550546..36e2811 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -132,17 +132,6 @@ enum mem_avoid_index {
 
 static struct mem_vector mem_avoid[MEM_AVOID_MAX];
 
-static bool mem_contains(struct mem_vector *region, struct mem_vector *item)
-{
-	/* Item at least partially before region. */
-	if (item->start < region->start)
-		return false;
-	/* Item at least partially after region. */
-	if (item->start + item->size > region->start + region->size)
-		return false;
-	return true;
-}
-
 static bool mem_overlaps(struct mem_vector *one, struct mem_vector *two)
 {
 	/* Item one is entirely before item two. */
@@ -319,8 +308,6 @@ static bool mem_avoid_overlap(struct mem_vector *img,
 	return is_overlapping;
 }
 
-static unsigned long slots[KERNEL_IMAGE_SIZE / CONFIG_PHYSICAL_ALIGN];
-
 struct slot_area {
 	unsigned long addr;
 	int num;
@@ -351,36 +338,44 @@ static void store_slot_info(struct mem_vector *region, unsigned long image_size)
 	}
 }
 
-static void slots_append(unsigned long addr)
-{
-	/* Overflowing the slots list should be impossible. */
-	if (slot_max >= KERNEL_IMAGE_SIZE / CONFIG_PHYSICAL_ALIGN)
-		return;
-
-	slots[slot_max++] = addr;
-}
-
 static unsigned long slots_fetch_random(void)
 {
+	unsigned long slot;
+	int i;
+
 	/* Handle case of no slots stored. */
 	if (slot_max == 0)
 		return 0;
 
-	return slots[get_random_long("Physical") % slot_max];
+	slot = get_random_long("Physical") % slot_max;
+
+	for (i = 0; i < slot_area_index; i++) {
+		if (slot >= slot_areas[i].num) {
+			slot -= slot_areas[i].num;
+			continue;
+		}
+		return slot_areas[i].addr + slot * CONFIG_PHYSICAL_ALIGN;
+	}
+
+	if (i == slot_area_index)
+		debug_putstr("slots_fetch_random() failed!?\n");
+	return 0;
 }
 
 static void process_e820_entry(struct e820entry *entry,
 			       unsigned long minimum,
 			       unsigned long image_size)
 {
-	struct mem_vector region, img, overlap;
+	struct mem_vector region, overlap;
+	struct slot_area slot_area;
+	unsigned long start_orig;
 
 	/* Skip non-RAM entries. */
 	if (entry->type != E820_RAM)
 		return;
 
-	/* Ignore entries entirely above our maximum. */
-	if (entry->addr >= KERNEL_IMAGE_SIZE)
+	/* On 32-bit, ignore entries entirely above our maximum. */
+	if (IS_ENABLED(CONFIG_X86_32) && entry->addr >= KERNEL_IMAGE_SIZE)
 		return;
 
 	/* Ignore entries entirely below our minimum. */
@@ -390,31 +385,55 @@ static void process_e820_entry(struct e820entry *entry,
 	region.start = entry->addr;
 	region.size = entry->size;
 
-	/* Potentially raise address to minimum location. */
-	if (region.start < minimum)
-		region.start = minimum;
+	/* Give up if slot area array is full. */
+	while (slot_area_index < MAX_SLOT_AREA) {
+		start_orig = region.start;
 
-	/* Potentially raise address to meet alignment requirements. */
-	region.start = ALIGN(region.start, CONFIG_PHYSICAL_ALIGN);
+		/* Potentially raise address to minimum location. */
+		if (region.start < minimum)
+			region.start = minimum;
 
-	/* Did we raise the address above the bounds of this e820 region? */
-	if (region.start > entry->addr + entry->size)
-		return;
+		/* Potentially raise address to meet alignment needs. */
+		region.start = ALIGN(region.start, CONFIG_PHYSICAL_ALIGN);
 
-	/* Reduce size by any delta from the original address. */
-	region.size -= region.start - entry->addr;
+		/* Did we raise the address above this e820 region? */
+		if (region.start > entry->addr + entry->size)
+			return;
 
-	/* Reduce maximum size to fit end of image within maximum limit. */
-	if (region.start + region.size > KERNEL_IMAGE_SIZE)
-		region.size = KERNEL_IMAGE_SIZE - region.start;
+		/* Reduce size by any delta from the original address. */
+		region.size -= region.start - start_orig;
 
-	/* Walk each aligned slot and check for avoided areas. */
-	for (img.start = region.start, img.size = image_size ;
-	     mem_contains(&region, &img) ;
-	     img.start += CONFIG_PHYSICAL_ALIGN) {
-		if (mem_avoid_overlap(&img, &overlap))
-			continue;
-		slots_append(img.start);
+		/* On 32-bit, reduce region size to fit within max size. */
+		if (IS_ENABLED(CONFIG_X86_32) &&
+		    region.start + region.size > KERNEL_IMAGE_SIZE)
+			region.size = KERNEL_IMAGE_SIZE - region.start;
+
+		/* Return if region can't contain decompressed kernel */
+		if (region.size < image_size)
+			return;
+
+		/* If nothing overlaps, store the region and return. */
+		if (!mem_avoid_overlap(&region, &overlap)) {
+			store_slot_info(&region, image_size);
+			return;
+		}
+
+		/* Store beginning of region if holds at least image_size. */
+		if (overlap.start > region.start + image_size) {
+			struct mem_vector beginning;
+
+			beginning.start = region.start;
+			beginning.size = overlap.start - region.start;
+			store_slot_info(&beginning, image_size);
+		}
+
+		/* Return if overlap extends to or past end of region. */
+		if (overlap.start + overlap.size >= region.start + region.size)
+			return;
+
+		/* Clip off the overlapping region and start over. */
+		region.size -= overlap.start - region.start + overlap.size;
+		region.start = overlap.start + overlap.size;
 	}
 }
 
@@ -431,6 +450,10 @@ static unsigned long find_random_phys_addr(unsigned long minimum,
 	for (i = 0; i < boot_params->e820_entries; i++) {
 		process_e820_entry(&boot_params->e820_map[i], minimum,
 				   image_size);
+		if (slot_area_index == MAX_SLOT_AREA) {
+			debug_putstr("Aborted e820 scan (slot_areas full)!\n");
+			break;
+		}
 	}
 
 	return slots_fetch_random();

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [tip:x86/boot] x86/KASLR: Allow randomization below the load address
  2016-05-25 22:45 ` [PATCH v9 5/5] x86/KASLR: Allow randomization below load address Kees Cook
  2016-06-17  8:47   ` Ingo Molnar
  2016-06-17 12:23   ` [tip:x86/boot] x86/KASLR: Allow randomization below the " tip-bot for Yinghai Lu
@ 2016-06-26 11:03   ` tip-bot for Yinghai Lu
  2 siblings, 0 replies; 21+ messages in thread
From: tip-bot for Yinghai Lu @ 2016-06-26 11:03 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: keescook, akpm, aryabinin, bhe, hpa, torvalds, tglx, mingo,
	jpoimboe, peterz, dvyukov, hjl.tools, brgerst, bp, yinghai,
	linux-kernel, luto, dvlasenk

Commit-ID:  e066cc47776a89bbdaf4184c0e75f7d389f9ab48
Gitweb:     http://git.kernel.org/tip/e066cc47776a89bbdaf4184c0e75f7d389f9ab48
Author:     Yinghai Lu <yinghai@kernel.org>
AuthorDate: Wed, 25 May 2016 15:45:34 -0700
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Sun, 26 Jun 2016 12:32:05 +0200

x86/KASLR: Allow randomization below the load address

Currently the kernel image physical address randomization's lower
boundary is the original kernel load address.

For bootloaders that load kernels into very high memory (e.g. kexec),
this means randomization takes place in a very small window at the
top of memory, ignoring the large region of physical memory below
the load address.

Since mem_avoid[] is already correctly tracking the regions that must be
avoided, this patch changes the minimum address to whatever is less:
512M (to conservatively avoid unknown things in lower memory) or the
load address. Now, for example, if the kernel is loaded at 8G, [512M,
8G) will be added to the list of possible physical memory positions.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
[ Rewrote the changelog, refactored the code to use min(). ]
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: H.J. Lu <hjl.tools@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1464216334-17200-6-git-send-email-keescook@chromium.org
[ Edited the changelog some more, plus the code comment as well. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 arch/x86/boot/compressed/kaslr.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index 36e2811..749c9e0 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -492,7 +492,7 @@ void choose_random_location(unsigned long input,
 			    unsigned long output_size,
 			    unsigned long *virt_addr)
 {
-	unsigned long random_addr;
+	unsigned long random_addr, min_addr;
 
 	/* By default, keep output position unchanged. */
 	*virt_addr = *output;
@@ -510,8 +510,15 @@ void choose_random_location(unsigned long input,
 	/* Record the various known unsafe memory ranges. */
 	mem_avoid_init(input, input_size, *output);
 
+	/*
+	 * Low end of the randomization range should be the
+	 * smaller of 512M or the initial kernel image
+	 * location:
+	 */
+	min_addr = min(*output, 512UL << 20);
+
 	/* Walk e820 and find a random address. */
-	random_addr = find_random_phys_addr(*output, output_size);
+	random_addr = find_random_phys_addr(min_addr, output_size);
 	if (!random_addr) {
 		warn("KASLR disabled: could not find suitable E820 region!");
 	} else {

^ permalink raw reply related	[flat|nested] 21+ messages in thread
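
For readers skimming the diff, here is a tiny stand-alone sketch of what
the new min_addr computation does. It is illustrative only: the helper
name and example addresses are made up, and a 64-bit build is assumed so
that 8G fits in an unsigned long.

#include <stdio.h>

/* Sketch of the min(*output, 512UL << 20) clamp in the hunk above. */
static unsigned long min_addr_for(unsigned long load_addr)
{
	unsigned long floor = 512UL << 20;	/* 512M */

	return load_addr < floor ? load_addr : floor;
}

int main(void)
{
	/* Loaded high (e.g. by kexec at 8G): randomize from 512M upward. */
	printf("load 8G   -> min_addr %luM\n", min_addr_for(8UL << 30) >> 20);

	/* Loaded low (e.g. at 256M): the old lower bound is kept. */
	printf("load 256M -> min_addr %luM\n", min_addr_for(256UL << 20) >> 20);
	return 0;
}

Running it prints min_addr 512M for the 8G load and 256M for the 256M
load, matching the [512M, 8G) example given in the changelog.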

end of thread, other threads:[~2016-06-26 11:04 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-25 22:45 [PATCH v9 0/5] x86/KASLR: Randomize virtual address separately Kees Cook
2016-05-25 22:45 ` [PATCH v9 1/5] x86/boot: Refuse to build with data relocations Kees Cook
2016-06-17 12:22   ` [tip:x86/boot] " tip-bot for Kees Cook
2016-06-26 11:01   ` tip-bot for Kees Cook
2016-05-25 22:45 ` [PATCH v9 2/5] x86/KASLR: Clarify identity map interface Kees Cook
2016-06-17 12:22   ` [tip:x86/boot] " tip-bot for Kees Cook
2016-06-26 11:02   ` tip-bot for Kees Cook
2016-05-25 22:45 ` [PATCH v9 3/5] x86/KASLR: Randomize virtual address separately Kees Cook
2016-06-17  8:20   ` Ingo Molnar
2016-06-17  8:35     ` Ingo Molnar
2016-06-17 12:22   ` [tip:x86/boot] " tip-bot for Baoquan He
2016-06-26 11:02   ` tip-bot for Baoquan He
2016-05-25 22:45 ` [PATCH v9 4/5] x86/KASLR: Add physical address randomization >4G Kees Cook
2016-06-17 12:23   ` [tip:x86/boot] x86/KASLR: Extend kernel image physical address randomization to addresses larger than 4G tip-bot for Kees Cook
2016-06-26 11:02   ` tip-bot for Kees Cook
2016-05-25 22:45 ` [PATCH v9 5/5] x86/KASLR: Allow randomization below load address Kees Cook
2016-06-17  8:47   ` Ingo Molnar
2016-06-17 15:44     ` Kees Cook
2016-06-17 18:44       ` Yinghai Lu
2016-06-17 12:23   ` [tip:x86/boot] x86/KASLR: Allow randomization below the " tip-bot for Yinghai Lu
2016-06-26 11:03   ` tip-bot for Yinghai Lu
