linux-kernel.vger.kernel.org archive mirror
* [PATCH v3 0/3] Determine kernel image mapping size at runtime for x86_64
@ 2017-01-04  8:37 Baoquan He
  2017-01-04  8:37 ` [PATCH v3 1/3] x86/64: Make kernel text mapping always take one whole page table in early boot code Baoquan He
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Baoquan He @ 2017-01-04  8:37 UTC (permalink / raw)
  To: tglx, hpa, mingo
  Cc: linux-kernel, x86, keescook, yinghai, bp, thgarnie, kuleshovmail,
	luto, mcgrof, anderson, dyoung, Baoquan He

Kernel behaviour is inconsistent between a build with CONFIG_RANDOMIZE_BASE
disabled, where no KASLR code is compiled in, and booting with "nokaslr"
on a build with CONFIG_RANDOMIZE_BASE enabled. As long as
CONFIG_RANDOMIZE_BASE is enabled, the kernel mapping size is extended by
another 512M, from 512M to 1G, even when "nokaslr" is specified
explicitly. This is buggy: CONFIG_RANDOMIZE_BASE should only decide
whether the KASLR code is compiled in. If the user specifies "nokaslr",
the kernel has to behave as if no KASLR code were compiled in at all.

The root cause of the inconsistency is that the size of the kernel image
mapping area is fixed at compile time, and changes from 512M to 1G
whenever CONFIG_RANDOMIZE_BASE is enabled.

So this patchset determines the size of the kernel image mapping area at
runtime instead. Even with the KASLR code compiled in, the kernel mapping
size stays 512M if "nokaslr" is specified.
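The intended runtime behaviour can be sketched as a toy model in plain C.
This is an illustration only, not kernel code: the real check lives in
head64.c and uses cmdline_find_option_bool(); strstr() here is a stand-in.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SZ_512M	(512UL * 1024 * 1024)
#define SZ_1G	(1024UL * 1024 * 1024)

/*
 * Toy model of the patchset's logic: the mapping size is no longer a
 * compile-time constant. It stays 512M unless KASLR is both compiled
 * in and not disabled with "nokaslr" on the kernel command line.
 * (The kernel uses cmdline_find_option_bool(), not strstr().)
 */
static unsigned long kernel_mapping_size(bool randomize_base,
					 const char *cmdline)
{
	if (randomize_base && !strstr(cmdline, "nokaslr"))
		return SZ_1G;
	return SZ_512M;
}
```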

v1->v2:
    Fixed a code bug that made the kbuild test robot throw a build warning.

v2->v3:
    Boris pointed out that the patch log was not good for reviewing and
    understanding, so split the old patch 2/2 into two parts and rewrote
    the patch logs: patch 2/3 introduces the new constant
    KERNEL_MAPPING_SIZE, which differs from the old KERNEL_IMAGE_SIZE, and
    patch 3/3 determines the kernel mapping size at runtime.

Baoquan He (3):
  x86/64: Make kernel text mapping always take one whole page table in
    early boot code
  x86/64: Introduce a new constant KERNEL_MAPPING_SIZE
  x86/64/KASLR: Determine the kernel mapping size at run time

 arch/x86/boot/compressed/kaslr.c        | 20 +++++++++++++++-----
 arch/x86/include/asm/kaslr.h            |  1 +
 arch/x86/include/asm/page_64_types.h    | 20 ++++++++++++--------
 arch/x86/include/asm/pgtable_64_types.h |  2 +-
 arch/x86/kernel/head64.c                | 11 ++++++-----
 arch/x86/kernel/head_64.S               | 16 +++++++++-------
 arch/x86/mm/dump_pagetables.c           |  3 ++-
 arch/x86/mm/init_64.c                   |  2 +-
 arch/x86/mm/physaddr.c                  |  6 +++---
 9 files changed, 50 insertions(+), 31 deletions(-)

-- 
2.5.5

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v3 1/3] x86/64: Make kernel text mapping always take one whole page table in early boot code
  2017-01-04  8:37 [PATCH v3 0/3] Determine kernel image mapping size at runtime for x86_64 Baoquan He
@ 2017-01-04  8:37 ` Baoquan He
  2017-01-04 13:00   ` Boris Petkov
  2017-01-04  8:37 ` [PATCH v3 2/3] x86/64: Introduce a new constant KERNEL_MAPPING_SIZE Baoquan He
  2017-01-04  8:37 ` [PATCH v3 3/3] x86/64/KASLR: Determine the kernel mapping size at run time Baoquan He
  2 siblings, 1 reply; 10+ messages in thread
From: Baoquan He @ 2017-01-04  8:37 UTC (permalink / raw)
  To: tglx, hpa, mingo
  Cc: linux-kernel, x86, keescook, yinghai, bp, thgarnie, kuleshovmail,
	luto, mcgrof, anderson, dyoung, Baoquan He

In early boot code, level2_kernel_pgt is used to map the kernel text. Its
size varies with KERNEL_IMAGE_SIZE and is fixed at compile time. In fact
we can make it always take all 512 entries of one whole page table,
because cleanup_highmap() will clean up the unused entries later. With
this change, the kernel text mapping size can be decided at runtime:
512M if KASLR is disabled, 1G if KASLR is enabled.
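The effect of mapping all 512 PMD entries up front and trimming them later
can be sketched as a userspace toy model, under simplified assumptions:
flat offsets from the start of the kernel mapping instead of real virtual
addresses, and a dummy "present" flag instead of real PMD entries.

```c
#include <assert.h>
#include <stddef.h>

#define PTRS_PER_PMD	512
#define PMD_SIZE	(2UL * 1024 * 1024)	/* one PMD entry maps 2M */

/*
 * Sketch of the idea behind cleanup_highmap(): early boot maps all
 * 512 entries (512 * 2M = 1G); entries outside the range the kernel
 * image actually occupies are zeroed afterwards. Returns how many
 * entries survive the cleanup.
 */
static size_t mapped_after_cleanup(unsigned long text, unsigned long end)
{
	unsigned long pmd[PTRS_PER_PMD];
	size_t i, n = 0;

	for (i = 0; i < PTRS_PER_PMD; i++)
		pmd[i] = 1;			/* pretend all are mapped */

	for (i = 0; i < PTRS_PER_PMD; i++) {
		unsigned long vaddr = i * PMD_SIZE;

		if (vaddr < text || vaddr > end)
			pmd[i] = 0;		/* cleaned up */
		else
			n++;
	}
	return n;
}
```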

Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Kees Cook <keescook@chromium.org>
---
 arch/x86/include/asm/page_64_types.h |  3 ++-
 arch/x86/kernel/head_64.S            | 15 ++++++++-------
 arch/x86/mm/init_64.c                |  2 +-
 3 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index 9215e05..62a20ea 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -56,8 +56,9 @@
  * are fully set up. If kernel ASLR is configured, it can extend the
  * kernel page table mapping, reducing the size of the modules area.
  */
+#define KERNEL_MAPPING_SIZE_EXT	(1024 * 1024 * 1024)
 #if defined(CONFIG_RANDOMIZE_BASE)
-#define KERNEL_IMAGE_SIZE	(1024 * 1024 * 1024)
+#define KERNEL_IMAGE_SIZE	KERNEL_MAPPING_SIZE_EXT
 #else
 #define KERNEL_IMAGE_SIZE	(512 * 1024 * 1024)
 #endif
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index b467b14..03bcb67 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -458,17 +458,18 @@ NEXT_PAGE(level3_kernel_pgt)
 
 NEXT_PAGE(level2_kernel_pgt)
 	/*
-	 * 512 MB kernel mapping. We spend a full page on this pagetable
-	 * anyway.
+	 * Kernel image size is limited to 512 MB. The kernel code+data+bss
+	 * must not be bigger than that.
 	 *
-	 * The kernel code+data+bss must not be bigger than that.
+	 * We spend a full page on this pagetable anyway, so take the whole
+	 * page here so that the kernel mapping size can be decided at runtime,
+	 * 512M if no kaslr, 1G if kaslr enabled. Later cleanup_highmap will
+	 * clean up those unused entries.
 	 *
-	 * (NOTE: at +512MB starts the module area, see MODULES_VADDR.
-	 *  If you want to increase this then increase MODULES_VADDR
-	 *  too.)
+	 * The module area starts after kernel mapping area.
 	 */
 	PMDS(0, __PAGE_KERNEL_LARGE_EXEC,
-		KERNEL_IMAGE_SIZE/PMD_SIZE)
+		PTRS_PER_PMD)
 
 NEXT_PAGE(level2_fixmap_pgt)
 	.fill	506,8,0
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index af85b68..45ef0ff 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -297,7 +297,7 @@ void __init init_extra_mapping_uc(unsigned long phys, unsigned long size)
 void __init cleanup_highmap(void)
 {
 	unsigned long vaddr = __START_KERNEL_map;
-	unsigned long vaddr_end = __START_KERNEL_map + KERNEL_IMAGE_SIZE;
+	unsigned long vaddr_end = __START_KERNEL_map + KERNEL_MAPPING_SIZE_EXT;
 	unsigned long end = roundup((unsigned long)_brk_end, PMD_SIZE) - 1;
 	pmd_t *pmd = level2_kernel_pgt;
 
-- 
2.5.5


* [PATCH v3 2/3] x86/64: Introduce a new constant KERNEL_MAPPING_SIZE
  2017-01-04  8:37 [PATCH v3 0/3] Determine kernel image mapping size at runtime for x86_64 Baoquan He
  2017-01-04  8:37 ` [PATCH v3 1/3] x86/64: Make kernel text mapping always take one whole page table in early boot code Baoquan He
@ 2017-01-04  8:37 ` Baoquan He
  2017-01-04  8:37 ` [PATCH v3 3/3] x86/64/KASLR: Determine the kernel mapping size at run time Baoquan He
  2 siblings, 0 replies; 10+ messages in thread
From: Baoquan He @ 2017-01-04  8:37 UTC (permalink / raw)
  To: tglx, hpa, mingo
  Cc: linux-kernel, x86, keescook, yinghai, bp, thgarnie, kuleshovmail,
	luto, mcgrof, anderson, dyoung, Baoquan He

In x86, KERNEL_IMAGE_SIZE is used to limit the size of the running kernel
image, but it also represents the size of the kernel image mapping area.
This was fine while the kernel virtual address was invariant inside the
512M area and the kernel image was no bigger than 512M.

Along with the addition of KASLR, the kernel mapping area on x86_64 is
extended by another 512M. It has become improper to let KERNEL_IMAGE_SIZE
alone play both roles.

So introduce a new constant, KERNEL_MAPPING_SIZE, to represent the size of
the kernel mapping area on x86_64, and let KERNEL_IMAGE_SIZE mean what its
name says.

This patch just adds KERNEL_MAPPING_SIZE and replaces KERNEL_IMAGE_SIZE
with it in the relevant places. No functional change.

Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Kees Cook <keescook@chromium.org>
---
 arch/x86/boot/compressed/kaslr.c        | 20 +++++++++++++++-----
 arch/x86/include/asm/page_64_types.h    | 17 ++++++++++++-----
 arch/x86/include/asm/pgtable_64_types.h |  2 +-
 arch/x86/kernel/head_64.S               |  3 ++-
 arch/x86/mm/physaddr.c                  |  6 +++---
 5 files changed, 33 insertions(+), 15 deletions(-)

diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index a66854d..6ded03b 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -22,6 +22,12 @@
 static const char build_str[] = UTS_RELEASE " (" LINUX_COMPILE_BY "@"
 		LINUX_COMPILE_HOST ") (" LINUX_COMPILER ") " UTS_VERSION;
 
+/*
+ * By default, the size of kernel text mapping equals KERNEL_IMAGE_SIZE.
+ * While x86_64 may extend it to 1G if KASLR is enabled.
+ */
+static unsigned long _kernel_mapping_size = KERNEL_IMAGE_SIZE;
+
 static unsigned long rotate_xor(unsigned long hash, const void *area,
 				size_t size)
 {
@@ -311,7 +317,7 @@ static void process_e820_entry(struct e820entry *entry,
 		return;
 
 	/* On 32-bit, ignore entries entirely above our maximum. */
-	if (IS_ENABLED(CONFIG_X86_32) && entry->addr >= KERNEL_IMAGE_SIZE)
+	if (IS_ENABLED(CONFIG_X86_32) && entry->addr >= _kernel_mapping_size)
 		return;
 
 	/* Ignore entries entirely below our minimum. */
@@ -341,8 +347,8 @@ static void process_e820_entry(struct e820entry *entry,
 
 		/* On 32-bit, reduce region size to fit within max size. */
 		if (IS_ENABLED(CONFIG_X86_32) &&
-		    region.start + region.size > KERNEL_IMAGE_SIZE)
-			region.size = KERNEL_IMAGE_SIZE - region.start;
+		    region.start + region.size > _kernel_mapping_size)
+			region.size = _kernel_mapping_size - region.start;
 
 		/* Return if region can't contain decompressed kernel */
 		if (region.size < image_size)
@@ -408,9 +414,9 @@ static unsigned long find_random_virt_addr(unsigned long minimum,
 	/*
 	 * There are how many CONFIG_PHYSICAL_ALIGN-sized slots
 	 * that can hold image_size within the range of minimum to
-	 * KERNEL_IMAGE_SIZE?
+	 * _kernel_mapping_size?
 	 */
-	slots = (KERNEL_IMAGE_SIZE - minimum - image_size) /
+	slots = (_kernel_mapping_size - minimum - image_size) /
 		 CONFIG_PHYSICAL_ALIGN + 1;
 
 	random_addr = kaslr_get_random_long("Virtual") % slots;
@@ -438,6 +444,10 @@ void choose_random_location(unsigned long input,
 		return;
 	}
 
+#ifdef CONFIG_X86_64
+	_kernel_mapping_size = KERNEL_MAPPING_SIZE;
+#endif
+
 	boot_params->hdr.loadflags |= KASLR_FLAG;
 
 	/* Prepare to add new identity pagetables on demand. */
diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index 62a20ea..20a5a9b 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -49,18 +49,25 @@
 #define __PHYSICAL_MASK_SHIFT	46
 #define __VIRTUAL_MASK_SHIFT	47
 
+
+/*
+ * Kernel image size is limited to 512 MB. The kernel code+data+bss
+ * must not be bigger than that.
+ */
+#define KERNEL_IMAGE_SIZE	(512 * 1024 * 1024)
+
 /*
- * Kernel image size is limited to 1GiB due to the fixmap living in the
- * next 1GiB (see level2_kernel_pgt in arch/x86/kernel/head_64.S). Use
- * 512MiB by default, leaving 1.5GiB for modules once the page tables
+ * Kernel mapping size is limited to 1GiB due to the fixmap living in
+ * the next 1GiB (see level2_kernel_pgt in arch/x86/kernel/head_64.S).
+ * Use 512MiB by default, leaving 1.5GiB for modules once the page tables
  * are fully set up. If kernel ASLR is configured, it can extend the
  * kernel page table mapping, reducing the size of the modules area.
  */
 #define KERNEL_MAPPING_SIZE_EXT	(1024 * 1024 * 1024)
 #if defined(CONFIG_RANDOMIZE_BASE)
-#define KERNEL_IMAGE_SIZE	KERNEL_MAPPING_SIZE_EXT
+#define KERNEL_MAPPING_SIZE	KERNEL_MAPPING_SIZE_EXT
 #else
-#define KERNEL_IMAGE_SIZE	(512 * 1024 * 1024)
+#define KERNEL_MAPPING_SIZE	KERNEL_IMAGE_SIZE
 #endif
 
 #endif /* _ASM_X86_PAGE_64_DEFS_H */
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 3a26420..a357050 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -66,7 +66,7 @@ typedef struct { pteval_t pte; } pte_t;
 #define VMEMMAP_START	__VMEMMAP_BASE
 #endif /* CONFIG_RANDOMIZE_MEMORY */
 #define VMALLOC_END	(VMALLOC_START + _AC((VMALLOC_SIZE_TB << 40) - 1, UL))
-#define MODULES_VADDR    (__START_KERNEL_map + KERNEL_IMAGE_SIZE)
+#define MODULES_VADDR    (__START_KERNEL_map + KERNEL_MAPPING_SIZE)
 #define MODULES_END      _AC(0xffffffffff000000, UL)
 #define MODULES_LEN   (MODULES_END - MODULES_VADDR)
 #define ESPFIX_PGD_ENTRY _AC(-2, UL)
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 03bcb67..7446055 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -466,7 +466,8 @@ NEXT_PAGE(level2_kernel_pgt)
 	 * 512M if no kaslr, 1G if kaslr enabled. Later cleanup_highmap will
 	 * clean up those unused entries.
 	 *
-	 * The module area starts after kernel mapping area.
+	 * The module area starts after kernel mapping area, see MODULES_VADDR.
+	 * It will vary with KERNEL_MAPPING_SIZE.
 	 */
 	PMDS(0, __PAGE_KERNEL_LARGE_EXEC,
 		PTRS_PER_PMD)
diff --git a/arch/x86/mm/physaddr.c b/arch/x86/mm/physaddr.c
index cfc3b91..c0b70fc 100644
--- a/arch/x86/mm/physaddr.c
+++ b/arch/x86/mm/physaddr.c
@@ -18,7 +18,7 @@ unsigned long __phys_addr(unsigned long x)
 	if (unlikely(x > y)) {
 		x = y + phys_base;
 
-		VIRTUAL_BUG_ON(y >= KERNEL_IMAGE_SIZE);
+		VIRTUAL_BUG_ON(y >= KERNEL_MAPPING_SIZE);
 	} else {
 		x = y + (__START_KERNEL_map - PAGE_OFFSET);
 
@@ -35,7 +35,7 @@ unsigned long __phys_addr_symbol(unsigned long x)
 	unsigned long y = x - __START_KERNEL_map;
 
 	/* only check upper bounds since lower bounds will trigger carry */
-	VIRTUAL_BUG_ON(y >= KERNEL_IMAGE_SIZE);
+	VIRTUAL_BUG_ON(y >= KERNEL_MAPPING_SIZE);
 
 	return y + phys_base;
 }
@@ -50,7 +50,7 @@ bool __virt_addr_valid(unsigned long x)
 	if (unlikely(x > y)) {
 		x = y + phys_base;
 
-		if (y >= KERNEL_IMAGE_SIZE)
+		if (y >= KERNEL_MAPPING_SIZE)
 			return false;
 	} else {
 		x = y + (__START_KERNEL_map - PAGE_OFFSET);
-- 
2.5.5


* [PATCH v3 3/3] x86/64/KASLR: Determine the kernel mapping size at run time
  2017-01-04  8:37 [PATCH v3 0/3] Determine kernel image mapping size at runtime for x86_64 Baoquan He
  2017-01-04  8:37 ` [PATCH v3 1/3] x86/64: Make kernel text mapping always take one whole page table in early boot code Baoquan He
  2017-01-04  8:37 ` [PATCH v3 2/3] x86/64: Introduce a new constant KERNEL_MAPPING_SIZE Baoquan He
@ 2017-01-04  8:37 ` Baoquan He
  2 siblings, 0 replies; 10+ messages in thread
From: Baoquan He @ 2017-01-04  8:37 UTC (permalink / raw)
  To: tglx, hpa, mingo
  Cc: linux-kernel, x86, keescook, yinghai, bp, thgarnie, kuleshovmail,
	luto, mcgrof, anderson, dyoung, Baoquan He

Kernel behaviour is inconsistent between a build with CONFIG_RANDOMIZE_BASE
disabled, where no KASLR code is compiled in, and booting with "nokaslr"
on a build with CONFIG_RANDOMIZE_BASE enabled. As long as
CONFIG_RANDOMIZE_BASE is enabled, the kernel mapping size is extended by
another 512M, from 512M to 1G, even when "nokaslr" is specified
explicitly. This is buggy: CONFIG_RANDOMIZE_BASE should only decide
whether the KASLR code is compiled in. If the user specifies "nokaslr",
the kernel has to behave as if no KASLR code were compiled in at all.

The root cause of the inconsistency is that KERNEL_MAPPING_SIZE, the size
of the kernel image mapping area, is fixed at compile time and changes
from 512M to 1G whenever CONFIG_RANDOMIZE_BASE is enabled.

So this patch determines the size of the kernel image mapping area at
runtime instead. Even with the KASLR code compiled in, the kernel mapping
size stays 512M if "nokaslr" is specified.

Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Kees Cook <keescook@chromium.org>
---
 arch/x86/boot/compressed/kaslr.c     |  2 +-
 arch/x86/include/asm/kaslr.h         |  1 +
 arch/x86/include/asm/page_64_types.h |  6 +-----
 arch/x86/kernel/head64.c             | 11 ++++++-----
 arch/x86/mm/dump_pagetables.c        |  3 ++-
 5 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index 6ded03b..823f294 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -445,7 +445,7 @@ void choose_random_location(unsigned long input,
 	}
 
 #ifdef CONFIG_X86_64
-	_kernel_mapping_size = KERNEL_MAPPING_SIZE;
+	_kernel_mapping_size = KERNEL_MAPPING_SIZE_EXT;
 #endif
 
 	boot_params->hdr.loadflags |= KASLR_FLAG;
diff --git a/arch/x86/include/asm/kaslr.h b/arch/x86/include/asm/kaslr.h
index 1052a79..093935d 100644
--- a/arch/x86/include/asm/kaslr.h
+++ b/arch/x86/include/asm/kaslr.h
@@ -2,6 +2,7 @@
 #define _ASM_KASLR_H_
 
 unsigned long kaslr_get_random_long(const char *purpose);
+extern unsigned long kernel_mapping_size;
 
 #ifdef CONFIG_RANDOMIZE_MEMORY
 extern unsigned long page_offset_base;
diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index 20a5a9b..b8e79d7 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -64,10 +64,6 @@
  * kernel page table mapping, reducing the size of the modules area.
  */
 #define KERNEL_MAPPING_SIZE_EXT	(1024 * 1024 * 1024)
-#if defined(CONFIG_RANDOMIZE_BASE)
-#define KERNEL_MAPPING_SIZE	KERNEL_MAPPING_SIZE_EXT
-#else
-#define KERNEL_MAPPING_SIZE	KERNEL_IMAGE_SIZE
-#endif
+#define KERNEL_MAPPING_SIZE	kernel_mapping_size
 
 #endif /* _ASM_X86_PAGE_64_DEFS_H */
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 54a2372..46d2bd2 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -28,6 +28,7 @@
 #include <asm/bootparam_utils.h>
 #include <asm/microcode.h>
 #include <asm/kasan.h>
+#include <asm/cmdline.h>
 
 /*
  * Manage page tables very early on.
@@ -36,6 +37,7 @@ extern pgd_t early_level4_pgt[PTRS_PER_PGD];
 extern pmd_t early_dynamic_pgts[EARLY_DYNAMIC_PAGE_TABLES][PTRS_PER_PMD];
 static unsigned int __initdata next_early_pgt = 2;
 pmdval_t early_pmd_flags = __PAGE_KERNEL_LARGE & ~(_PAGE_GLOBAL | _PAGE_NX);
+unsigned long kernel_mapping_size = KERNEL_IMAGE_SIZE;
 
 /* Wipe all early page tables except for the kernel symbol map */
 static void __init reset_early_page_tables(void)
@@ -138,12 +140,7 @@ asmlinkage __visible void __init x86_64_start_kernel(char * real_mode_data)
 	 * Build-time sanity checks on the kernel image and module
 	 * area mappings. (these are purely build-time and produce no code)
 	 */
-	BUILD_BUG_ON(MODULES_VADDR < __START_KERNEL_map);
-	BUILD_BUG_ON(MODULES_VADDR - __START_KERNEL_map < KERNEL_IMAGE_SIZE);
-	BUILD_BUG_ON(MODULES_LEN + KERNEL_IMAGE_SIZE > 2*PUD_SIZE);
 	BUILD_BUG_ON((__START_KERNEL_map & ~PMD_MASK) != 0);
-	BUILD_BUG_ON((MODULES_VADDR & ~PMD_MASK) != 0);
-	BUILD_BUG_ON(!(MODULES_VADDR > __START_KERNEL));
 	BUILD_BUG_ON(!(((MODULES_END - 1) & PGDIR_MASK) ==
 				(__START_KERNEL & PGDIR_MASK)));
 	BUILD_BUG_ON(__fix_to_virt(__end_of_fixed_addresses) <= MODULES_END);
@@ -165,6 +162,10 @@ asmlinkage __visible void __init x86_64_start_kernel(char * real_mode_data)
 
 	copy_bootdata(__va(real_mode_data));
 
+	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) &&
+		!cmdline_find_option_bool(boot_command_line, "nokaslr"))
+		kernel_mapping_size = KERNEL_MAPPING_SIZE_EXT;
+
 	/*
 	 * Load microcode early on BSP.
 	 */
diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
index ea9c49a..412c3f5 100644
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -82,7 +82,7 @@ static struct addr_marker address_markers[] = {
 	{ EFI_VA_END,		"EFI Runtime Services" },
 # endif
 	{ __START_KERNEL_map,   "High Kernel Mapping" },
-	{ MODULES_VADDR,        "Modules" },
+	{ 0/*MODULES_VADDR*/,        "Modules" },
 	{ MODULES_END,          "End Modules" },
 #else
 	{ PAGE_OFFSET,          "Kernel Mapping" },
@@ -442,6 +442,7 @@ static int __init pt_dump_init(void)
 	address_markers[LOW_KERNEL_NR].start_address = PAGE_OFFSET;
 	address_markers[VMALLOC_START_NR].start_address = VMALLOC_START;
 	address_markers[VMEMMAP_START_NR].start_address = VMEMMAP_START;
+	address_markers[MODULES_VADDR_NR].start_address = MODULES_VADDR;
 #endif
 #ifdef CONFIG_X86_32
 	address_markers[VMALLOC_START_NR].start_address = VMALLOC_START;
-- 
2.5.5


* Re: [PATCH v3 1/3] x86/64: Make kernel text mapping always take one whole page table in early boot code
  2017-01-04  8:37 ` [PATCH v3 1/3] x86/64: Make kernel text mapping always take one whole page table in early boot code Baoquan He
@ 2017-01-04 13:00   ` Boris Petkov
  2017-01-05  3:28     ` Baoquan He
  0 siblings, 1 reply; 10+ messages in thread
From: Boris Petkov @ 2017-01-04 13:00 UTC (permalink / raw)
  To: Baoquan He, tglx, hpa, mingo
  Cc: linux-kernel, x86, keescook, yinghai, thgarnie, kuleshovmail,
	luto, mcgrof, anderson, dyoung

On January 4, 2017 10:37:31 AM GMT+02:00, Baoquan He <bhe@redhat.com> wrote:
>In early boot code level2_kernel_pgt is used to map kernel text. And
>its
>size varies with KERNEL_IMAGE_SIZE and fixed at compiling time. In fact
>we can make it always take 512 entries of one whole page table, because
>later function cleanup_highmap will clean up the unused entries. With
>the
>help of this change kernel text mapping size can be decided at runtime
>later, 512M if kaslr is disabled, 1G if kaslr is enabled.
Question: so why are we even having that distinction? Why aren't we making text mapping size 1G by default and be done with it?
-- 
Sent from a small device: formatting sux and brevity is inevitable. 


* Re: [PATCH v3 1/3] x86/64: Make kernel text mapping always take one whole page table in early boot code
  2017-01-04 13:00   ` Boris Petkov
@ 2017-01-05  3:28     ` Baoquan He
  2017-01-05 14:01       ` Borislav Petkov
  0 siblings, 1 reply; 10+ messages in thread
From: Baoquan He @ 2017-01-05  3:28 UTC (permalink / raw)
  To: Boris Petkov, keescook
  Cc: tglx, hpa, mingo, linux-kernel, x86, yinghai, thgarnie,
	kuleshovmail, luto, mcgrof, anderson, dyoung

On 01/04/17 at 03:00pm, Boris Petkov wrote:
> On January 4, 2017 10:37:31 AM GMT+02:00, Baoquan He <bhe@redhat.com> wrote:
> >In early boot code level2_kernel_pgt is used to map kernel text. And
> >its
> >size varies with KERNEL_IMAGE_SIZE and fixed at compiling time. In fact
> >we can make it always take 512 entries of one whole page table, because
> >later function cleanup_highmap will clean up the unused entries. With
> >the
> >help of this change kernel text mapping size can be decided at runtime
> >later, 512M if kaslr is disabled, 1G if kaslr is enabled.

> Question: so why are we even having that distinction? Why aren't we making
> text mapping size 1G by default and be done with it?

Yes, good question, thanks!

Possibly people worried that not enough space would be left for the kernel
modules mapping within 1G, but that is just a guess. I am fine with making
the text mapping size 1G by default. Kees must know more about why the 1G
applies only when KASLR is enabled.

Hi Kees,

Could you help check whether there is any risk in making the kernel mapping
size 1G by default, as Boris suggested?

Thanks
Baoquan


* Re: [PATCH v3 1/3] x86/64: Make kernel text mapping always take one whole page table in early boot code
  2017-01-05  3:28     ` Baoquan He
@ 2017-01-05 14:01       ` Borislav Petkov
  2017-01-05 19:35         ` Kees Cook
  0 siblings, 1 reply; 10+ messages in thread
From: Borislav Petkov @ 2017-01-05 14:01 UTC (permalink / raw)
  To: Baoquan He, H. Peter Anvin
  Cc: tglx, mingo, linux-kernel, x86, yinghai, thgarnie, kuleshovmail,
	luto, mcgrof, anderson, dyoung

On Thu, Jan 05, 2017 at 11:28:00AM +0800, Baoquan He wrote:
> Possibly people worried that not enough space would be left for the kernel
> modules mapping within 1G, but that is just a guess. I am fine with making
> the text mapping size 1G by default. Kees must know more about why the 1G
> applies only when KASLR is enabled.

So I'm thinking that practically KASLR will be enabled on the majority of
systems anyway, so we will have a 1G text mapping size on most of them.
The question is, are there any downsides or issues with making that the
default?

hpa, do you see any problems with it?

Thanks.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 


* Re: [PATCH v3 1/3] x86/64: Make kernel text mapping always take one whole page table in early boot code
  2017-01-05 14:01       ` Borislav Petkov
@ 2017-01-05 19:35         ` Kees Cook
  2017-01-05 20:52           ` Borislav Petkov
  0 siblings, 1 reply; 10+ messages in thread
From: Kees Cook @ 2017-01-05 19:35 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Baoquan He, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, LKML,
	x86, Yinghai Lu, Thomas Garnier, Alexander Kuleshov,
	Andy Lutomirski, Luis R. Rodriguez, Dave Anderson, Dave Young

On Thu, Jan 5, 2017 at 6:01 AM, Borislav Petkov <bp@suse.de> wrote:
> On Thu, Jan 05, 2017 at 11:28:00AM +0800, Baoquan He wrote:
>> Possibly people worried that not enough space would be left for the kernel
>> modules mapping within 1G, but that is just a guess. I am fine with making
>> the text mapping size 1G by default. Kees must know more about why the 1G
>> applies only when KASLR is enabled.
>
> So I'm thinking practically kaslr will be enabled on the majority
> of the systems anyway so we will have 1G text mapping size on most.
> The question is, are there any downsides/issues with making that the
> default.
>
> hpa, do you see any problems with it?

The only reason I had it as an option was for kernel module space. It
wasn't clear to me at the time if enough space remained for modules in
all use-cases. It seems like probably there is, so I have no objection
to making the mapping 1G unconditionally.

-Kees

-- 
Kees Cook
Nexus Security


* Re: [PATCH v3 1/3] x86/64: Make kernel text mapping always take one whole page table in early boot code
  2017-01-05 19:35         ` Kees Cook
@ 2017-01-05 20:52           ` Borislav Petkov
  2017-01-06  9:35             ` Baoquan He
  0 siblings, 1 reply; 10+ messages in thread
From: Borislav Petkov @ 2017-01-05 20:52 UTC (permalink / raw)
  To: Kees Cook
  Cc: Baoquan He, H. Peter Anvin, Thomas Gleixner, Ingo Molnar, LKML,
	x86, Yinghai Lu, Thomas Garnier, Alexander Kuleshov,
	Andy Lutomirski, Luis R. Rodriguez, Dave Anderson, Dave Young

On Thu, Jan 05, 2017 at 11:35:57AM -0800, Kees Cook wrote:
> The only reason I had it as an option was for kernel module space. It
> wasn't clear to me at the time if enough space remained for modules in
> all use-cases. It seems like probably there is, so I have no objection
> to making the mapping 1G unconditionally.

Oh, someone will crawl out of the woodwork handwaving that 1G of modules
is not enough. But then that someone would have to choose between KASLR
and >1G modules.

Realistically, on a typical bigger machine, the modules take up
something like <10M:

$ lsmod | awk '{ sum +=$2 } END { print sum }'
7188480

so I'm not really worried if we reduce it by default to 1G. Besides, the
reduction has been there for a while now - since CONFIG_RANDOMIZE_BASE -
so we probably would've heard complaints already...

Btw, we should probably document the reduction in the va map document
too:

---
From: Borislav Petkov <bp@suse.de>
Date: Thu, 5 Jan 2017 21:47:18 +0100
Subject: [PATCH] x86/mm: Document modules space reduction

KASLR reduces module mapping space to 1G, document that.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 Documentation/x86/x86_64/mm.txt | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt
index 5724092db811..a737dfbc198b 100644
--- a/Documentation/x86/x86_64/mm.txt
+++ b/Documentation/x86/x86_64/mm.txt
@@ -39,6 +39,8 @@ memory window (this size is arbitrary, it can be raised later if needed).
 The mappings are not part of any other kernel PGD and are only available
 during EFI runtime calls.
 
+CONFIG_RANDOMIZE_BASE (KASLR) reduces module mapping space from 1.5G to 1G.
+
 Note that if CONFIG_RANDOMIZE_MEMORY is enabled, the direct mapping of all
 physical memory, vmalloc/ioremap space and virtual memory map are randomized.
 Their order is preserved but their base will be offset early at boot time.
-- 
2.11.0

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 


* Re: [PATCH v3 1/3] x86/64: Make kernel text mapping always take one whole page table in early boot code
  2017-01-05 20:52           ` Borislav Petkov
@ 2017-01-06  9:35             ` Baoquan He
  0 siblings, 0 replies; 10+ messages in thread
From: Baoquan He @ 2017-01-06  9:35 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Kees Cook, H. Peter Anvin, Thomas Gleixner, LKML, x86,
	Yinghai Lu, Thomas Garnier, Alexander Kuleshov, Andy Lutomirski,
	Luis R. Rodriguez, Dave Anderson, Dave Young

On 01/05/17 at 09:52pm, Borislav Petkov wrote:
> On Thu, Jan 05, 2017 at 11:35:57AM -0800, Kees Cook wrote:
> > The only reason I had it as an option was for kernel module space. It
> > wasn't clear to me at the time if enough space remained for modules in
> > all use-cases. It seems like probably there is, so I have no objection
> > to making the mapping 1G unconditionally.
> 
> Oh someone will crawl out of the woodwork handwaiving that 1G of modules
> is not enough. But then that someone would have to choose between kaslr
> and >1G modules.
> 
> Realistically, on a typical bigger machine, the modules take up
> something like <10M:
> 
> $ lsmod | awk '{ sum +=$2 } END { print sum }'
> 7188480
> 
> so I'm not really worried if we reduce it by default to 1G. Besides, the
> reduction has been there for a while now - since CONFIG_RANDOMIZE_BASE -
> so we probably would've heard complaints already...

Fair enough, so the worry about kernel module space can be dropped.

Now I am wondering whether a new constant KERNEL_MAPPING_SIZE is still
needed. Below are Ingo's commits changing the value of KERNEL_IMAGE_SIZE.

~~~~~~~~
85eb69a1    x86: increase the kernel text limit to 512 MB
Ingo changed KERNEL_IMAGE_SIZE from 128M to 512M.

88f3aec7    x86: fix spontaneous reboot with allyesconfig bzImage
This changed KERNEL_IMAGE_SIZE from 40M to 128M. At that time it was
called KERNEL_TEXT_SIZE.
~~~~~~~~

All these changes were made to increase the kernel image limit. As I said
in the patch 2/3 log, KERNEL_IMAGE_SIZE plays two roles: one is limiting
the size of the kernel image, the other is representing the size of the
kernel image mapping area. Back when the kernel mapping area was invariant,
increasing the kernel image size meant enlarging the kernel mapping area,
which was fine. But the kernel being mappable into a 1G space doesn't mean
the kernel image is allowed to be 1G; the linker checks the size of the
kernel image in arch/x86/kernel/vmlinux.lds.S. Maybe I need to keep patch
2/3 and make KERNEL_MAPPING_SIZE default to 1G.
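
For reference, that build-time check is a linker-script assertion of
roughly this shape (sketched from memory; the exact wording in
arch/x86/kernel/vmlinux.lds.S may differ by kernel version):

```
. = ASSERT((_end - _text <= KERNEL_IMAGE_SIZE),
	   "kernel image bigger than KERNEL_IMAGE_SIZE");
```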

Any thoughts about this?

> 
> Btw, we should probably document the reduction in the va map document
> too:

Sure, I will update and post with this patch added, thanks.

> 
> ---
> From: Borislav Petkov <bp@suse.de>
> Date: Thu, 5 Jan 2017 21:47:18 +0100
> Subject: [PATCH] x86/mm: Document modules space reduction
> 
> KASLR reduces module mapping space to 1G, document that.
> 
> Signed-off-by: Borislav Petkov <bp@suse.de>
> ---
>  Documentation/x86/x86_64/mm.txt | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt
> index 5724092db811..a737dfbc198b 100644
> --- a/Documentation/x86/x86_64/mm.txt
> +++ b/Documentation/x86/x86_64/mm.txt
> @@ -39,6 +39,8 @@ memory window (this size is arbitrary, it can be raised later if needed).
>  The mappings are not part of any other kernel PGD and are only available
>  during EFI runtime calls.
>  
> +CONFIG_RANDOMIZE_BASE (KASLR) reduces module mapping space from 1.5G to 1G.
> +
>  Note that if CONFIG_RANDOMIZE_MEMORY is enabled, the direct mapping of all
>  physical memory, vmalloc/ioremap space and virtual memory map are randomized.
>  Their order is preserved but their base will be offset early at boot time.
> -- 
> 2.11.0
> 
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
> -- 

