linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/2] Determine kernel text mapping size at runtime for x86_64
@ 2016-12-09 14:41 Baoquan He
  2016-12-09 14:41 ` [PATCH v2 1/2] x86/64: Make kernel text mapping always take one whole page table in early boot code Baoquan He
  2016-12-09 14:41 ` [PATCH v2 2/2] x86/KASLR/64: Determine kernel text mapping size at runtime Baoquan He
  0 siblings, 2 replies; 11+ messages in thread
From: Baoquan He @ 2016-12-09 14:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: tglx, hpa, mingo, x86, keescook, yinghai, bp, thgarnie,
	kuleshovmail, luto, mcgrof, anderson, dyoung, xlpang, Baoquan He

The current kernel sets KERNEL_IMAGE_SIZE as 1G as long as CONFIG_RANDOMIZE_BASE
is enabled, though people specify "nokaslr" into cmdline to disable kaslr
explicitly. This could be a wrong behaviour. CONFIG_RANDOMIZE_BASE should only
decide if KASLR code need be compiled in. If user specify "nokaslr", kernel should
should behave as no KASLR code compiled in at all.

So in this patchset, made changes to determine the size of kernel text mapping
area at runtime. If "nokaslr" specified, kernel mapping size is 512M though
CONFIG_RANDOMIZE_BASE is enabled.


Baoquan He (2):
  x86/64: Make kernel text mapping always take one whole page table in
    early boot code
  x86/KASLR/64: Determine kernel text mapping size at runtime

 arch/x86/boot/compressed/kaslr.c        | 20 +++++++++++++++-----
 arch/x86/include/asm/kaslr.h            |  1 +
 arch/x86/include/asm/page_64_types.h    | 20 ++++++++++++--------
 arch/x86/include/asm/pgtable_64_types.h |  2 +-
 arch/x86/kernel/head64.c                | 11 ++++++-----
 arch/x86/kernel/head_64.S               | 16 +++++++++-------
 arch/x86/mm/dump_pagetables.c           |  3 ++-
 arch/x86/mm/init_64.c                   |  2 +-
 arch/x86/mm/physaddr.c                  |  6 +++---
 9 files changed, 50 insertions(+), 31 deletions(-)

-- 
2.5.5

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v2 1/2] x86/64: Make kernel text mapping always take one whole page table in early boot code
  2016-12-09 14:41 [PATCH v2 0/2] Determine kernel text mapping size at runtime for x86_64 Baoquan He
@ 2016-12-09 14:41 ` Baoquan He
  2016-12-09 14:41 ` [PATCH v2 2/2] x86/KASLR/64: Determine kernel text mapping size at runtime Baoquan He
  1 sibling, 0 replies; 11+ messages in thread
From: Baoquan He @ 2016-12-09 14:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: tglx, hpa, mingo, x86, keescook, yinghai, bp, thgarnie,
	kuleshovmail, luto, mcgrof, anderson, dyoung, xlpang, Baoquan He

In early boot code level2_kernel_pgt is used to map kernel text. And its
size varies according to KERNEL_IMAGE_SIZE and fixed at compiling time.
In fact we can make it always take 512 entries of one whole page table,
because later function cleanup_highmap will clean up the unused entries.
With the help of this change kernel text mapping size can be decided at
runtime later, 512M if kaslr is disabled, 1G if kaslr is enabled.

Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Kees Cook <keescook@chromium.org>
---
v1->v2:
    Changed a typo of patch log Alexnader found.

 arch/x86/include/asm/page_64_types.h |  3 ++-
 arch/x86/kernel/head_64.S            | 15 ++++++++-------
 arch/x86/mm/init_64.c                |  2 +-
 3 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index 9215e05..62a20ea 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -56,8 +56,9 @@
  * are fully set up. If kernel ASLR is configured, it can extend the
  * kernel page table mapping, reducing the size of the modules area.
  */
+#define KERNEL_MAPPING_SIZE_EXT	(1024 * 1024 * 1024)
 #if defined(CONFIG_RANDOMIZE_BASE)
-#define KERNEL_IMAGE_SIZE	(1024 * 1024 * 1024)
+#define KERNEL_IMAGE_SIZE	KERNEL_MAPPING_SIZE_EXT
 #else
 #define KERNEL_IMAGE_SIZE	(512 * 1024 * 1024)
 #endif
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index b4421cc..c4b40e7c9 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -453,17 +453,18 @@ NEXT_PAGE(level3_kernel_pgt)
 
 NEXT_PAGE(level2_kernel_pgt)
 	/*
-	 * 512 MB kernel mapping. We spend a full page on this pagetable
-	 * anyway.
+	 * Kernel image size is limited to 512 MB. The kernel code+data+bss
+	 * must not be bigger than that.
 	 *
-	 * The kernel code+data+bss must not be bigger than that.
+	 * We spend a full page on this pagetable anyway, so take the whole
+	 * page here so that the kernel mapping size can be decided at runtime,
+	 * 512M if no kaslr, 1G if kaslr enabled. Later cleanup_highmap will
+	 * clean up those unused entries.
 	 *
-	 * (NOTE: at +512MB starts the module area, see MODULES_VADDR.
-	 *  If you want to increase this then increase MODULES_VADDR
-	 *  too.)
+	 * The module area starts after kernel mapping area.
 	 */
 	PMDS(0, __PAGE_KERNEL_LARGE_EXEC,
-		KERNEL_IMAGE_SIZE/PMD_SIZE)
+		PTRS_PER_PMD)
 
 NEXT_PAGE(level2_fixmap_pgt)
 	.fill	506,8,0
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 14b9dd7..e95b977 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -307,7 +307,7 @@ void __init init_extra_mapping_uc(unsigned long phys, unsigned long size)
 void __init cleanup_highmap(void)
 {
 	unsigned long vaddr = __START_KERNEL_map;
-	unsigned long vaddr_end = __START_KERNEL_map + KERNEL_IMAGE_SIZE;
+	unsigned long vaddr_end = __START_KERNEL_map + KERNEL_MAPPING_SIZE_EXT;
 	unsigned long end = roundup((unsigned long)_brk_end, PMD_SIZE) - 1;
 	pmd_t *pmd = level2_kernel_pgt;
 
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v2 2/2] x86/KASLR/64: Determine kernel text mapping size at runtime
  2016-12-09 14:41 [PATCH v2 0/2] Determine kernel text mapping size at runtime for x86_64 Baoquan He
  2016-12-09 14:41 ` [PATCH v2 1/2] x86/64: Make kernel text mapping always take one whole page table in early boot code Baoquan He
@ 2016-12-09 14:41 ` Baoquan He
  2016-12-10 10:31   ` Borislav Petkov
  1 sibling, 1 reply; 11+ messages in thread
From: Baoquan He @ 2016-12-09 14:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: tglx, hpa, mingo, x86, keescook, yinghai, bp, thgarnie,
	kuleshovmail, luto, mcgrof, anderson, dyoung, xlpang, Baoquan He

X86 64 kernel takes KERNEL_IMAGE_SIZE as the kernel text mapping size,
and it's fixed as compiling time, changing from 512M to 1G as long as
CONFIG_RANDOMIZE_BASE is enabled, though people specify kernel option
"nokaslr" explicitly.

This could be a wrong behaviour. CONFIG_RANDOMIZE_BASE should only decide
if the KASLR code need be compiled in. If user specify "nokaslr", the
kernel should behave as no KASLR code compiled in at all.

So in this patch, define a new MACRO KERNEL_MAPPING_SIZE to represent the
size of kernel text mapping area, and let KERNEL_IMAGE_SIZE limit the size
of kernel runtime space. And change to determine the size of kernel text
mapping area at runtime. Though KASLR code compiled in, if "nokaslr"
specified, still set kernel mapping size to be 512M.

Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Kees Cook <keescook@chromium.org>
---
v1->v2:
  - Move declaration of kernel_mapping_size out of #ifdef CONFIG_RANDOMIZE_MEMORY
    scope of <asm/kaslr.h> to remove compiling error if CONFIG_RANDOMIZE_MEMORY is
    disabled.

  - Change to define _kernel_mapping_size as a static variable inside
    boot/compressed/kaslr.c file.

 arch/x86/boot/compressed/kaslr.c        | 20 +++++++++++++++-----
 arch/x86/include/asm/kaslr.h            |  1 +
 arch/x86/include/asm/page_64_types.h    | 19 +++++++++++--------
 arch/x86/include/asm/pgtable_64_types.h |  2 +-
 arch/x86/kernel/head64.c                | 11 ++++++-----
 arch/x86/kernel/head_64.S               |  3 ++-
 arch/x86/mm/dump_pagetables.c           |  3 ++-
 arch/x86/mm/physaddr.c                  |  6 +++---
 8 files changed, 41 insertions(+), 24 deletions(-)

diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c
index a66854d..823f294 100644
--- a/arch/x86/boot/compressed/kaslr.c
+++ b/arch/x86/boot/compressed/kaslr.c
@@ -22,6 +22,12 @@
 static const char build_str[] = UTS_RELEASE " (" LINUX_COMPILE_BY "@"
 		LINUX_COMPILE_HOST ") (" LINUX_COMPILER ") " UTS_VERSION;
 
+/*
+ * By default, the size of kernel text mapping equals KERNEL_IMAGE_SIZE.
+ * While x86_64 may extend it to 1G if KASLR is enabled.
+ */
+static unsigned long _kernel_mapping_size = KERNEL_IMAGE_SIZE;
+
 static unsigned long rotate_xor(unsigned long hash, const void *area,
 				size_t size)
 {
@@ -311,7 +317,7 @@ static void process_e820_entry(struct e820entry *entry,
 		return;
 
 	/* On 32-bit, ignore entries entirely above our maximum. */
-	if (IS_ENABLED(CONFIG_X86_32) && entry->addr >= KERNEL_IMAGE_SIZE)
+	if (IS_ENABLED(CONFIG_X86_32) && entry->addr >= _kernel_mapping_size)
 		return;
 
 	/* Ignore entries entirely below our minimum. */
@@ -341,8 +347,8 @@ static void process_e820_entry(struct e820entry *entry,
 
 		/* On 32-bit, reduce region size to fit within max size. */
 		if (IS_ENABLED(CONFIG_X86_32) &&
-		    region.start + region.size > KERNEL_IMAGE_SIZE)
-			region.size = KERNEL_IMAGE_SIZE - region.start;
+		    region.start + region.size > _kernel_mapping_size)
+			region.size = _kernel_mapping_size - region.start;
 
 		/* Return if region can't contain decompressed kernel */
 		if (region.size < image_size)
@@ -408,9 +414,9 @@ static unsigned long find_random_virt_addr(unsigned long minimum,
 	/*
 	 * There are how many CONFIG_PHYSICAL_ALIGN-sized slots
 	 * that can hold image_size within the range of minimum to
-	 * KERNEL_IMAGE_SIZE?
+	 * _kernel_mapping_size?
 	 */
-	slots = (KERNEL_IMAGE_SIZE - minimum - image_size) /
+	slots = (_kernel_mapping_size - minimum - image_size) /
 		 CONFIG_PHYSICAL_ALIGN + 1;
 
 	random_addr = kaslr_get_random_long("Virtual") % slots;
@@ -438,6 +444,10 @@ void choose_random_location(unsigned long input,
 		return;
 	}
 
+#ifdef CONFIG_X86_64
+	_kernel_mapping_size = KERNEL_MAPPING_SIZE_EXT;
+#endif
+
 	boot_params->hdr.loadflags |= KASLR_FLAG;
 
 	/* Prepare to add new identity pagetables on demand. */
diff --git a/arch/x86/include/asm/kaslr.h b/arch/x86/include/asm/kaslr.h
index 1052a79..093935d 100644
--- a/arch/x86/include/asm/kaslr.h
+++ b/arch/x86/include/asm/kaslr.h
@@ -2,6 +2,7 @@
 #define _ASM_KASLR_H_
 
 unsigned long kaslr_get_random_long(const char *purpose);
+extern unsigned long kernel_mapping_size;
 
 #ifdef CONFIG_RANDOMIZE_MEMORY
 extern unsigned long page_offset_base;
diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index 62a20ea..b8e79d7 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -49,18 +49,21 @@
 #define __PHYSICAL_MASK_SHIFT	46
 #define __VIRTUAL_MASK_SHIFT	47
 
+
+/*
+ * Kernel image size is limited to 512 MB. The kernel code+data+bss
+ * must not be bigger than that.
+ */
+#define KERNEL_IMAGE_SIZE	(512 * 1024 * 1024)
+
 /*
- * Kernel image size is limited to 1GiB due to the fixmap living in the
- * next 1GiB (see level2_kernel_pgt in arch/x86/kernel/head_64.S). Use
- * 512MiB by default, leaving 1.5GiB for modules once the page tables
+ * Kernel mapping size is limited to 1GiB due to the fixmap living in
+ * the next 1GiB (see level2_kernel_pgt in arch/x86/kernel/head_64.S).
+ * Use 512MiB by default, leaving 1.5GiB for modules once the page tables
  * are fully set up. If kernel ASLR is configured, it can extend the
  * kernel page table mapping, reducing the size of the modules area.
  */
 #define KERNEL_MAPPING_SIZE_EXT	(1024 * 1024 * 1024)
-#if defined(CONFIG_RANDOMIZE_BASE)
-#define KERNEL_IMAGE_SIZE	KERNEL_MAPPING_SIZE_EXT
-#else
-#define KERNEL_IMAGE_SIZE	(512 * 1024 * 1024)
-#endif
+#define KERNEL_MAPPING_SIZE	kernel_mapping_size
 
 #endif /* _ASM_X86_PAGE_64_DEFS_H */
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 3a26420..a357050 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -66,7 +66,7 @@ typedef struct { pteval_t pte; } pte_t;
 #define VMEMMAP_START	__VMEMMAP_BASE
 #endif /* CONFIG_RANDOMIZE_MEMORY */
 #define VMALLOC_END	(VMALLOC_START + _AC((VMALLOC_SIZE_TB << 40) - 1, UL))
-#define MODULES_VADDR    (__START_KERNEL_map + KERNEL_IMAGE_SIZE)
+#define MODULES_VADDR    (__START_KERNEL_map + KERNEL_MAPPING_SIZE)
 #define MODULES_END      _AC(0xffffffffff000000, UL)
 #define MODULES_LEN   (MODULES_END - MODULES_VADDR)
 #define ESPFIX_PGD_ENTRY _AC(-2, UL)
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 54a2372..46d2bd2 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -28,6 +28,7 @@
 #include <asm/bootparam_utils.h>
 #include <asm/microcode.h>
 #include <asm/kasan.h>
+#include <asm/cmdline.h>
 
 /*
  * Manage page tables very early on.
@@ -36,6 +37,7 @@ extern pgd_t early_level4_pgt[PTRS_PER_PGD];
 extern pmd_t early_dynamic_pgts[EARLY_DYNAMIC_PAGE_TABLES][PTRS_PER_PMD];
 static unsigned int __initdata next_early_pgt = 2;
 pmdval_t early_pmd_flags = __PAGE_KERNEL_LARGE & ~(_PAGE_GLOBAL | _PAGE_NX);
+unsigned long kernel_mapping_size = KERNEL_IMAGE_SIZE;
 
 /* Wipe all early page tables except for the kernel symbol map */
 static void __init reset_early_page_tables(void)
@@ -138,12 +140,7 @@ asmlinkage __visible void __init x86_64_start_kernel(char * real_mode_data)
 	 * Build-time sanity checks on the kernel image and module
 	 * area mappings. (these are purely build-time and produce no code)
 	 */
-	BUILD_BUG_ON(MODULES_VADDR < __START_KERNEL_map);
-	BUILD_BUG_ON(MODULES_VADDR - __START_KERNEL_map < KERNEL_IMAGE_SIZE);
-	BUILD_BUG_ON(MODULES_LEN + KERNEL_IMAGE_SIZE > 2*PUD_SIZE);
 	BUILD_BUG_ON((__START_KERNEL_map & ~PMD_MASK) != 0);
-	BUILD_BUG_ON((MODULES_VADDR & ~PMD_MASK) != 0);
-	BUILD_BUG_ON(!(MODULES_VADDR > __START_KERNEL));
 	BUILD_BUG_ON(!(((MODULES_END - 1) & PGDIR_MASK) ==
 				(__START_KERNEL & PGDIR_MASK)));
 	BUILD_BUG_ON(__fix_to_virt(__end_of_fixed_addresses) <= MODULES_END);
@@ -165,6 +162,10 @@ asmlinkage __visible void __init x86_64_start_kernel(char * real_mode_data)
 
 	copy_bootdata(__va(real_mode_data));
 
+	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) &&
+		!cmdline_find_option_bool(boot_command_line, "nokaslr"))
+		kernel_mapping_size = KERNEL_MAPPING_SIZE_EXT;
+
 	/*
 	 * Load microcode early on BSP.
 	 */
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index c4b40e7c9..8bbb29e 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -461,7 +461,8 @@ NEXT_PAGE(level2_kernel_pgt)
 	 * 512M if no kaslr, 1G if kaslr enabled. Later cleanup_highmap will
 	 * clean up those unused entries.
 	 *
-	 * The module area starts after kernel mapping area.
+	 * The module area starts after kernel mapping area, see MODULES_VADDR.
+	 * It will vary with KERNEL_MAPPING_SIZE.
 	 */
 	PMDS(0, __PAGE_KERNEL_LARGE_EXEC,
 		PTRS_PER_PMD)
diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
index ea9c49a..412c3f5 100644
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -82,7 +82,7 @@ static struct addr_marker address_markers[] = {
 	{ EFI_VA_END,		"EFI Runtime Services" },
 # endif
 	{ __START_KERNEL_map,   "High Kernel Mapping" },
-	{ MODULES_VADDR,        "Modules" },
+	{ 0/*MODULES_VADDR*/,        "Modules" },
 	{ MODULES_END,          "End Modules" },
 #else
 	{ PAGE_OFFSET,          "Kernel Mapping" },
@@ -442,6 +442,7 @@ static int __init pt_dump_init(void)
 	address_markers[LOW_KERNEL_NR].start_address = PAGE_OFFSET;
 	address_markers[VMALLOC_START_NR].start_address = VMALLOC_START;
 	address_markers[VMEMMAP_START_NR].start_address = VMEMMAP_START;
+	address_markers[MODULES_VADDR_NR].start_address = MODULES_VADDR;
 #endif
 #ifdef CONFIG_X86_32
 	address_markers[VMALLOC_START_NR].start_address = VMALLOC_START;
diff --git a/arch/x86/mm/physaddr.c b/arch/x86/mm/physaddr.c
index cfc3b91..c0b70fc 100644
--- a/arch/x86/mm/physaddr.c
+++ b/arch/x86/mm/physaddr.c
@@ -18,7 +18,7 @@ unsigned long __phys_addr(unsigned long x)
 	if (unlikely(x > y)) {
 		x = y + phys_base;
 
-		VIRTUAL_BUG_ON(y >= KERNEL_IMAGE_SIZE);
+		VIRTUAL_BUG_ON(y >= KERNEL_MAPPING_SIZE);
 	} else {
 		x = y + (__START_KERNEL_map - PAGE_OFFSET);
 
@@ -35,7 +35,7 @@ unsigned long __phys_addr_symbol(unsigned long x)
 	unsigned long y = x - __START_KERNEL_map;
 
 	/* only check upper bounds since lower bounds will trigger carry */
-	VIRTUAL_BUG_ON(y >= KERNEL_IMAGE_SIZE);
+	VIRTUAL_BUG_ON(y >= KERNEL_MAPPING_SIZE);
 
 	return y + phys_base;
 }
@@ -50,7 +50,7 @@ bool __virt_addr_valid(unsigned long x)
 	if (unlikely(x > y)) {
 		x = y + phys_base;
 
-		if (y >= KERNEL_IMAGE_SIZE)
+		if (y >= KERNEL_MAPPING_SIZE)
 			return false;
 	} else {
 		x = y + (__START_KERNEL_map - PAGE_OFFSET);
-- 
2.5.5

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 2/2] x86/KASLR/64: Determine kernel text mapping size at runtime
  2016-12-09 14:41 ` [PATCH v2 2/2] x86/KASLR/64: Determine kernel text mapping size at runtime Baoquan He
@ 2016-12-10 10:31   ` Borislav Petkov
  2016-12-10 12:27     ` Baoquan He
  0 siblings, 1 reply; 11+ messages in thread
From: Borislav Petkov @ 2016-12-10 10:31 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, tglx, hpa, mingo, x86, keescook, yinghai, thgarnie,
	kuleshovmail, luto, mcgrof, anderson, dyoung, xlpang

On Fri, Dec 09, 2016 at 10:41:58PM +0800, Baoquan He wrote:
> X86 64 kernel takes KERNEL_IMAGE_SIZE as the kernel text mapping size,
> and it's fixed as compiling time, changing from 512M to 1G as long as
> CONFIG_RANDOMIZE_BASE is enabled, though people specify kernel option
> "nokaslr" explicitly.
> 
> This could be a wrong behaviour.

A bunch of changes just because "this could be a wrong behavior". I'm
not really persuaded.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 2/2] x86/KASLR/64: Determine kernel text mapping size at runtime
  2016-12-10 10:31   ` Borislav Petkov
@ 2016-12-10 12:27     ` Baoquan He
  2016-12-10 12:33       ` Borislav Petkov
  0 siblings, 1 reply; 11+ messages in thread
From: Baoquan He @ 2016-12-10 12:27 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, tglx, hpa, mingo, x86, keescook, yinghai, thgarnie,
	kuleshovmail, luto, mcgrof, anderson, dyoung, xlpang

On 12/10/16 at 11:31am, Borislav Petkov wrote:
> On Fri, Dec 09, 2016 at 10:41:58PM +0800, Baoquan He wrote:
> > X86 64 kernel takes KERNEL_IMAGE_SIZE as the kernel text mapping size,
> > and it's fixed as compiling time, changing from 512M to 1G as long as
> > CONFIG_RANDOMIZE_BASE is enabled, though people specify kernel option
> > "nokaslr" explicitly.
> > 
> > This could be a wrong behaviour.
> 
> A bunch of changes just because "this could be a wrong behavior". I'm
> not really persuaded.


Well, then apologize for this wrong expression. It should be "This is
a wrong behaviour."

Whether CONFIG_RANDOMIZE_BASE is yes or not, with 'nokaslr' specified,
Kernel text mapping size should be 512M, just the same as no kaslr code
compiled in.

Thanks
Baoquan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 2/2] x86/KASLR/64: Determine kernel text mapping size at runtime
  2016-12-10 12:27     ` Baoquan He
@ 2016-12-10 12:33       ` Borislav Petkov
  2016-12-10 13:41         ` Baoquan He
  0 siblings, 1 reply; 11+ messages in thread
From: Borislav Petkov @ 2016-12-10 12:33 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, tglx, hpa, mingo, x86, keescook, yinghai, thgarnie,
	kuleshovmail, luto, mcgrof, anderson, dyoung, xlpang

On Sat, Dec 10, 2016 at 08:27:57PM +0800, Baoquan He wrote:
> Whether CONFIG_RANDOMIZE_BASE is yes or not, with 'nokaslr' specified,
> Kernel text mapping size should be 512M, just the same as no kaslr code
> compiled in.

"should be" still doesn't really explain what the problem is. What's
wrong with it remaining 1G?

IOW, something like "The problem is X and the issues it causes are Y.
That's why we need to do Z."

Now please replace X,Y and Z with the real content.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 2/2] x86/KASLR/64: Determine kernel text mapping size at runtime
  2016-12-10 12:33       ` Borislav Petkov
@ 2016-12-10 13:41         ` Baoquan He
  2016-12-10 16:28           ` Borislav Petkov
  0 siblings, 1 reply; 11+ messages in thread
From: Baoquan He @ 2016-12-10 13:41 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, tglx, hpa, mingo, x86, keescook, yinghai, thgarnie,
	kuleshovmail, luto, mcgrof, anderson, dyoung, xlpang

On 12/10/16 at 01:33pm, Borislav Petkov wrote:
> On Sat, Dec 10, 2016 at 08:27:57PM +0800, Baoquan He wrote:
> > Whether CONFIG_RANDOMIZE_BASE is yes or not, with 'nokaslr' specified,
> > Kernel text mapping size should be 512M, just the same as no kaslr code
> > compiled in.
> 
> "should be" still doesn't really explain what the problem is. What's
> wrong with it remaining 1G?
> 
> IOW, something like "The problem is X and the issues it causes are Y.
> That's why we need to do Z."
> 
> Now please replace X,Y and Z with the real content.

I made this patchset because of two things:

1) Fedora 25 defaults to enable CONFIG_RANDOMIZE_BASE. And this worries
maintainers of several Fedora component. People ever asked me how to
judge whether it's a kaslr kernel. I told them I usually read elf header
of kcore - "readelf -l /proc/kcore" to check it. If the 'VirtAddr' of
segments like kernel text, modules, direct mapping is changed, it should
be kaslr kernel. Then they said why I have specified 'nokaslr', the
virtual address of modules is not '0xffffffffa0000000', but
'0xffffffffc0000000'. OK, I realized this is not right, it need be
fixed.

2) The second thing I remember Dave said he judged the kaslr kernel by
the value of KERNEL_IMAGE_SIZE. Then I decide this is a wrong behaviour
and I should change it. But in v1 post, Dave denied this. Checking Crash
code, what he has done is like below:

if ((kt->flags2 & KASLR) && (THIS_KERNEL_VERSION >= LINUX(4,7,0)))                                                                               
        machdep->machspec->modules_vaddr = __START_KERNEL_map + 
                (machdep->machspec->kernel_image_size ?
                machdep->machspec->kernel_image_size : GIGABYTES(1));

You can see that if a kaslr kernel, Dave will get modules_vaddr by
KERNEL_IAMGE_SIZE or 1G directly if that value is not exported. However
KERNEL_IMAGE_SIZE is always 1G as long as CONFIG_RANDOMIZE_BASE is set
to 'y', whether kaslr is enabled or not. As you said, in this case,
remaining 1G doesn't impact things.

So in v2 I didn't mention problem about Crash. But case 1) need be
cared, whether kaslr code is compiled or not, it should not confuse
people, should not make difference between kaslr code not compiled in
and kaslr code compiled in with 'nokaslr' specified. Now the thing is
I am wondering if confusing people is a problem. Except of this I
haven't get report that it impacts things and caused problem.

Usually in redhat we have a convention that when we fix a bug, we
should write patch log like this:
what: what problem you have met.
why:  why does it happen, what is the root cause you got from analysis.
how   How do you fix it in this patch.

I personally think the 'what' is 'Y' you mentioned, and 'why' is 'X'.
Whatever it is, it's good if people can easily understand what you say.
In this patch log, since the problem is so obvious, I mean the
confusing difference, when I descirbed the proble, the root cause has
been told too. 

So I would like to adjust the log as you suggested, does it please you?
____________________________________________________________________
X86 64 kernel takes KERNEL_IMAGE_SIZE as the kernel text mapping size,
and it's fixed as compiling time, changing from 512M to 1G as long as
CONFIG_RANDOMIZE_BASE is enabled, though people specify kernel option
"nokaslr" explicitly. This is really confusing people when check if it's
a kaslr kernel, E.g checking the outout of 'readelf -l /proc/kcore'.

This is obviously a wrong behaviour. CONFIG_RANDOMIZE_BASE should only
decide if the KASLR code need be compiled in. If user specify "nokaslr",
the kernel should behave as no KASLR code compiled in at all.

So in this patch, define a new MACRO KERNEL_MAPPING_SIZE to represent
the size of kernel text mapping area, and let KERNEL_IMAGE_SIZE limit
the size of kernel runtime space. And change to determine the size of
kernel text mapping area at runtime. Though KASLR code compiled in, if
"nokaslr" specified, still set kernel mapping size to be 512M.

Thanks
Baoquan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 2/2] x86/KASLR/64: Determine kernel text mapping size at runtime
  2016-12-10 13:41         ` Baoquan He
@ 2016-12-10 16:28           ` Borislav Petkov
  2016-12-11 10:58             ` Baoquan He
  0 siblings, 1 reply; 11+ messages in thread
From: Borislav Petkov @ 2016-12-10 16:28 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, tglx, hpa, mingo, x86, keescook, yinghai, thgarnie,
	kuleshovmail, luto, mcgrof, anderson, dyoung, xlpang

On Sat, Dec 10, 2016 at 09:41:56PM +0800, Baoquan He wrote:
> 1) Fedora 25 defaults to enable CONFIG_RANDOMIZE_BASE. And this worries
> maintainers of several Fedora component. People ever asked me how to
> judge whether it's a kaslr kernel. I told them I usually read elf header
> of kcore - "readelf -l /proc/kcore" to check it. If the 'VirtAddr' of
> segments like kernel text, modules, direct mapping is changed, it should
> be kaslr kernel. Then they said why I have specified 'nokaslr', the
> virtual address of modules is not '0xffffffffa0000000', but
> '0xffffffffc0000000'. OK, I realized this is not right, it need be
> fixed.

So people want to know whether the kernel they're running has KASLR
enabled or not.

Clearly they can grep their config. And then check whether "nokaslr" has
been added to the kernel command line or not. Done.

> So in v2 I didn't mention problem about Crash. But case 1) need be
> cared, whether kaslr code is compiled or not, it should not confuse
> people, should not make difference between kaslr code not compiled in
> and kaslr code compiled in with 'nokaslr' specified.

That's exactly the point - people should *not* care whether it is a
kernel with KASLR enabled or not - stuff should just work. So what
you're trying to "fix" here is an exercise of pointlessness, IMO. Unless
you give me a real, valid reason why people need a *defined* interface
to ask whether the kernel has KASLR enabled or not.

And even then, looking at KERNEL_IMAGE_SIZE is still the wrong way to do
it.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 2/2] x86/KASLR/64: Determine kernel text mapping size at runtime
  2016-12-10 16:28           ` Borislav Petkov
@ 2016-12-11 10:58             ` Baoquan He
  2016-12-11 12:06               ` Borislav Petkov
  0 siblings, 1 reply; 11+ messages in thread
From: Baoquan He @ 2016-12-11 10:58 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, tglx, hpa, mingo, x86, keescook, yinghai, thgarnie,
	kuleshovmail, luto, mcgrof, anderson, dyoung, xlpang

On 12/10/16 at 05:28pm, Borislav Petkov wrote:
> On Sat, Dec 10, 2016 at 09:41:56PM +0800, Baoquan He wrote:
> > 1) Fedora 25 defaults to enable CONFIG_RANDOMIZE_BASE. And this worries
> > maintainers of several Fedora component. People ever asked me how to
> > judge whether it's a kaslr kernel. I told them I usually read elf header
> > of kcore - "readelf -l /proc/kcore" to check it. If the 'VirtAddr' of
> > segments like kernel text, modules, direct mapping is changed, it should
> > be kaslr kernel. Then they said why I have specified 'nokaslr', the
> > virtual address of modules is not '0xffffffffa0000000', but
> > '0xffffffffc0000000'. OK, I realized this is not right, it need be
> > fixed.
> 
> So people want to know whether the kernel they're running has KASLR
> enabled or not.
> 
> Clearly they can grep their config. And then check whether "nokaslr" has
> been added to the kernel command line or not. Done.

Surely they can grep their config and then check whether "nokaslr" has
been added to kernel cmdline. But could you tell where I can find
document or descriptions defining if I want to know a feature is
enabled or not, I have to grep their config and then check kernel
cmdline? Linux kernel is a open source and free software, as long as
people get and use it, they can do whatever they want to do. If they
want to know somethings, they can look anywhere. They can look at config
or not, they can look at this or that, they can look alone or together
with people, habited or naked. In my humble opinion, people can check
whatever they like, while kernel developers need to make sure wherever
and whenever they check, things should be correct, stable, reasonable,
consistent. If there's corner case or exceptions hard to cover, at least
leave a note to explain it.

So now, what you are telling is if someone saw something not reasonable,
blame on him, because he is looking at something that supposed not to be
seen. I am not persuaded.

> 
> > So in v2 I didn't mention problem about Crash. But case 1) need be
> > cared, whether kaslr code is compiled or not, it should not confuse
> > people, should not make difference between kaslr code not compiled in
> > and kaslr code compiled in with 'nokaslr' specified.
> 
> That's exactly the point - people should *not* care whether it is a
> kernel with KASLR enabled or not - stuff should just work. So what
> you're trying to "fix" here is an exercise of pointlessness, IMO. Unless
> you give me a real, valid reason why people need a *defined* interface
> to ask whether the kernel has KASLR enabled or not.

Well, assume a kernel driver developer wants to check if his code still
works well under Fedora 25 which defaults to enable
CONFIG_RANDOMIZE_BASE, for example. Firstly he surely want to know if
his code still works with kaslr feature disabled which is the same as in
the old version of distro kaslr feature is not compiled in or not
supported. Now two choices are put in front of him:
1) Set CONFIG_RANDOMIZE_BASE to 'n' and rebuild kernel code, it will
take him 30 minutes or about 1 hour.
2) Add "nokaslr" to kernel cmdline and rebot, it will take him 30
seconds or about 1 minute.

If were you, which one do you choose?

Then he added his testing code to print the virtual address his modules are
loaded at. Now he saw his driver modules were not loaded at the range he
always saw in the previous kernel, but in a position with 512M added. In
this case, what do you suggest him to do? Still stuff should just work, leave
it alone? If he really does as you suggested, do you think if he is a
responsible worker? If you are his boss, are you happy to see what he is
doing? 

So it's so obvious that "nokaslr" should disable kaslr feature completely,
should be consistent with the old kernel which doesn't support kaslr at
all. However now it has inconsistent behaviour, though no kernel crash is
reported or userspace utility collapse collected, if saw a plate filled
with food sliding to the edge of table, half hovering, we have to wait
until it fall to the floor, then busy on fixing plate, scrubbing off
the dirty, finding people to blame?

For arguing and defending myself, I couldn't be very objective. So if
you really think remaining 1G is OK though nokaslr is specified, I am
fine with it. We can wait until a report is coming.
> 
> And even then, looking at KERNEL_IMAGE_SIZE is still the wrong way to do
> it.

Again, any information, as long as it's exposed out of kernel, can be
checked. KERNEL_IMAGE_SIZE should be used to limit the run space of
kernel, if someone is doubted using it in wrong way, can we check
ourselves if we define and expose it right.

Thanks
Baoquan

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 2/2] x86/KASLR/64: Determine kernel text mapping size at runtime
  2016-12-11 10:58             ` Baoquan He
@ 2016-12-11 12:06               ` Borislav Petkov
  2016-12-12  2:32                 ` Baoquan He
  0 siblings, 1 reply; 11+ messages in thread
From: Borislav Petkov @ 2016-12-11 12:06 UTC (permalink / raw)
  To: Baoquan He
  Cc: linux-kernel, tglx, hpa, mingo, x86, keescook, yinghai, thgarnie,
	kuleshovmail, luto, mcgrof, anderson, dyoung, xlpang

On Sun, Dec 11, 2016 at 06:58:29PM +0800, Baoquan He wrote:
> For arguing and defending myself, I couldn't be very objective.

Yeah, it is mind-boggling the amount of bullshit you would come up with
instead of simply saying, "no, I don't have a good reason and use case
for my patch". It made me laugh, FWIW. Especially the bit about people
getting naked - I had to go check we're still talking about the same
thing.

:-)

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
-- 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 2/2] x86/KASLR/64: Determine kernel text mapping size at runtime
  2016-12-11 12:06               ` Borislav Petkov
@ 2016-12-12  2:32                 ` Baoquan He
  0 siblings, 0 replies; 11+ messages in thread
From: Baoquan He @ 2016-12-12  2:32 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-kernel, tglx, hpa, mingo, x86, keescook, yinghai, thgarnie,
	kuleshovmail, luto, mcgrof, anderson, dyoung, xlpang

On 12/11/16 at 01:06pm, Borislav Petkov wrote:
> On Sun, Dec 11, 2016 at 06:58:29PM +0800, Baoquan He wrote:
> > For arguing and defending myself, I couldn't be very objective.
> 
> Yeah, it is mind-boggling the amount of bullshit you would come up with
> instead of simply saying, "no, I don't have a good reason and use case
> for my patch". It made me laugh, FWIW. Especially the bit about people
> getting naked - I had to go check we're still talking about the same
> thing.

Yes, I can't agree more, that use case is totally of bullshit. At the
very beginning, we all know that this patch is trying to fix the
inconsistency between kaslr codes not compiled in and code compiled in
but with "nokaslr" specified. In short, this patch is fixing an
inconsistency, no bug is reported yet. Here the inconsistency is the
reason for this patch. I think it has been made very clearly now. This
also has been pointed out by Kees when he offered his "Acked-by". I
welcome and treat all comments seriously, no other choices are given
to me.


If at the start, you said straightforwardly like:

"No bug, no fix!"

"A little inconsistency makes the world more exciting, it can make me
high."

or
"We can leave with it until a bug is reported, remaining 1G is no harm."

I can accept it totally and mute. But I didn't hear them. As an expert of
x86 arch and authority, you honor me to step in and give comments, I
have to reply with respect.

I am very glad to see you said you laughed at something, whatever it is
for, at least it means thing is not screwed up thoroughly, laughter is
always good.

Thanks
Baoquan

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2016-12-12  2:32 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-09 14:41 [PATCH v2 0/2] Determine kernel text mapping size at runtime for x86_64 Baoquan He
2016-12-09 14:41 ` [PATCH v2 1/2] x86/64: Make kernel text mapping always take one whole page table in early boot code Baoquan He
2016-12-09 14:41 ` [PATCH v2 2/2] x86/KASLR/64: Determine kernel text mapping size at runtime Baoquan He
2016-12-10 10:31   ` Borislav Petkov
2016-12-10 12:27     ` Baoquan He
2016-12-10 12:33       ` Borislav Petkov
2016-12-10 13:41         ` Baoquan He
2016-12-10 16:28           ` Borislav Petkov
2016-12-11 10:58             ` Baoquan He
2016-12-11 12:06               ` Borislav Petkov
2016-12-12  2:32                 ` Baoquan He

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).