linux-kernel.vger.kernel.org archive mirror
* [RFC][PATCH] Randomize kernel base address on boot
@ 2011-05-24 20:31 Dan Rosenberg
  2011-05-24 21:02 ` Ingo Molnar
                   ` (8 more replies)
  0 siblings, 9 replies; 95+ messages in thread
From: Dan Rosenberg @ 2011-05-24 20:31 UTC (permalink / raw)
  To: Dan Rosenberg, Tony Luck, linux-kernel, davej, kees.cook, davem,
	eranian, torvalds, adobriyan, penberg, hpa, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, Ingo Molnar, pageexec

This introduces CONFIG_RANDOMIZE_BASE, which randomizes the address at
which the kernel is decompressed at boot as a security feature that
deters exploit attempts relying on knowledge of the location of kernel
internals.  The default values of the kptr_restrict and dmesg_restrict
sysctls are set to (1) when this is enabled, since hiding kernel
pointers is necessary to preserve the secrecy of the randomized base
address.

This feature also uses a fixed mapping to move the IDT (if not already
done as a fix for the F00F bug), to avoid exposing the location of
kernel internals relative to the original IDT.  This has the additional
security benefit of marking the new virtual address of the IDT
read-only.

Entropy is generated using the RDRAND instruction if it is supported. If
not, then RDTSC is used, if supported. If neither RDRAND nor RDTSC are
supported, then no randomness is introduced. Support for the CPUID
instruction is required to check for the availability of these two
instructions.
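
A minimal C sketch of the selection order described above, for illustration
only (the patch itself does this in assembly in head_32.S). The enum and
function names are hypothetical; the bit positions are the CPUID leaf 1
feature flags for RDRAND (ECX bit 30) and TSC (EDX bit 4).

```c
#include <stdint.h>

/* CPUID leaf 1 feature bits. */
#define CPUID1_ECX_RDRAND (UINT32_C(1) << 30)
#define CPUID1_EDX_TSC    (UINT32_C(1) << 4)

/* Hypothetical names, not part of the patch. */
enum entropy_source { ENTROPY_NONE, ENTROPY_RDTSC, ENTROPY_RDRAND };

/*
 * Pick the preferred entropy source from the ECX/EDX values returned by
 * CPUID leaf 1.  has_cpuid_leaf1 is zero when CPUID is unavailable or
 * when its maximum supported leaf is 0.
 */
enum entropy_source pick_entropy_source(int has_cpuid_leaf1,
					uint32_t ecx, uint32_t edx)
{
	if (!has_cpuid_leaf1)
		return ENTROPY_NONE;
	if (ecx & CPUID1_ECX_RDRAND)
		return ENTROPY_RDRAND;
	if (edx & CPUID1_EDX_TSC)
		return ENTROPY_RDTSC;
	return ENTROPY_NONE;
}
```

This only models the first choice. In the patch, a RDRAND that keeps
failing falls through to RDTSC at run time.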

Thanks to everyone who contributed helpful suggestions and feedback so
far.

Comments/Questions:

* Since RDRAND is relatively new, only the most recent version of
binutils supports assembling it.  To avoid breaking builds for people
who use older toolchains but want this feature, I hardcoded the opcodes.
If anyone has a better approach, please let me know.

* I chose to mimic the F00F bugfix behavior for moving the IDT, since it
required very little code and has the additional benefit of making the
IDT read-only. Ingo Molnar's suggestion of allocating per-cpu IDTs
instead is still on the table, and I'd like to get feedback on this.

* In order to increase the entropy for the randomized base, I changed
the default value of CONFIG_PHYSICAL_ALIGN back to 2mb.  It had
previously been raised to 16mb as a hack so that relocatable kernels
wouldn't load below that minimum.  I address this by changing the
meaning of CONFIG_PHYSICAL_START such that it now represents a minimum
address that relocatable kernels can be loaded at (rather than being
ignored by relocatable kernels).  So, if a relocatable kernel determines
it should be loaded at an address below CONFIG_PHYSICAL_START (which
defaults to 16mb), I just bump it up.

* I would appreciate guidance on safe values for the highest addresses
we can safely load the kernel at, on both 32-bit and 64-bit. This
version uses 64mb (0x4000000) for 32-bit, and worked well in testing.

* CONFIG_RANDOMIZE_BASE automatically sets the default value of
kptr_restrict and dmesg_restrict to 1, since it's nonsensical to use
this without the other two.  I considered removing
CONFIG_SECURITY_DMESG_RESTRICT altogether (it currently sets the default
value for dmesg_restrict), but just in case distros want to keep the
CONFIG as a toggle switch but don't want to use CONFIG_RANDOMIZE_BASE, I
kept it around.  So, now CONFIG_RANDOMIZE_BASE sets the default value
for CONFIG_SECURITY_DMESG_RESTRICT.

* x86-64 is still "to-do". Because it calculates the kernel text address
twice, this may be a little trickier.

* Finding a middle ground instead of the current "all-or-nothing"
behavior of kptr_restrict that allows perf users to use this feature is
future work.

* Tested by repeatedly booting and observing kallsyms output on i386.
Passed the "looks random to me" test, and saw no bad behavior.
Tested that changing CONFIG_PHYSICAL_ALIGN to 2mb still boots and runs
fine on amd64.

* Is it worth bothering to look for alternate sources of entropy if
RDTSC isn't available?

* Could use testing of CPU hotplugging and suspend/resume.
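
The load-address arithmetic from the CONFIG_PHYSICAL_ALIGN and
CONFIG_PHYSICAL_START items above can be sketched in C as follows. This is
an illustration under stated assumptions, not the patch's code: the
function and parameter names are hypothetical, and slot_count() is only a
rough estimate that ignores edge effects from rounding and from the lower
bound.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to a multiple of align (align must be a power of two). */
uint32_t align_up(uint32_t addr, uint32_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}

/*
 * loaded_at:  address where the boot loader placed the image
 * entropy:    raw value from RDRAND or RDTSC
 * max_offset: power-of-two bound on the random offset (64 MB here)
 * align:      required alignment (CONFIG_PHYSICAL_ALIGN)
 * min_addr:   lower bound (LOAD_PHYSICAL_ADDR)
 */
uint32_t choose_load_addr(uint32_t loaded_at, uint32_t entropy,
			  uint32_t max_offset, uint32_t align,
			  uint32_t min_addr)
{
	uint32_t addr = loaded_at + (entropy & (max_offset - 1));

	addr = align_up(addr, align);
	return addr < min_addr ? min_addr : addr;
}

/* Approximate number of distinct aligned positions within max_offset. */
uint32_t slot_count(uint32_t max_offset, uint32_t align)
{
	return max_offset / align;
}
```

With a 64 MB window, 2 MB alignment gives about 32 positions (roughly 5
bits), while 16 MB alignment gives only 4 (2 bits), which is why the
alignment default matters for entropy.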

Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
---
 Documentation/sysctl/kernel.txt    |   13 ++++---
 arch/x86/Kconfig                   |   32 ++++++++++++++++--
 arch/x86/boot/compressed/head_32.S |   63 ++++++++++++++++++++++++++++++++++++
 arch/x86/boot/compressed/head_64.S |   16 ++++++++-
 arch/x86/include/asm/fixmap.h      |    4 ++
 arch/x86/kernel/traps.c            |    7 ++++
 kernel/printk.c                    |    4 +-
 lib/vsprintf.c                     |    4 ++
 security/Kconfig                   |    2 +-
 9 files changed, 132 insertions(+), 13 deletions(-)

diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index 36f0075..ed91ae3 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -267,11 +267,14 @@ kptr_restrict:
 This toggle indicates whether restrictions are placed on
 exposing kernel addresses via /proc and other interfaces.  When
 kptr_restrict is set to (0), there are no restrictions.  When
-kptr_restrict is set to (1), the default, kernel pointers
-printed using the %pK format specifier will be replaced with 0's
-unless the user has CAP_SYSLOG.  When kptr_restrict is set to
-(2), kernel pointers printed using %pK will be replaced with 0's
-regardless of privileges.
+kptr_restrict is set to (1), kernel pointers printed using the
+%pK format specifier will be replaced with 0's unless the user
+has CAP_SYSLOG.  When kptr_restrict is set to (2), kernel
+pointers printed using %pK will be replaced with 0's regardless
+of privileges.
+
+Enabling the CONFIG_RANDOMIZE_BASE kernel config sets the default
+kptr_restrict value to (1).  Otherwise, the default is (0).
 
 ==============================================================
 
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 880fcb6..999ea82 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1548,8 +1548,8 @@ config PHYSICAL_START
 	  If kernel is a not relocatable (CONFIG_RELOCATABLE=n) then
 	  bzImage will decompress itself to above physical address and
 	  run from there. Otherwise, bzImage will run from the address where
-	  it has been loaded by the boot loader and will ignore above physical
-	  address.
+	  it has been loaded by the boot loader, using the above physical
+	  address as a lower bound.
 
 	  In normal kdump cases one does not have to set/change this option
 	  as now bzImage can be compiled as a completely relocatable image
@@ -1595,7 +1595,31 @@ config RELOCATABLE
 
 	  Note: If CONFIG_RELOCATABLE=y, then the kernel runs from the address
 	  it has been loaded at and the compile time physical address
-	  (CONFIG_PHYSICAL_START) is ignored.
+	  (CONFIG_PHYSICAL_START) is solely used as a lower bound.
+
+config RANDOMIZE_BASE
+	bool "Randomize the address of the kernel image"
+	depends on X86_32 && RELOCATABLE
+	default n
+	---help---
+	  Randomizes the address at which the kernel image is decompressed, as
+	  a security feature that deters exploit attempts relying on knowledge
+	  of the location of kernel internals. The default values of the
+	  kptr_restrict and dmesg_restrict sysctls are set to (1) when this is
+	  enabled, since hiding kernel pointers is necessary to preserve the
+	  secrecy of the randomized base address.
+
+	  This feature also uses a fixed mapping to move the IDT (if not
+	  already done as a fix for the F00F bug), to avoid exposing the
+	  location of kernel internals relative to the original IDT. This has
+	  the additional security benefit of marking the new virtual address of
+	  the IDT read-only.
+
+	  Entropy is generated using the RDRAND instruction if it is supported.
+	  If not, then RDTSC is used, if supported. If neither RDRAND nor RDTSC
+	  are supported, then no randomness is introduced. Support for the
+	  CPUID instruction is required to check for the availability of these
+	  two instructions.
 
 # Relocation on x86-32 needs some additional build support
 config X86_NEED_RELOCS
@@ -1604,7 +1628,7 @@ config X86_NEED_RELOCS
 
 config PHYSICAL_ALIGN
 	hex "Alignment value to which kernel should be aligned" if X86_32
-	default "0x1000000"
+	default "0x200000"
 	range 0x2000 0x1000000
 	---help---
 	  This value puts the alignment restrictions on physical address
diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
index 67a655a..2680db0 100644
--- a/arch/x86/boot/compressed/head_32.S
+++ b/arch/x86/boot/compressed/head_32.S
@@ -69,12 +69,75 @@ ENTRY(startup_32)
  */
 
 #ifdef CONFIG_RELOCATABLE
+#ifdef CONFIG_RANDOMIZE_BASE
+
+	/* Standard check for cpuid */
+	pushfl
+	popl	%eax
+	movl	%eax, %ebx
+	xorl	$0x200000, %eax
+	pushl	%eax
+	popfl
+	pushfl
+	popl	%eax
+	cmpl	%eax, %ebx
+	jz	4f
+
+	/* Check for cpuid 1 */
+	movl	$0x0, %eax
+	cpuid
+	cmpl	$0x1, %eax
+	jb	4f
+
+	movl	$0x1, %eax
+	cpuid
+	xor	%eax, %eax
+
+	/* RDRAND is bit 30 */
+	testl	$0x40000000, %ecx
+	jnz	1f
+
+	/* RDTSC is bit 4 */
+	testl	$0x10, %edx
+	jnz	3f
+
+	/* Nothing is supported */
+	jmp	4f
+1:
+	/* RDRAND sets carry bit on success, otherwise we should try
+	 * again. */
+	movl	$0x10, %ecx
+2:
+	/* rdrand %eax, hand-encoded so that older binutils can assemble it */
+	.byte	0x0f, 0xc7, 0xf0
+	jc	4f
+	loop	2b
+
+	/* Fall through: if RDRAND is supported but fails, use RDTSC,
+	 * which is guaranteed to be supported. */
+3:
+	rdtsc
+	shll	$0xc, %eax
+4:
+	/* Maximum offset at 64mb to be safe */
+	andl	$0x3ffffff, %eax
+	movl	%ebp, %ebx
+	addl	%eax, %ebx
+#else
 	movl	%ebp, %ebx
+#endif
 	movl	BP_kernel_alignment(%esi), %eax
 	decl	%eax
 	addl    %eax, %ebx
 	notl	%eax
 	andl    %eax, %ebx
+
+	/* LOAD_PHYSICAL_ADDR is the minimum safe address we can
+	 * decompress at. */
+	cmpl	$LOAD_PHYSICAL_ADDR, %ebx
+	jae	1f
+	movl	$LOAD_PHYSICAL_ADDR, %ebx
+1:
 #else
 	movl	$LOAD_PHYSICAL_ADDR, %ebx
 #endif
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 35af09d..6a05219 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -90,6 +90,13 @@ ENTRY(startup_32)
 	addl	%eax, %ebx
 	notl	%eax
 	andl	%eax, %ebx
+
+	/* LOAD_PHYSICAL_ADDR is the minimum safe address we can
+	 * decompress at. */
+	cmpl	$LOAD_PHYSICAL_ADDR, %ebx
+	jae	1f
+	movl	$LOAD_PHYSICAL_ADDR, %ebx
+1:
 #else
 	movl	$LOAD_PHYSICAL_ADDR, %ebx
 #endif
@@ -191,7 +198,7 @@ no_longmode:
 	 * it may change in the future.
 	 */
 	.code64
-	.org 0x200
+	.org 0x300
 ENTRY(startup_64)
 	/*
 	 * We come here either from startup_32 or directly from a
@@ -232,6 +239,13 @@ ENTRY(startup_64)
 	addq	%rax, %rbp
 	notq	%rax
 	andq	%rax, %rbp
+
+	/* LOAD_PHYSICAL_ADDR is the minimum safe address we can
+	 * decompress at. */
+	cmpq	$LOAD_PHYSICAL_ADDR, %rbp
+	jae	1f
+	movq	$LOAD_PHYSICAL_ADDR, %rbp
+1:
 #else
 	movq	$LOAD_PHYSICAL_ADDR, %rbp
 #endif
diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h
index 4729b2b..d1fabba 100644
--- a/arch/x86/include/asm/fixmap.h
+++ b/arch/x86/include/asm/fixmap.h
@@ -100,6 +100,10 @@ enum fixed_addresses {
 #endif
 #ifdef CONFIG_X86_F00F_BUG
 	FIX_F00F_IDT,	/* Virtual mapping for IDT */
+#else
+#ifdef CONFIG_RANDOMIZE_BASE
+	FIX_RANDOM_IDT, /* Virtual mapping for IDT */
+#endif
 #endif
 #ifdef CONFIG_X86_CYCLONE_TIMER
 	FIX_CYCLONE_TIMER, /*cyclone timer register*/
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index b9b6716..5672ad0 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -872,6 +872,13 @@ void __init trap_init(void)
 	set_bit(SYSCALL_VECTOR, used_vectors);
 #endif
 
+#if defined(CONFIG_RANDOMIZE_BASE) && !defined(CONFIG_X86_F00F_BUG)
+	__set_fixmap(FIX_RANDOM_IDT, __pa(&idt_table), PAGE_KERNEL_RO);
+
+	/* Update the IDT descriptor. It will be reloaded in cpu_init() */
+	idt_descr.address = fix_to_virt(FIX_RANDOM_IDT);
+#endif
+
 	/*
 	 * Should be a barrier for any external CPU state:
 	 */
diff --git a/kernel/printk.c b/kernel/printk.c
index da8ca81..283434f 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -262,9 +262,9 @@ static inline void boot_delay_msec(void)
 #endif
 
 #ifdef CONFIG_SECURITY_DMESG_RESTRICT
-int dmesg_restrict = 1;
+int dmesg_restrict __read_mostly = 1;
 #else
-int dmesg_restrict;
+int dmesg_restrict __read_mostly;
 #endif
 
 static int syslog_action_restricted(int type)
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 1d659d7..0d8da65 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -797,7 +797,11 @@ char *uuid_string(char *buf, char *end, const u8 *addr,
 	return string(buf, end, uuid, spec);
 }
 
+#ifdef CONFIG_RANDOMIZE_BASE
+int kptr_restrict __read_mostly = 1;
+#else
 int kptr_restrict __read_mostly;
+#endif
 
 /*
  * Show a '%p' thing.  A kernel extension is that the '%p' is followed
diff --git a/security/Kconfig b/security/Kconfig
index 95accd4..ffabef0 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -72,7 +72,7 @@ config KEYS_DEBUG_PROC_KEYS
 
 config SECURITY_DMESG_RESTRICT
 	bool "Restrict unprivileged access to the kernel syslog"
-	default n
+	default RANDOMIZE_BASE
 	help
 	  This enforces restrictions on unprivileged users reading the kernel
 	  syslog via dmesg(8).




* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-24 20:31 [RFC][PATCH] Randomize kernel base address on boot Dan Rosenberg
@ 2011-05-24 21:02 ` Ingo Molnar
  2011-05-24 22:55   ` Dan Rosenberg
  2011-05-24 21:16 ` Ingo Molnar
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 95+ messages in thread
From: Ingo Molnar @ 2011-05-24 21:02 UTC (permalink / raw)
  To: Dan Rosenberg
  Cc: Tony Luck, linux-kernel, davej, kees.cook, davem, eranian,
	torvalds, adobriyan, penberg, hpa, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, pageexec


* Dan Rosenberg <drosenberg@vsecurity.com> wrote:

> This introduces CONFIG_RANDOMIZE_BASE, which randomizes the address at
> which the kernel is decompressed at boot as a security feature that
> deters exploit attempts relying on knowledge of the location of kernel
> internals.  The default values of the kptr_restrict and dmesg_restrict
> sysctls are set to (1) when this is enabled, since hiding kernel
> pointers is necessary to preserve the secrecy of the randomized base
> address.

That was quick! :-)

> This feature also uses a fixed mapping to move the IDT (if not already
> done as a fix for the F00F bug), to avoid exposing the location of
> kernel internals relative to the original IDT.  This has the additional
> security benefit of marking the new virtual address of the IDT
> read-only.

Btw., as i suggested before the IDT should be made percpu, that way we could 
split out and evaluate the IDT change independently of any security 
considerations, as a potential scalability improvement. Makes the decision 
easier because right now moving the IDT to a 4K TLB increases the kernel's TLB 
footprint a tiny bit.

> Entropy is generated using the RDRAND instruction if it is supported. If not, 
> then RDTSC is used, if supported. If neither RDRAND nor RDTSC are supported, 
> then no randomness is introduced. Support for the CPUID instruction is 
> required to check for the availability of these two instructions.

Btw., i'd suggest to fall back not to zero but to something system specific 
like RAM size or a BIOS signature such as the contents of 0xf0000 or so. This, 
while clearly not random, will at least *somewhat* randomize the kernel against 
remote attackers who do not know the RAM size or the system type.
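
A minimal sketch of such a fallback, for illustration only. The hash
(32-bit FNV-1a) is an assumption chosen for brevity and is not something
proposed in this thread; the function names and the idea of passing the
BIOS bytes and RAM size as parameters are likewise hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a over a byte buffer (illustrative choice of hash). */
uint32_t fold_bytes(const uint8_t *data, size_t len)
{
	uint32_t h = UINT32_C(2166136261);
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= data[i];
		h *= UINT32_C(16777619);
	}
	return h;
}

/*
 * Derive a system-specific (not random) seed from a BIOS signature
 * buffer and the RAM size.  The result would still be masked and
 * aligned like any other offset.
 */
uint32_t fallback_seed(const uint8_t *bios_sig, size_t sig_len,
		       uint64_t ram_bytes)
{
	uint32_t h = fold_bytes(bios_sig, sig_len);

	h ^= (uint32_t)ram_bytes ^ (uint32_t)(ram_bytes >> 32);
	return h;
}
```

The value is the same on every boot of a given machine, so it only helps
against attackers who cannot learn the system type or memory size.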

Thanks,

	Ingo


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-24 20:31 [RFC][PATCH] Randomize kernel base address on boot Dan Rosenberg
  2011-05-24 21:02 ` Ingo Molnar
@ 2011-05-24 21:16 ` Ingo Molnar
  2011-05-24 23:00   ` Dan Rosenberg
  2011-05-24 23:06   ` H. Peter Anvin
  2011-05-24 21:46 ` Brian Gerst
                   ` (6 subsequent siblings)
  8 siblings, 2 replies; 95+ messages in thread
From: Ingo Molnar @ 2011-05-24 21:16 UTC (permalink / raw)
  To: Dan Rosenberg
  Cc: Tony Luck, linux-kernel, davej, kees.cook, davem, eranian,
	torvalds, adobriyan, penberg, hpa, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, pageexec


* Dan Rosenberg <drosenberg@vsecurity.com> wrote:

> Comments/Questions:
> 
> * Since RDRAND is relatively new, only the most recent version of
> binutils supports assembling it.  To avoid breaking builds for people
> who use older toolchains but want this feature, I hardcoded the opcodes.
> If anyone has a better approach, please let me know.

This is generally the best approach. Maybe mention it here:

> +	/* rdrand %eax */
> +	.byte	0x0f, 0xc7, 0xf0

... that this is done to work on older GAS as well. Putting that into 
changelogs is good, putting it into comments is better.

> * I chose to mimic the F00F bugfix behavior for moving the IDT, since it
> required very little code and has the additional benefit of making the
> IDT read-only. Ingo Molnar's suggestion of allocating per-cpu IDTs
> instead is still on the table, and I'd like to get feedback on this.

ok, good for an RFC patch.

> * In order to increase the entropy for the randomized base, I changed
> the default value of CONFIG_PHYSICAL_ALIGN back to 2mb.  It had
> previously been raised to 16mb as a hack so that relocatable kernels
> wouldn't load below that minimum.  I address this by changing the
> meaning of CONFIG_PHYSICAL_START such that it now represents a minimum
> address that relocatable kernels can be loaded at (rather than being
> ignored by relocatable kernels).  So, if a relocatable kernel determines
> it should be loaded at an address below CONFIG_PHYSICAL_START (which
> defaults to 16mb), I just bump it up.

This would need a real fix, right? The PHYSICAL_ALIGN hack looks worth fixing 
in its own right.

> * I would appreciate guidance on safe values for the highest addresses
> we can safely load the kernel at, on both 32-bit and 64-bit. This
> version uses 64mb (0x4000000) for 32-bit, and worked well in testing.

This depends on the memory map. In practice most x86 systems start with a big 
chunk of RAM up to end of RAM or 3GB, whichever comes first. Holes typically 
start at 3GB or higher.

On some systems holes can be pretty low as well - you'd have to research e820 
maps submitted to lkml to see how common this is - but it's not terribly 
common.

Some really old systems might have a hole between 15MB-16MB - but that's not an 
issue if we load at 16 MB or higher.

> * CONFIG_RANDOMIZE_BASE automatically sets the default value of kptr_restrict 
> and dmesg_restrict to 1, since it's nonsensical to use this without the other 
> two.  I considered removing CONFIG_SECURITY_DMESG_RESTRICT altogether (it 
> currently sets the default value for dmesg_restrict), but just in case 
> distros want to keep the CONFIG as a toggle switch but don't want to use 
> CONFIG_RANDOMIZE_BASE, I kept it around.  So, now CONFIG_RANDOMIZE_BASE sets 
> the default value for CONFIG_SECURITY_DMESG_RESTRICT.

No, the right solution is what i suggested a few mails ago: /proc/kallsyms (and 
other RIP printing places) should report the non-randomized RIP.

That way we do not have to change the kptr_restrict default and tools will 
continue to work ...

> * x86-64 is still "to-do". Because it calculates the kernel text address 
> twice, this may be a little trickier.

Note that 64-bit is obviously a must-have condition for the eventual acceptance 
of this patch.

> * Finding a middle ground instead of the current "all-or-nothing" behavior of 
> kptr_restrict that allows perf users to use this feature is future work.

Well, for perf we need to transform back the RIPs that get passed along in the 
stack-dump/call-chain code, see:

 arch/x86/kernel/dumpstack_64.c
 arch/x86/kernel/dumpstack.c
 arch/x86/kernel/dumpstack_32.c

That, combined with /proc/kallsyms unrandomization, means 'perf top' will just 
work and produce non-randomized RIPs.

The canonical RIP to report is the one that the kernel would have if it was 
loaded non-randomized.
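
A minimal sketch of that transformation, for illustration only. The
function name and parameters are hypothetical; a real implementation would
take the bases and image size from the kernel's own symbols.

```c
#include <stdint.h>

/*
 * Map an address inside the randomized kernel image back to the address
 * it would have had if the kernel were loaded at default_base.
 * Addresses outside the image are returned unchanged.
 */
uintptr_t canonical_addr(uintptr_t addr, uintptr_t actual_base,
			 uintptr_t default_base, uintptr_t image_size)
{
	if (addr < actual_base || addr - actual_base >= image_size)
		return addr;
	return addr - actual_base + default_base;
}
```

For example, with the image actually at 0xc3400000 and a default base of
0xc1000000, the address 0xc3400123 would be reported as 0xc1000123.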

> * Tested by repeatedly booting and observing kallsyms output on i386.  
> Passed the "looks random to me" test, and saw no bad behavior. Tested that 
> changing CONFIG_PHYSICAL_ALIGN to 2mb still boots and runs fine on amd64.

Please run it over rngtest to measure how much true randomness is in it, on 
your testbox.

> * Is it worth bothering to look for alternate sources of entropy if
> RDTSC isn't available?

No, if you do the system-specific BIOS signature trick i think it's adequate.

> * Could use testing of CPU hotplugging and suspend/resume.

and kexec/crashdump. and perf ;-)

Thanks,

	Ingo


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-24 20:31 [RFC][PATCH] Randomize kernel base address on boot Dan Rosenberg
  2011-05-24 21:02 ` Ingo Molnar
  2011-05-24 21:16 ` Ingo Molnar
@ 2011-05-24 21:46 ` Brian Gerst
  2011-05-24 23:01   ` Dan Rosenberg
  2011-05-24 22:31 ` H. Peter Anvin
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 95+ messages in thread
From: Brian Gerst @ 2011-05-24 21:46 UTC (permalink / raw)
  To: Dan Rosenberg
  Cc: Tony Luck, linux-kernel, davej, kees.cook, davem, eranian,
	torvalds, adobriyan, penberg, hpa, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, Ingo Molnar, pageexec

On Tue, May 24, 2011 at 4:31 PM, Dan Rosenberg <drosenberg@vsecurity.com> wrote:
> This introduces CONFIG_RANDOMIZE_BASE, which randomizes the address at
> which the kernel is decompressed at boot as a security feature that
> deters exploit attempts relying on knowledge of the location of kernel
> internals.  The default values of the kptr_restrict and dmesg_restrict
> sysctls are set to (1) when this is enabled, since hiding kernel
> pointers is necessary to preserve the secrecy of the randomized base
> address.
>
> This feature also uses a fixed mapping to move the IDT (if not already
> done as a fix for the F00F bug), to avoid exposing the location of
> kernel internals relative to the original IDT.  This has the additional
> security benefit of marking the new virtual address of the IDT
> read-only.
>
> Entropy is generated using the RDRAND instruction if it is supported. If
> not, then RDTSC is used, if supported. If neither RDRAND nor RDTSC are
> supported, then no randomness is introduced. Support for the CPUID
> instruction is required to check for the availability of these two
> instructions.
>
> Thanks to everyone who contributed helpful suggestions and feedback so
> far.
>
> Comments/Questions:
>
> * Since RDRAND is relatively new, only the most recent version of
> binutils supports assembling it.  To avoid breaking builds for people
> who use older toolchains but want this feature, I hardcoded the opcodes.
> If anyone has a better approach, please let me know.
>
> * I chose to mimic the F00F bugfix behavior for moving the IDT, since it
> required very little code and has the additional benefit of making the
> IDT read-only. Ingo Molnar's suggestion of allocating per-cpu IDTs
> instead is still on the table, and I'd like to get feedback on this.
>
> * In order to increase the entropy for the randomized base, I changed
> the default value of CONFIG_PHYSICAL_ALIGN back to 2mb.  It had
> previously been raised to 16mb as a hack so that relocatable kernels
> wouldn't load below that minimum.  I address this by changing the
> meaning of CONFIG_PHYSICAL_START such that it now represents a minimum
> address that relocatable kernels can be loaded at (rather than being
> ignored by relocatable kernels).  So, if a relocatable kernel determines
> it should be loaded at an address below CONFIG_PHYSICAL_START (which
> defaults to 16mb), I just bump it up.
>
> * I would appreciate guidance on safe values for the highest addresses
> we can safely load the kernel at, on both 32-bit and 64-bit. This
> version uses 64mb (0x4000000) for 32-bit, and worked well in testing.
>
> * CONFIG_RANDOMIZE_BASE automatically sets the default value of
> kptr_restrict and dmesg_restrict to 1, since it's nonsensical to use
> this without the other two.  I considered removing
> CONFIG_SECURITY_DMESG_RESTRICT altogether (it currently sets the default
> value for dmesg_restrict), but just in case distros want to keep the
> CONFIG as a toggle switch but don't want to use CONFIG_RANDOMIZE_BASE, I
> kept it around.  So, now CONFIG_RANDOMIZE_BASE sets the default value
> for CONFIG_SECURITY_DMESG_RESTRICT.
>
> * x86-64 is still "to-do". Because it calculates the kernel text address
> twice, this may be a little trickier.

This trick doesn't work as you may expect on 64-bit.  You are
relocating the physical image of the kernel, but the kernel actually
runs from a fixed virtual mapping.  This would require adding the
relocation code that 32-bit uses, so the virtual address can be
changed.

--
Brian Gerst


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-24 20:31 [RFC][PATCH] Randomize kernel base address on boot Dan Rosenberg
                   ` (2 preceding siblings ...)
  2011-05-24 21:46 ` Brian Gerst
@ 2011-05-24 22:31 ` H. Peter Anvin
  2011-05-24 23:04   ` Dan Rosenberg
  2011-05-24 23:14   ` H. Peter Anvin
  2011-05-24 23:08 ` Dan Rosenberg
                   ` (4 subsequent siblings)
  8 siblings, 2 replies; 95+ messages in thread
From: H. Peter Anvin @ 2011-05-24 22:31 UTC (permalink / raw)
  To: Dan Rosenberg
  Cc: Tony Luck, linux-kernel, davej, kees.cook, davem, eranian,
	torvalds, adobriyan, penberg, Arjan van de Ven, Andrew Morton,
	Valdis.Kletnieks, Ingo Molnar, pageexec

On 05/24/2011 01:31 PM, Dan Rosenberg wrote:
> This introduces CONFIG_RANDOMIZE_BASE, which randomizes the address at
> which the kernel is decompressed at boot as a security feature that
> deters exploit attempts relying on knowledge of the location of kernel
> internals.  The default values of the kptr_restrict and dmesg_restrict
> sysctls are set to (1) when this is enabled, since hiding kernel
> pointers is necessary to preserve the secrecy of the randomized base
> address.
> 
> This feature also uses a fixed mapping to move the IDT (if not already
> done as a fix for the F00F bug), to avoid exposing the location of
> kernel internals relative to the original IDT.  This has the additional
> security benefit of marking the new virtual address of the IDT
> read-only.

As written, I think this is unsafe, simply because the kernel has no
idea what memory is actually safe to relocate into, and your code
doesn't make any attempt to find out.

The fact that you change CONFIG_PHYSICAL_ALIGN is particularly
devastating, and will introduce boot failures on real systems.

For this to be acceptable, you need to at the very least:

1. Verify in the address map passed to the kernel where it is safe
   to locate the kernel;
2. Not introduce a performance regression (we avoid locating in the
   bottom 16 MiB for performance reasons, except on very small systems);
3. Make sure not to break kdump.
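
Requirement 1 could be checked along these lines. This is a sketch under
stated assumptions: struct mem_region is a simplified, hypothetical
stand-in for the firmware memory map (the real e820 table has a different
layout), and the check requires the candidate range to fit inside a single
usable region rather than merging adjacent ones.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplified memory-map entry. */
struct mem_region {
	uint64_t start;
	uint64_t size;
	int usable;	/* nonzero for usable RAM */
};

/*
 * Return 1 if [start, start + size) lies entirely within one usable
 * region of the map, 0 otherwise.  Written to avoid overflow.
 */
int range_is_usable(const struct mem_region *map, size_t n,
		    uint64_t start, uint64_t size)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!map[i].usable)
			continue;
		if (start >= map[i].start && size <= map[i].size &&
		    start - map[i].start <= map[i].size - size)
			return 1;
	}
	return 0;
}
```

A randomized candidate that fails this check would have to be rejected or
re-drawn before decompression.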

Arguably this is really something that would be *much* better done in
the bootloader, but given that the dominant boot loader for Linux is
Grub, I don't expect that anything will ever happen until the cows come
home :(

	-hpa


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-24 21:02 ` Ingo Molnar
@ 2011-05-24 22:55   ` Dan Rosenberg
  0 siblings, 0 replies; 95+ messages in thread
From: Dan Rosenberg @ 2011-05-24 22:55 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Tony Luck, linux-kernel, davej, kees.cook, davem, eranian,
	torvalds, adobriyan, penberg, hpa, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, pageexec

On Tue, 2011-05-24 at 23:02 +0200, Ingo Molnar wrote:
> * Dan Rosenberg <drosenberg@vsecurity.com> wrote:
> 
> > This introduces CONFIG_RANDOMIZE_BASE, which randomizes the address at
> > which the kernel is decompressed at boot as a security feature that
> > deters exploit attempts relying on knowledge of the location of kernel
> > internals.  The default values of the kptr_restrict and dmesg_restrict
> > sysctls are set to (1) when this is enabled, since hiding kernel
> > pointers is necessary to preserve the secrecy of the randomized base
> > address.
> 
> That was quick! :-)
> 
> > This feature also uses a fixed mapping to move the IDT (if not already
> > done as a fix for the F00F bug), to avoid exposing the location of
> > kernel internals relative to the original IDT.  This has the additional
> > security benefit of marking the new virtual address of the IDT
> > read-only.
> 
> Btw., as i suggested before the IDT should be made percpu, that way we could 
> split out and evaluate the IDT change independently of any security 
> considerations, as a potential scalability improvement. Makes the decision 
> easier because right now moving the IDT to a 4K TLB increases the kernel's TLB 
> footprint a tiny bit.
> 

Alright, I'll start working on this.

> > Entropy is generated using the RDRAND instruction if it is supported. If not, 
> > then RDTSC is used, if supported. If neither RDRAND nor RDTSC are supported, 
> > then no randomness is introduced. Support for the CPUID instruction is 
> > required to check for the availability of these two instructions.
> 
> Btw., i'd suggest to fall back not to zero but to something system specific 
> like RAM size or a BIOS signature such as the contents of 0xf0000 or so. This, 
> while clearly not random, will at least *somewhat* randomize the kernel against 
> remote attackers who do not know the RAM size or the system type.
> 

Good idea, will do.

-Dan



* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-24 21:16 ` Ingo Molnar
@ 2011-05-24 23:00   ` Dan Rosenberg
  2011-05-25 11:23     ` Ingo Molnar
  2011-05-24 23:06   ` H. Peter Anvin
  1 sibling, 1 reply; 95+ messages in thread
From: Dan Rosenberg @ 2011-05-24 23:00 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Tony Luck, linux-kernel, davej, kees.cook, davem, eranian,
	torvalds, adobriyan, penberg, hpa, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, pageexec

On Tue, 2011-05-24 at 23:16 +0200, Ingo Molnar wrote:
> * Dan Rosenberg <drosenberg@vsecurity.com> wrote:
> 
> > Comments/Questions:
> > 
> > * Since RDRAND is relatively new, only the most recent version of
> > binutils supports assembling it.  To avoid breaking builds for people
> > who use older toolchains but want this feature, I hardcoded the opcodes.
> > If anyone has a better approach, please let me know.
> 
> This is generally the best approach. Maybe mention it here:
> 
> > +	/* rdrand %eax */
> > +	.byte	0x0f, 0xc7, 0xf0
> 
> ... that this is done to work on older GAS as well. Putting that into 
> changelogs is good, putting it into comments is better.
> 

Will do.


> > * In order to increase the entropy for the randomized base, I changed
> > the default value of CONFIG_PHYSICAL_ALIGN back to 2mb.  It had
> > previously been raised to 16mb as a hack so that relocatable kernels
> > wouldn't load below that minimum.  I address this by changing the
> > meaning of CONFIG_PHYSICAL_START such that it now represents a minimum
> > address that relocatable kernels can be loaded at (rather than being
> > ignored by relocatable kernels).  So, if a relocatable kernel determines
> > it should be loaded at an address below CONFIG_PHYSICAL_START (which
> > defaults to 16mb), I just bump it up.
> 
> This would need a real fix, right? The PHYSICAL_ALIGN hack looks worth fixing 
> in its own right.
> 

I'm not sure of a better way to do this than what I've done, which is
essentially to introduce a lower bound on the start location rather than
restricting the alignment.  Suggestions welcome.

> > * CONFIG_RANDOMIZE_BASE automatically sets the default value of kptr_restrict 
> > and dmesg_restrict to 1, since it's nonsensical to use this without the other 
> > two.  I considered removing CONFIG_SECURITY_DMESG_RESTRICT altogether (it 
> > currently sets the default value for dmesg_restrict), but just in case 
> > distros want to keep the CONFIG as a toggle switch but don't want to use 
> > CONFIG_RANDOMIZE_BASE, I kept it around.  So, now CONFIG_RANDOMIZE_BASE sets 
> > the default value for CONFIG_SECURITY_DMESG_RESTRICT.
> 
> No, the right solution is what i suggested a few mails ago: /proc/kallsyms (and 
> other RIP printing places) should report the non-randomized RIP.
> 
> That way we do not have to change the kptr_restrict default and tools will 
> continue to work ...
> 

Ok, I'll do it this way, and leave the kptr_restrict default to 0.  But
I still think having the dmesg_restrict default depend on randomization
makes sense, since kernel .text is explicitly revealed in the syslog.

> > * x86-64 is still "to-do". Because it calculates the kernel text address 
> > twice, this may be a little trickier.
> 
> Note that 64-bit is obviously a must-have condition for the eventual acceptance 
> of this patch.

Of course, just wanted early feedback.

> 
> > * Finding a middle ground instead of the current "all-or-nothing" behavior of 
> > kptr_restrict that allows perf users to use this feature is future work.
> 
> Well, for perf we need to transform back the RIPs that get passed along in the 
> stack-dump/call-chain code, see:
> 
>  arch/x86/kernel/dumpstack_64.c
>  arch/x86/kernel/dumpstack.c
>  arch/x86/kernel/dumpstack_32.c
> 
> That, combined with /proc/kallsyms unrandomization, means 'perf top' will just
> work and produce non-randomized RIPs.
> 
> The canonical RIP to report is the one that the kernel would have if it was 
> loaded non-randomized.
> 

Will do.

> > * Tested by repeatedly booting and observing kallsyms output on both i386.  
> > Passed the "looks random to me" test, and saw no bad behavior. Tested that 
> > changing CONFIG_PHYSICAL_ALIGN to 2mb still boots and runs fine on amd64.
> 
> Please run it over rngtest to measure how much true randomness is in it, on 
> your testbox.
> 

Will do.

> > * Could use testing of CPU hotplugging and suspend/resume.
> 
> and kexec/crashdump. and perf ;-)
> 

Will do.

Thanks very much for the feedback.


^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-24 21:46 ` Brian Gerst
@ 2011-05-24 23:01   ` Dan Rosenberg
  0 siblings, 0 replies; 95+ messages in thread
From: Dan Rosenberg @ 2011-05-24 23:01 UTC (permalink / raw)
  To: Brian Gerst
  Cc: Tony Luck, linux-kernel, davej, kees.cook, davem, eranian,
	torvalds, adobriyan, penberg, hpa, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, Ingo Molnar, pageexec


> 
> This trick doesn't work as you may expect on 64-bit.  You are
> relocating the physical image of the kernel, but the kernel actually
> runs from a fixed virtual mapping.  This would require adding the
> relocation code that 32-bit uses, so the virtual address can be
> changed.
> 

Noted, thanks, I'll be sure to not waste my time when I start working on
64-bit.

-Dan

> --
> Brian Gerst



^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-24 22:31 ` H. Peter Anvin
@ 2011-05-24 23:04   ` Dan Rosenberg
  2011-05-24 23:07     ` H. Peter Anvin
  2011-05-24 23:14   ` H. Peter Anvin
  1 sibling, 1 reply; 95+ messages in thread
From: Dan Rosenberg @ 2011-05-24 23:04 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Tony Luck, linux-kernel, davej, kees.cook, davem, eranian,
	torvalds, adobriyan, penberg, Arjan van de Ven, Andrew Morton,
	Valdis.Kletnieks, Ingo Molnar, pageexec

On Tue, 2011-05-24 at 15:31 -0700, H. Peter Anvin wrote:
> On 05/24/2011 01:31 PM, Dan Rosenberg wrote:
> > This introduces CONFIG_RANDOMIZE_BASE, which randomizes the address at
> > which the kernel is decompressed at boot as a security feature that
> > deters exploit attempts relying on knowledge of the location of kernel
> > internals.  The default values of the kptr_restrict and dmesg_restrict
> > sysctls are set to (1) when this is enabled, since hiding kernel
> > pointers is necessary to preserve the secrecy of the randomized base
> > address.
> > 
> > This feature also uses a fixed mapping to move the IDT (if not already
> > done as a fix for the F00F bug), to avoid exposing the location of
> > kernel internals relative to the original IDT.  This has the additional
> > security benefit of marking the new virtual address of the IDT
> > read-only.
> 
> As written, I think this is unsafe, simply because the kernel has no
> idea what memory is actually safe to relocate into, and your code
> doesn't actually make any attempt at doing so.
> 
> The fact that you change CONFIG_PHYSICAL_ALIGN is particularly
> devastating, and will introduce boot failures on real systems.
> 
> For this to be acceptable, you need to at the very least:
> 
> 1. Verify in the address map passed to the kernel where it is safe
>    to locate the kernel;

I'll do this, thanks.

> 2. Not introduce a performance regression (we avoid locating in the
>    bottom 16 MiB for performance reasons, except on very small systems);

I altered the boot code so that it uses CONFIG_PHYSICAL_START, which
defaults to 16 MiB, as a lower bound on location.  So nothing will ever
get loaded below there, and I still can take advantage of higher
alignment granularity.  Are there other problems I'm not anticipating?

> 3. Make sure not to break kdump.
> 

Ok, I'll be sure to add this to the list of things to test.

Thanks for the feedback.

-Dan



^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-24 21:16 ` Ingo Molnar
  2011-05-24 23:00   ` Dan Rosenberg
@ 2011-05-24 23:06   ` H. Peter Anvin
  2011-05-25 14:03     ` Dan Rosenberg
  1 sibling, 1 reply; 95+ messages in thread
From: H. Peter Anvin @ 2011-05-24 23:06 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Dan Rosenberg, Tony Luck, linux-kernel, davej, kees.cook, davem,
	eranian, torvalds, adobriyan, penberg, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, pageexec

On 05/24/2011 02:16 PM, Ingo Molnar wrote:
> 
> On some systems holes can be pretty low as well - you'd have to research e820 
> maps submitted to lkml to see how common this is - but it's not terribly 
> common.
> 
> Some really old systems might have a hole between 15MB-16MB - but that's not an 
> issue if we load at 16 MB or higher.
> 

It definitely happens, and not just at 15-16 MiB either.

Doing this without actually consulting the memory map is dangerous as
hell; plus you have to verify that you're not clobbering anything else,
like the command line, initramfs or the linked list of data.
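
A minimal sketch of the check being asked for, assuming the memory map and the in-use regions (command line, initramfs, setup data) have already been collected into simple arrays. The struct and function names are hypothetical and not taken from the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open physical range [start, end). Illustrative type only. */
struct mem_range {
    uint64_t start;
    uint64_t end;
};

static bool overlaps(struct mem_range a, struct mem_range b)
{
    return a.start < b.end && b.start < a.end;
}

static bool contains(struct mem_range outer, struct mem_range inner)
{
    return outer.start <= inner.start && inner.end <= outer.end;
}

/*
 * A placement is accepted only if the image sits entirely inside one
 * usable RAM range and touches none of the regions already in use.
 */
static bool placement_is_safe(struct mem_range image,
                              const struct mem_range *usable, size_t n_usable,
                              const struct mem_range *in_use, size_t n_in_use)
{
    bool inside = false;

    assert(image.start < image.end);
    for (size_t i = 0; i < n_usable; i++)
        if (contains(usable[i], image))
            inside = true;
    if (!inside)
        return false;

    for (size_t i = 0; i < n_in_use; i++)
        if (overlaps(image, in_use[i]))
            return false;
    return true;
}
```

An image that straddles a 15-16 MiB hole fails the first loop; one that lands on top of the initramfs fails the second.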

	-hpa

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-24 23:04   ` Dan Rosenberg
@ 2011-05-24 23:07     ` H. Peter Anvin
  2011-05-24 23:34       ` Dan Rosenberg
  0 siblings, 1 reply; 95+ messages in thread
From: H. Peter Anvin @ 2011-05-24 23:07 UTC (permalink / raw)
  To: Dan Rosenberg
  Cc: Tony Luck, linux-kernel, davej, kees.cook, davem, eranian,
	torvalds, adobriyan, penberg, Arjan van de Ven, Andrew Morton,
	Valdis.Kletnieks, Ingo Molnar, pageexec

On 05/24/2011 04:04 PM, Dan Rosenberg wrote:
> 
>> 2. Not introduce a performance regression (we avoid locating in the
>>    bottom 16 MiB for performance reasons, except on very small systems);
> 
> I altered the boot code so that it uses CONFIG_PHYSICAL_START, which
> defaults to 16 MiB, as a lower bound on location.  So nothing will ever
> get loaded below there, and I still can take advantage of higher
> alignment granularity.  Are there other problems I'm not anticipating?
> 

Please look at the discussion as to what led us to do things this way.

	-hpa

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-24 20:31 [RFC][PATCH] Randomize kernel base address on boot Dan Rosenberg
                   ` (3 preceding siblings ...)
  2011-05-24 22:31 ` H. Peter Anvin
@ 2011-05-24 23:08 ` Dan Rosenberg
  2011-05-25  2:05   ` Dan Rosenberg
  2011-05-26 20:01 ` Vivek Goyal
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 95+ messages in thread
From: Dan Rosenberg @ 2011-05-24 23:08 UTC (permalink / raw)
  To: Tony Luck
  Cc: linux-kernel, davej, kees.cook, davem, eranian, torvalds,
	adobriyan, penberg, hpa, Arjan van de Ven, Andrew Morton,
	Valdis.Kletnieks, Ingo Molnar, pageexec

On Tue, 2011-05-24 at 16:31 -0400, Dan Rosenberg wrote:
> This introduces CONFIG_RANDOMIZE_BASE, which randomizes the address at
> which the kernel is decompressed at boot as a security feature that
> deters exploit attempts relying on knowledge of the location of kernel
> internals.  The default values of the kptr_restrict and dmesg_restrict
> sysctls are set to (1) when this is enabled, since hiding kernel
> pointers is necessary to preserve the secrecy of the randomized base
> address.

> diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
> index 67a655a..2680db0 100644
> --- a/arch/x86/boot/compressed/head_32.S
> +++ b/arch/x86/boot/compressed/head_32.S
> @@ -69,12 +69,75 @@ ENTRY(startup_32)
>   */
>  
>  #ifdef CONFIG_RELOCATABLE
> +#ifdef CONFIG_RANDOMIZE_BASE
> +
> +	/* Standard check for cpuid */
> +	pushfl
> +	popl	%eax
> +	movl	%eax, %ebx
> +	xorl	$0x200000, %eax
> +	pushl	%eax
> +	popfl
> +	pushfl
> +	popl	%eax
> +	cmpl	%eax, %ebx
> +	jz	4f
> +
> +	/* Check for cpuid 1 */
> +	movl	$0x0, %eax
> +	cpuid
> +	cmpl	$0x1, %eax
> +	jb	4f
> +
> +	movl	$0x1, %eax
> +	cpuid
> +	xor	%eax, %eax
> +
> +	/* RDRAND is bit 30 */
> +	testl	$0x40000000, %ecx
> +	jnz	1f
> +
> +	/* RDTSC is bit 4 */
> +	testl	$0x10, %edx
> +	jnz	3f
> +
> +	/* Nothing is supported */
> +	jmp	4f
> +1:
> +	/* RDRAND sets carry bit on success, otherwise we should try
> +	 * again. */
> +	movl	$0x10, %ecx
> +2:
> +	/* rdrand %eax */
> +	.byte	0x0f, 0xc7, 0xf0
> +	jc	4f
> +	loop	2b
> +
> +	/* Fall through: if RDRAND is supported but fails, use RDTSC,
> +	 * which is guaranteed to be supported. */
> +3:
> +	rdtsc
> +	shll	$0xc, %eax
> +4:
> +	/* Maximum offset at 64mb to be safe */
> +	andl	$0x3ffffff, %eax
> +	movl	%ebp, %ebx
> +	addl	%eax, %ebx
> +#else
>  	movl	%ebp, %ebx
> +#endif
>  	movl	BP_kernel_alignment(%esi), %eax
>  	decl	%eax
>  	addl    %eax, %ebx
>  	notl	%eax
>  	andl    %eax, %ebx
> +
> +	/* LOAD_PHYSICAL_ADDR is the minimum safe address we can
> +	 * decompress at. */
> +	cmpl	$LOAD_PHYSICAL_ADDR, %ebx
> +	jae	1f
> +	movl	$LOAD_PHYSICAL_ADDR, %ebx
> +1:
>  #else
>  	movl	$LOAD_PHYSICAL_ADDR, %ebx
>  #endif
> diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
> index 35af09d..6a05219 100644
> --- a/arch/x86/boot/compressed/head_64.S
> +++ b/arch/x86/boot/compressed/head_64.S
> @@ -90,6 +90,13 @@ ENTRY(startup_32)
>  	addl	%eax, %ebx
>  	notl	%eax
>  	andl	%eax, %ebx
> +
> +	/* LOAD_PHYSICAL_ADDR is the minimum safe address we can
> +	 * decompress at. */
> +	cmpl	$LOAD_PHYSICAL_ADDR, %ebx
> +	jae	1f
> +	movl	$LOAD_PHYSICAL_ADDR, %ebx
> +1:
>  #else
>  	movl	$LOAD_PHYSICAL_ADDR, %ebx
>  #endif
> @@ -191,7 +198,7 @@ no_longmode:
>  	 * it may change in the future.
>  	 */
>  	.code64
> -	.org 0x200
> +	.org 0x300
>  ENTRY(startup_64)
>  	/*
>  	 * We come here either from startup_32 or directly from a
> @@ -232,6 +239,13 @@ ENTRY(startup_64)
>  	addq	%rax, %rbp
>  	notq	%rax
>  	andq	%rax, %rbp
> +
> +	/* LOAD_PHYSICAL_ADDR is the minimum safe address we can
> +	 * decompress at. */
> +	cmpq	$LOAD_PHYSICAL_ADDR, %rbp
> +	jae	1f
> +	movq	$LOAD_PHYSICAL_ADDR, %rbp
> +1:
>  #else
>  	movq	$LOAD_PHYSICAL_ADDR, %rbp
>  #endif

Thanks to Kees Cook for noticing that I didn't clear %eax before jumping
to my "nothing supported" (4) label.  This would have just used the
flags as "randomness", but it's still wrong and I'll fix it.  Next
version will have a fallback of using the BIOS signature instead anyway.
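
For readers following the control flow, here is a hedged C sketch of the entropy-source selection in the patch, with the offset explicitly zeroed when nothing is supported. The bit positions follow the patch comments (RDRAND is CPUID.1:ECX bit 30, TSC is CPUID.1:EDX bit 4); the enum and function names are illustrative only, and the raw value stands in for whatever RDRAND or RDTSC returned.

```c
#include <assert.h>
#include <stdint.h>

#define CPUID1_ECX_RDRAND (UINT32_C(1) << 30)   /* bit 30 */
#define CPUID1_EDX_TSC    (UINT32_C(1) << 4)    /* bit 4 */

enum entropy_source { ENTROPY_NONE, ENTROPY_RDTSC, ENTROPY_RDRAND };

/* Decide which source to use from the CPUID results. */
static enum entropy_source pick_entropy_source(int has_cpuid, uint32_t max_leaf,
                                               uint32_t ecx, uint32_t edx)
{
    if (!has_cpuid || max_leaf < 1)
        return ENTROPY_NONE;
    if (ecx & CPUID1_ECX_RDRAND)
        return ENTROPY_RDRAND;
    if (edx & CPUID1_EDX_TSC)
        return ENTROPY_RDTSC;
    return ENTROPY_NONE;
}

/* Turn the raw value into an offset below 64 MiB, as the patch's mask does. */
static uint32_t random_offset(enum entropy_source src, uint32_t raw)
{
    uint32_t value = 0;   /* start from zero so "nothing supported" adds no offset */

    assert(src == ENTROPY_NONE || src == ENTROPY_RDTSC || src == ENTROPY_RDRAND);
    if (src == ENTROPY_RDRAND)
        value = raw;
    else if (src == ENTROPY_RDTSC)
        value = raw << 12;   /* same shift as the patch's "shll $0xc" */
    return value & 0x3ffffff;
}
```

Starting from a zeroed value makes the "nothing supported" path deterministic regardless of which earlier check bailed out.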

-Dan


^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-24 22:31 ` H. Peter Anvin
  2011-05-24 23:04   ` Dan Rosenberg
@ 2011-05-24 23:14   ` H. Peter Anvin
  1 sibling, 0 replies; 95+ messages in thread
From: H. Peter Anvin @ 2011-05-24 23:14 UTC (permalink / raw)
  To: Dan Rosenberg
  Cc: Tony Luck, linux-kernel, davej, kees.cook, davem, eranian,
	torvalds, adobriyan, penberg, Arjan van de Ven, Andrew Morton,
	Valdis.Kletnieks, Ingo Molnar, pageexec

On 05/24/2011 03:31 PM, H. Peter Anvin wrote:
> 
> Arguably this is really something that would be *much* better done in
> the bootloader, but given that the dominant boot loader for Linux is
> Grub, I don't expect that anything will ever happen until the cows come
> home :(
> 

This pretty much means we need an opt-out for this.  I think we need
this both in the form of a boot protocol flag bit (for the case where
the boot loader knows what it's doing, and wants the kernel to stay put;
perhaps it has already randomized) and a kernel command-line option
(which can be parsed early and set the above flag.)
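
A rough sketch of how the two opt-outs could combine. Both the flag bit and the option name "norandbase" are hypothetical placeholders; neither is defined by the boot protocol or by this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical boot-protocol flag: the loader asks the kernel to stay put. */
#define HYP_LOADFLAG_KEEP_BASE 0x01

/* Return true if opt appears on the command line as a whole word. */
static bool cmdline_has_option(const char *cmdline, const char *opt)
{
    size_t len;
    const char *p = cmdline;

    assert(cmdline != NULL && opt != NULL && opt[0] != '\0');
    len = strlen(opt);
    while ((p = strstr(p, opt)) != NULL) {
        bool starts = (p == cmdline) || (p[-1] == ' ');
        bool ends = (p[len] == '\0') || (p[len] == ' ');

        if (starts && ends)
            return true;
        p += len;
    }
    return false;
}

/* Randomize only if neither the boot loader nor the user opted out. */
static bool should_randomize(unsigned char loadflags, const char *cmdline)
{
    if (loadflags & HYP_LOADFLAG_KEEP_BASE)
        return false;
    return !cmdline_has_option(cmdline, "norandbase");
}
```

Parsing the option early and folding it into the same flag keeps a single decision point for the relocation code.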

	-hpa

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-24 23:07     ` H. Peter Anvin
@ 2011-05-24 23:34       ` Dan Rosenberg
  2011-05-24 23:36         ` H. Peter Anvin
  0 siblings, 1 reply; 95+ messages in thread
From: Dan Rosenberg @ 2011-05-24 23:34 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Tony Luck, linux-kernel, davej, kees.cook, davem, eranian,
	torvalds, adobriyan, penberg, Arjan van de Ven, Andrew Morton,
	Valdis.Kletnieks, Ingo Molnar, pageexec

On Tue, 2011-05-24 at 16:07 -0700, H. Peter Anvin wrote:
> On 05/24/2011 04:04 PM, Dan Rosenberg wrote:
> > 
> >> 2. Not introduce a performance regression (we avoid locating in the
> >>    bottom 16 MiB for performance reasons, except on very small systems);
> > 
> > I altered the boot code so that it uses CONFIG_PHYSICAL_START, which
> > defaults to 16 MiB, as a lower bound on location.  So nothing will ever
> > get loaded below there, and I still can take advantage of higher
> > alignment granularity.  Are there other problems I'm not anticipating?
> > 
> 
> Please look at the discussion as to what led us to do things this way.
> 

Would you be able to point me to said discussion?  The only thing I can
find is this:

http://marc.info/?l=linux-kernel&m=124173552516435&w=2

This set PHYSICAL_START at 16 MB and alignment at 2/4 MB.  Then, three
days later, this was committed:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ceefccc93932b920a8ec6f35f596db05202a12fe

This sets the alignment to 16 MB, with the only justification being that
relocatable kernels also need to start above 16 MB.

Thanks,
Dan


^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-24 23:34       ` Dan Rosenberg
@ 2011-05-24 23:36         ` H. Peter Anvin
  0 siblings, 0 replies; 95+ messages in thread
From: H. Peter Anvin @ 2011-05-24 23:36 UTC (permalink / raw)
  To: Dan Rosenberg
  Cc: Tony Luck, linux-kernel, davej, kees.cook, davem, eranian,
	torvalds, adobriyan, penberg, Arjan van de Ven, Andrew Morton,
	Valdis.Kletnieks, Ingo Molnar, pageexec

On 05/24/2011 04:34 PM, Dan Rosenberg wrote:
> On Tue, 2011-05-24 at 16:07 -0700, H. Peter Anvin wrote:
>> On 05/24/2011 04:04 PM, Dan Rosenberg wrote:
>>>
>>>> 2. Not introduce a performance regression (we avoid locating in the
>>>>    bottom 16 MiB for performance reasons, except on very small systems);
>>>
>>> I altered the boot code so that it uses CONFIG_PHYSICAL_START, which
>>> defaults to 16 MiB, as a lower bound on location.  So nothing will ever
>>> get loaded below there, and I still can take advantage of higher
>>> alignment granularity.  Are there other problems I'm not anticipating?
>>>
>>
>> Please look at the discussion as to what led us to do things this way.
>>
> 
> Would you be able to point me to said discussion?  The only thing I can
> find is this:
> 
> http://marc.info/?l=linux-kernel&m=124173552516435&w=2
> 
> This set PHYSICAL_START at 16 MB and alignment at 2/4 MB.  Then, three
> days later, this was committed:
> 
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ceefccc93932b920a8ec6f35f596db05202a12fe
> 
> This sets the alignment to 16 MB, with the only justification being that
> relocatable kernels also need to start above 16 MB.
> 

I think those patches came after the discussion was already over.  I'll
try to look for it.

	-hpa

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-24 23:08 ` Dan Rosenberg
@ 2011-05-25  2:05   ` Dan Rosenberg
  0 siblings, 0 replies; 95+ messages in thread
From: Dan Rosenberg @ 2011-05-25  2:05 UTC (permalink / raw)
  To: Tony Luck
  Cc: linux-kernel, davej, kees.cook, davem, eranian, torvalds,
	adobriyan, penberg, hpa, Arjan van de Ven, Andrew Morton,
	Valdis.Kletnieks, Ingo Molnar, pageexec

On Tue, 2011-05-24 at 19:08 -0400, Dan Rosenberg wrote:
> On Tue, 2011-05-24 at 16:31 -0400, Dan Rosenberg wrote:
> > This introduces CONFIG_RANDOMIZE_BASE, which randomizes the address at
> > which the kernel is decompressed at boot as a security feature that
> > deters exploit attempts relying on knowledge of the location of kernel
> > internals.  The default values of the kptr_restrict and dmesg_restrict
> > sysctls are set to (1) when this is enabled, since hiding kernel
> > pointers is necessary to preserve the secrecy of the randomized base
> > address.
> 
> > diff --git a/arch/x86/boot/compressed/head_32.S b/arch/x86/boot/compressed/head_32.S
> > index 67a655a..2680db0 100644
> > --- a/arch/x86/boot/compressed/head_32.S
> > +++ b/arch/x86/boot/compressed/head_32.S
> > @@ -69,12 +69,75 @@ ENTRY(startup_32)
> >   */
> >  
> >  #ifdef CONFIG_RELOCATABLE
> > +#ifdef CONFIG_RANDOMIZE_BASE
> > +
> > +	/* Standard check for cpuid */
> > +	pushfl
> > +	popl	%eax
> > +	movl	%eax, %ebx
> > +	xorl	$0x200000, %eax
> > +	pushl	%eax
> > +	popfl
> > +	pushfl
> > +	popl	%eax
> > +	cmpl	%eax, %ebx
> > +	jz	4f
> > +
> > +	/* Check for cpuid 1 */
> > +	movl	$0x0, %eax
> > +	cpuid
> > +	cmpl	$0x1, %eax
> > +	jb	4f
> > +
> > +	movl	$0x1, %eax
> > +	cpuid
> > +	xor	%eax, %eax
> > +
> > +	/* RDRAND is bit 30 */
> > +	testl	$0x40000000, %ecx
> > +	jnz	1f
> > +
> > +	/* RDTSC is bit 4 */
> > +	testl	$0x10, %edx
> > +	jnz	3f
> > +
> > +	/* Nothing is supported */
> > +	jmp	4f
> > +1:
> > +	/* RDRAND sets carry bit on success, otherwise we should try
> > +	 * again. */
> > +	movl	$0x10, %ecx
> > +2:
> > +	/* rdrand %eax */
> > +	.byte	0x0f, 0xc7, 0xf0
> > +	jc	4f
> > +	loop	2b
> > +
> > +	/* Fall through: if RDRAND is supported but fails, use RDTSC,
> > +	 * which is guaranteed to be supported. */
> > +3:
> > +	rdtsc
> > +	shll	$0xc, %eax
> > +4:
> > +	/* Maximum offset at 64mb to be safe */
> > +	andl	$0x3ffffff, %eax
> > +	movl	%ebp, %ebx
> > +	addl	%eax, %ebx
> > +#else
> >  	movl	%ebp, %ebx
> > +#endif
> >  	movl	BP_kernel_alignment(%esi), %eax
> >  	decl	%eax
> >  	addl    %eax, %ebx
> >  	notl	%eax
> >  	andl    %eax, %ebx
> > +
> > +	/* LOAD_PHYSICAL_ADDR is the minimum safe address we can
> > +	 * decompress at. */
> > +	cmpl	$LOAD_PHYSICAL_ADDR, %ebx
> > +	jae	1f
> > +	movl	$LOAD_PHYSICAL_ADDR, %ebx
> > +1:
> >  #else
> >  	movl	$LOAD_PHYSICAL_ADDR, %ebx
> >  #endif
> > diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
> > index 35af09d..6a05219 100644
> > --- a/arch/x86/boot/compressed/head_64.S
> > +++ b/arch/x86/boot/compressed/head_64.S
> > @@ -90,6 +90,13 @@ ENTRY(startup_32)
> >  	addl	%eax, %ebx
> >  	notl	%eax
> >  	andl	%eax, %ebx
> > +
> > +	/* LOAD_PHYSICAL_ADDR is the minimum safe address we can
> > +	 * decompress at. */
> > +	cmpl	$LOAD_PHYSICAL_ADDR, %ebx
> > +	jae	1f
> > +	movl	$LOAD_PHYSICAL_ADDR, %ebx
> > +1:
> >  #else
> >  	movl	$LOAD_PHYSICAL_ADDR, %ebx
> >  #endif
> > @@ -191,7 +198,7 @@ no_longmode:
> >  	 * it may change in the future.
> >  	 */
> >  	.code64
> > -	.org 0x200
> > +	.org 0x300
> >  ENTRY(startup_64)
> >  	/*
> >  	 * We come here either from startup_32 or directly from a
> > @@ -232,6 +239,13 @@ ENTRY(startup_64)
> >  	addq	%rax, %rbp
> >  	notq	%rax
> >  	andq	%rax, %rbp
> > +
> > +	/* LOAD_PHYSICAL_ADDR is the minimum safe address we can
> > +	 * decompress at. */
> > +	cmpq	$LOAD_PHYSICAL_ADDR, %rbp
> > +	jae	1f
> > +	movq	$LOAD_PHYSICAL_ADDR, %rbp
> > +1:
> >  #else
> >  	movq	$LOAD_PHYSICAL_ADDR, %rbp
> >  #endif
> 
> Thanks to Kees Cook for noticing that I didn't clear %eax before jumping
> to my "nothing supported" (4) label.  This would have just used the
> flags as "randomness", but it's still wrong and I'll fix it.  Next
> version will have a fallback of using the BIOS signature instead anyway.
> 

Also thanks to someone who prefers to remain nameless for pointing out
that this logic also results in the kernel being loaded at
LOAD_PHYSICAL_ADDR about one in four times (because it rounds up).  This
will be fixed as well.
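
One plausible way to see where the "one in four" figure comes from is the sketch below. It assumes a boot-loader base of 1 MiB, which is an assumption not stated in this thread, together with the patch's 64 MiB offset mask, 4 KiB offset granularity, 2 MiB alignment and 16 MiB minimum. The function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Apply the patch's sequence: add offset, align up, clamp to the minimum. */
static uint32_t place(uint32_t base, uint32_t offset, uint32_t align,
                      uint32_t min_addr)
{
    uint32_t addr;

    assert(align != 0 && (align & (align - 1)) == 0);
    addr = (base + offset + align - 1) & ~(align - 1);
    return addr < min_addr ? min_addr : addr;
}

/* Count how many 4 KiB-granular offsets below max_offset end up at min_addr. */
static unsigned count_at_minimum(uint32_t base, uint32_t max_offset,
                                 uint32_t align, uint32_t min_addr)
{
    unsigned hits = 0;

    for (uint32_t off = 0; off < max_offset; off += 4096)
        if (place(base, off, align, min_addr) == min_addr)
            hits++;
    return hits;
}
```

Under those assumptions, 3841 of the 16384 possible offsets (about 23%) collapse onto the 16 MiB minimum, because every offset that keeps the result at or below 16 MiB is clamped to the same address.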

-Dan


^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-24 23:00   ` Dan Rosenberg
@ 2011-05-25 11:23     ` Ingo Molnar
  2011-05-25 14:20       ` Dan Rosenberg
  0 siblings, 1 reply; 95+ messages in thread
From: Ingo Molnar @ 2011-05-25 11:23 UTC (permalink / raw)
  To: Dan Rosenberg
  Cc: Tony Luck, linux-kernel, davej, kees.cook, davem, eranian,
	torvalds, adobriyan, penberg, hpa, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, pageexec


* Dan Rosenberg <drosenberg@vsecurity.com> wrote:

> > No, the right solution is what i suggested a few mails ago: 
> > /proc/kallsyms (and other RIP printing places) should report the 
> > non-randomized RIP.
> > 
> > That way we do not have to change the kptr_restrict default and 
> > tools will continue to work ...
> 
> Ok, I'll do it this way, and leave the kptr_restrict default to 0.  
> But I still think having the dmesg_restrict default depend on 
> randomization makes sense, since kernel .text is explicitly 
> revealed in the syslog.

Hm, where is it revealed beyond initcall addresses, which ought to be
handled if they are printed via %pK?

All such information leaks need to be fixed. (This will be the 
slowest part of the process i suspect - there's many channels.)

in the syslog we obviously want any RIPs converted to the canonical 
'unrandomized' address, so that it can be matched against 
/proc/kallsyms, etc. Their randomized value isn't very useful. That
will also protect the randomization secret as a side effect.

The only thorny issue AFAICS are oopses. There's real value in having 
'raw' data from a crash (interpreting crashes is hard enough even 
without randomization!), OTOH we could keep most of the value of them 
by converting them back to canonical addresses.

This would be more or less easy to do for the RIP and the registers, 
but less obvious for the stack: a kernel pointer can lie on the stack 
at arbitrary alignment. On 64-bit we could probably detect them 
rather reliably based on the randomized prefix of kernel addresses:

[   32.946003] Stack:
[   32.946003]  0000000000000202 0000000000000002 0000000000000001 0000000000000000
[   32.946003]  0000000000000198 0000000000000002 0000000000000000 00000000002ca5b0
[   32.946003]  0000000000000000 ffff88003e5533e0 ffff88003f977c00 ffffffff802225e3

the ffffffff8 prefix (assuming we end up randomizing the address 
within the 2GB window available to a RIP-relative addressed kernel) 
would be easy to detect even if it's not word aligned. There *would* 
be false positives (a 32-bit value of -7 is common), but as long as 
we marked any unrandomization clearly with an asterisk:

[   32.946003] Stack:
[   32.946003]  0000000000000202 0000000000000002 0000000000000001 0000000000000000
[   32.946003]  0000000000000198 0000000000000002 0000000000000000 00000000002ca5b0
[   32.946003]  0000000000000000*ffff88003e5533e0*ffff88003f977c00*ffffffff802225e3

we'd be informed that the stack content was slightly different. If we
fixed up register values, say the raw value is:

[   32.946003] RDX: 0000000000000000 RSI: ffffffff80ce0100 RDI: 0000000000000000

and randomization is -0x100000 then we'd print the normalized value 
for 'RSI':

[   32.946003] RDX: 0000000000000000 RSI:*ffffffff80de0100 RDI: 0000000000000000

And the '*' tells us that this value got normalized.
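
The detection step can be sketched roughly as follows, assuming little-endian 64-bit values and treating anything in the top 2 GiB of the address space as a candidate kernel text pointer. The function names are illustrative, and the sketch only counts candidates; it does not rewrite or mark them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if v lies in the top 2 GiB window, i.e. at or above ffffffff80000000. */
static bool has_kernel_text_prefix(uint64_t v)
{
    return v >= UINT64_C(0xffffffff80000000);
}

/* Read a little-endian 64-bit value from an arbitrarily aligned address. */
static uint64_t load_le64(const uint8_t *p)
{
    uint64_t v = 0;

    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Scan every byte offset, since a pointer may sit at any alignment. */
static size_t count_candidates(const uint8_t *stack, size_t len)
{
    size_t hits = 0;

    assert(stack != NULL);
    for (size_t off = 0; off + 8 <= len; off++)
        if (has_kernel_text_prefix(load_le64(stack + off)))
            hits++;
    return hits;
}
```

Any 8-byte window that happens to end in a run of 0xff bytes also matches, which is the false-positive case that the marker is meant to flag.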

On 32-bit systems the rate of false positives is probably higher, the
'0xc0' byte pattern is pretty common.

Now, theoretically there's still a tiny information hole here: if an 
attacker can crash a kernel in a non-fatal way that puts some known 
data on the kernel stack, then the unrandomization will reveal the 
secret ...

I guess we'll have to live with that: really paranoid places will 
disable dmesg access to unprivileged users.

[ They might also want to have a knob to not log kernel crashes at 
  all - best protection is if *no one* (not even root) has a way to 
  figure out the secret. That needs to go hand in hand with forced 
  use of signed modules, sanitized /dev/mem, no root-controllable DMA 
  access to any device, no ioperm() and iopl(), etc. - so a very 
  locked down kernel that protects even root from being able to 
  execute kernel code. Such systems are still useful btw even if root 
  otherwise has access to all disks and has access to the kernel 
  image and can install its own image: a reboot will generally set 
  off an alarm. ]

> Thanks very much for the feedback.

Hey, thanks for taking up on implementing this rather non-trivial 
security feature!

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-24 23:06   ` H. Peter Anvin
@ 2011-05-25 14:03     ` Dan Rosenberg
  2011-05-25 14:14       ` Ingo Molnar
  2011-05-25 15:48       ` H. Peter Anvin
  0 siblings, 2 replies; 95+ messages in thread
From: Dan Rosenberg @ 2011-05-25 14:03 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Ingo Molnar, Tony Luck, linux-kernel, davej, kees.cook, davem,
	eranian, torvalds, adobriyan, penberg, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, pageexec

On Tue, 2011-05-24 at 16:06 -0700, H. Peter Anvin wrote:
> On 05/24/2011 02:16 PM, Ingo Molnar wrote:
> > 
> > On some systems holes can be pretty low as well - you'd have to research e820 
> > maps submitted to lkml to see how common this is - but it's not terribly 
> > common.
> > 
> > Some really old systems might have a hole between 15MB-16MB - but that's not an 
> > issue if we load at 16 MB or higher.
> > 
> 
> It definitely happens, and not just at 15-16 MiB either.
> 
> Doing this without actually consulting the memory map is dangerous as
> hell; plus you have to verify that you're not clobbering anything else,
> like the command line, initramfs or the linked list of data.
> 

My current idea is to use int 0x15, eax = 0xe801 (which seems to be
nearly universally supported) and use bx/dx to determine the amount of
contiguous, usable memory above 16 MB, which seems to be exactly what we
want to know.  If the BIOS does not support this function I'll be sure
to catch that and skip the randomization.  Likewise, if the amount of
returned memory seems insufficient or otherwise confusing, I'll skip the
randomization.
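
As a rough illustration of the arithmetic, the sketch below converts the E801 block count (BX/DX report memory above 16 MiB in 64 KiB units) to bytes and counts how many aligned start addresses would fit. The helper names are hypothetical, and image_size is assumed to already include whatever headroom the decompressor needs, which this thread has not settled.

```c
#include <assert.h>
#include <stdint.h>

/* E801 reports memory above 16 MiB as a count of 64 KiB blocks. */
static uint64_t e801_bytes_above_16m(uint16_t blocks_64k)
{
    return (uint64_t)blocks_64k * 64 * 1024;
}

/*
 * Number of aligned start addresses in [16 MiB, 16 MiB + avail) at which
 * an image of image_size bytes still fits.  Zero means: skip randomization.
 */
static uint64_t candidate_slots(uint64_t avail, uint64_t image_size,
                                uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    if (image_size == 0 || avail < image_size)
        return 0;
    return (avail - image_size) / align + 1;
}
```

For example, 1792 blocks is 112 MiB; with an 8 MiB footprint and 2 MiB alignment that leaves 53 candidate positions, while a 4 MiB region yields none.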

Given this information, do you have a conservative guess for how close
to the top of available memory we can put the kernel?  As in, let's say
we have an XYZ MB chunk of contiguous, free memory, how should I
calculate the highest, safe place to put the kernel in that region?

I'm going to continue to enforce the requirement that 16 MB is the
lowest address we can safely load the kernel, and I'd still appreciate
any information on why 2/4 MB default alignment might cause problems.

Thanks,
Dan

> 	-hpa



^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-25 14:03     ` Dan Rosenberg
@ 2011-05-25 14:14       ` Ingo Molnar
  2011-05-25 15:48       ` H. Peter Anvin
  1 sibling, 0 replies; 95+ messages in thread
From: Ingo Molnar @ 2011-05-25 14:14 UTC (permalink / raw)
  To: Dan Rosenberg
  Cc: H. Peter Anvin, Tony Luck, linux-kernel, davej, kees.cook, davem,
	eranian, torvalds, adobriyan, penberg, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, pageexec


* Dan Rosenberg <drosenberg@vsecurity.com> wrote:

> I'm going to continue to enforce the requirement that 16 MB is the 
> lowest address we can safely load the kernel, and I'd still 
> appreciate any information on why 2/4 MB default alignment might 
> cause problems.

The 16 MB limit is more about preserving 24-bit addressable memory 
than about safety: it is a useful resource to certain physical 
devices and we do not want to reduce that resource by ~12.5% by 
putting a ~2MB kernel image into it.

But yes, we want to load above 16 MB, if RAM size makes it possible.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-25 11:23     ` Ingo Molnar
@ 2011-05-25 14:20       ` Dan Rosenberg
  2011-05-25 14:29         ` Ingo Molnar
  0 siblings, 1 reply; 95+ messages in thread
From: Dan Rosenberg @ 2011-05-25 14:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Tony Luck, linux-kernel, davej, kees.cook, davem, eranian,
	torvalds, adobriyan, penberg, hpa, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, pageexec

On Wed, 2011-05-25 at 13:23 +0200, Ingo Molnar wrote:
> * Dan Rosenberg <drosenberg@vsecurity.com> wrote:
> 
> > > No, the right solution is what i suggested a few mails ago: 
> > > /proc/kallsyms (and other RIP printing places) should report the 
> > > non-randomized RIP.
> > > 
> > > That way we do not have to change the kptr_restrict default and 
> > > tools will continue to work ...
> > 
> > Ok, I'll do it this way, and leave the kptr_restrict default to 0.  
> > But I still think having the dmesg_restrict default depend on 
> > randomization makes sense, since kernel .text is explicitly 
> > revealed in the syslog.
> 
> Hm, where is it revealed beyond initcall addresses, which ought to be
> handled if they are printed via %pK?
> 
> All such information leaks need to be fixed. (This will be the 
> slowest part of the process i suspect - there's many channels.)
> 
> in the syslog we obviously want any RIPs converted to the canonical 
> 'unrandomized' address, so that it can be matched against 
> /proc/kallsyms, etc. Their randomized value isn't very useful. That
> will also protect the randomization secret as a side effect.
> 

%pK doesn't seem like the right thing to do in many cases, since the
capability check doesn't have proper meaning if the caller isn't in
process context.  If I'm understanding you right (correct me if I'm wrong),
you're looking for kptr_restrict to be completely separate from this
randomization, and when randomization is enabled, all pointers are
unconditionally de-randomized.  It seems like the right way to do this
is to include code in vsprintf.c for all %p-type specifiers that would
normally print the actual pointer (as opposed to some of the specialized
cases that print other data) that does something like this:

if((unsigned long)ptr >= (unsigned long)_stext &&
   (unsigned long)ptr <= (unsigned long)_end)
	ptr -= (_text - (CONFIG_PHYSICAL_START + PAGE_OFFSET));

This way, we don't have to go tracking down every printk caller and
convert them to %pK, which isn't usable anyway in some cases.
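
The same idea as a self-contained C sketch, using plain integers so the arithmetic is explicit. The function and parameter names are illustrative; text_start and text_end stand in for the randomized image bounds, and canonical_start for the address the image would have had without randomization.

```c
#include <assert.h>
#include <stdint.h>

/*
 * If ptr falls inside the (randomized) kernel image, shift it back by the
 * randomization delta; otherwise return it unchanged.
 */
static uint64_t unrandomize(uint64_t ptr, uint64_t text_start,
                            uint64_t text_end, uint64_t canonical_start)
{
    assert(text_start <= text_end);
    if (ptr >= text_start && ptr <= text_end)
        return ptr - (text_start - canonical_start);   /* unsigned wrap is fine */
    return ptr;
}
```

Because the subtraction is done in unsigned arithmetic, the same expression works whether the image was moved up or down relative to its canonical address.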

> The only thorny issue AFAICS are oopses. There's real value in having 
> 'raw' data from a crash (interpreting crashes is hard enough even 
> without randomization!), OTOH we could keep most of the value of them 
> by converting them back to canonical addresses.
> 
> This would be more or less easy to do for the RIP and the registers, 
> but less obvious for the stack: a kernel pointer can lie on the stack 
> at arbitrary alignment. On 64-bit we could probably detect them 
> rather reliably based on the randomized prefix of kernel addresses:
> 
> [   32.946003] Stack:
> [   32.946003]  0000000000000202 0000000000000002 0000000000000001 0000000000000000
> [   32.946003]  0000000000000198 0000000000000002 0000000000000000 00000000002ca5b0
> [   32.946003]  0000000000000000 ffff88003e5533e0 ffff88003f977c00 ffffffff802225e3
> 
> the ffffffff8 prefix (assuming we end up randomizing the address 
> within the 2GB window available to a RIP-relative addressed kernel) 
> would be easy to detect even if it's not word aligned. There *would* 
> be false positives (a 32-bit value of -7 is common), but as long as 
> we marked any unrandomization clearly with an asterisk:
> 
> [   32.946003] Stack:
> [   32.946003]  0000000000000202 0000000000000002 0000000000000001 0000000000000000
> [   32.946003]  0000000000000198 0000000000000002 0000000000000000 00000000002ca5b0
> [   32.946003]  0000000000000000*ffff88003e5533e0*ffff88003f977c00*ffffffff802225e3
> 
> we'd be informed that the stack content was slightly different. If we 
> fixed up register values, say the raw value is:
> 
> [   32.946003] RDX: 0000000000000000 RSI: ffffffff80ce0100 RDI: 0000000000000000
> 
> and randomization is -0x100000 then we'd print the normalized value 
> for 'RSI':
> 
> [   32.946003] RDX: 0000000000000000 RSI:*ffffffff80de0100 RDI: 0000000000000000
> 
> And the '*' tells us that this value got normalized.
> 
> On 32-bit systems the rate of false positives is probably higher; the
> '0xc0' byte pattern is pretty common.
> 
> Now, theoretically there's still a tiny information hole here: if an 
> attacker can crash a kernel in a non-fatal way that puts some known 
> data on the kernel stack, then the unrandomization will reveal the 
> secret ...
> 
> I guess we'll have to live with that: really paranoid places will 
> disable dmesg access to unprivileged users.

I'm tempted to just say "leave OOPS alone", and if you want to preserve
secrecy past an OOPS, you should be disabling dmesg access anyway.  But
I'll think more about this.
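
For reference while thinking about it, the normalization described in
the quoted text could be sketched roughly as follows.  This is a
user-space illustration only: KERNEL_TEXT_BASE, normalize_word() and
format_word() are hypothetical names, the check is limited to whole
64-bit words (the unaligned case mentioned above is not handled), and
the offset convention (runtime address = canonical address + offset)
is an assumption chosen to match the RSI example.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed start of the top 2 GB window, i.e. the ffffffff8 prefix. */
#define KERNEL_TEXT_BASE 0xffffffff80000000ULL

static int looks_like_kernel_text(uint64_t v)
{
	return v >= KERNEL_TEXT_BASE;
}

/*
 * Undo the randomization offset for values that look like kernel text
 * addresses.  *changed is set to 1 when the value was adjusted.
 */
static uint64_t normalize_word(uint64_t raw, int64_t rand_offset, int *changed)
{
	*changed = looks_like_kernel_text(raw);
	return *changed ? raw - (uint64_t)rand_offset : raw;
}

/* Format one word, prefixing normalized values with '*'. */
static int format_word(char *buf, size_t len, uint64_t raw, int64_t rand_offset)
{
	int changed;
	uint64_t v = normalize_word(raw, rand_offset, &changed);

	return snprintf(buf, len, "%c%016llx", changed ? '*' : ' ',
			(unsigned long long)v);
}
```

With an offset of -0x100000, the raw value ffffffff80ce0100 comes out
as *ffffffff80de0100, matching the register example, and small values
such as 0000000000000202 pass through unmarked.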

> 
> [ They might also want to have a knob to not log kernel crashes at 
>   all - best protection is if *no one* (not even root) has a way to 
>   figure out the secret. That needs to go hand in hand with forced 
>   use of signed modules, sanitized /dev/mem, no root-controllable DMA 
>   access to any device, no ioperm() and iopl(), etc. - so a very 
>   locked down kernel that protects even root from being able to 
>   execute kernel code. Such systems are still useful btw even if root 
>   otherwise has access to all disks and has access to the kernel 
>   image and can install its own image: a reboot will generally set 
>   off an alarm. ]
> 
> > Thanks very much for the feedback.
> 
> Hey, thanks for taking up on implementing this rather non-trivial 
> security feature!
> 

What can I say, I like a challenge. :)

-Dan

> Thanks,
> 
> 	Ingo




* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-25 14:20       ` Dan Rosenberg
@ 2011-05-25 14:29         ` Ingo Molnar
  0 siblings, 0 replies; 95+ messages in thread
From: Ingo Molnar @ 2011-05-25 14:29 UTC (permalink / raw)
  To: Dan Rosenberg
  Cc: Tony Luck, linux-kernel, davej, kees.cook, davem, eranian,
	torvalds, adobriyan, penberg, hpa, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, pageexec


* Dan Rosenberg <drosenberg@vsecurity.com> wrote:

> On Wed, 2011-05-25 at 13:23 +0200, Ingo Molnar wrote:
> > * Dan Rosenberg <drosenberg@vsecurity.com> wrote:
> > 
> > > > No, the right solution is what i suggested a few mails ago: 
> > > > /proc/kallsyms (and other RIP printing places) should report the 
> > > > non-randomized RIP.
> > > > 
> > > > That way we do not have to change the kptr_restrict default and 
> > > > tools will continue to work ...
> > > 
> > > Ok, I'll do it this way, and leave the kptr_restrict default to 0.  
> > > But I still think having the dmesg_restrict default depend on 
> > > randomization makes sense, since kernel .text is explicitly 
> > > revealed in the syslog.
> > 
> > Hm, where is it revealed beyond intcall addresses, which ought to be 
> > handled if they are printed via %pK?
> > 
> > All such information leaks need to be fixed. (This will be the 
> > slowest part of the process i suspect - there's many channels.)
> > 
> > in the syslog we obviously want any RIPs converted to the canonical 
> > 'unrandomized' address, so that it can be matched against 
> > /proc/kallsyms, etc. Their randomized value isnt very useful. That 
> > will also protect the randomization secret as a side effect.
> > 
> 
> %pK doesn't seem like the right thing to do in many cases, since 
> the capability check doesn't have proper meaning if the caller 
> isn't in process context. [...]

Oh, ok, i see what you mean.

I was not thinking of %pK as a way to restrict access really. I am 
thinking of it as a nicely central way to create constant RIPs out of 
random RIPs.

In that sense if %pK cannot be called everywhere please introduce a 
%pk variant that just prints a raw kernel address value and does no 
access check, just the unrandomization.

> [...]  If I'm understanding you right (correct me if I'm wrong), 
> you're looking for kptr_restrict to be completely separate from 
> this randomization, and when randomization is enabled, all pointers 
> are unconditionally de-randomized.  It seems like the right way to 
> do this is to include code in vsprintf.c for all %p-type specifiers 
> that would normally print the actual pointer (as opposed to some of 
> the specialized cases that print other data) that does something 
> like this:
> 
> if((unsigned long)ptr >= (unsigned long)_stext &&
>    (unsigned long)ptr <= (unsigned long)_end)
> 	ptr -= (_text - (CONFIG_PHYSICAL_START + PAGE_OFFSET));
> 
> This way, we don't have to go tracking down every printk caller and 
> convert them to %pK, which isn't usable anyway in some cases.

Yeah, but please also provide %pk to not have to hunt down every 
single place that might print a kernel address via a "%016Lx" or "%p" 
and thus leaks the randomization secret.

That way you can convert *every* known kernel-address-printing format 
string to one of the %p variants and thus have the above 
unrandomization step done automatically.

Perhaps as a debugging help also try to flag %p printouts that are 
suspiciously within kernel image boundaries. (Note: you dont want to 
printk from that place though, as you could already be executing 
within printk.) Maybe even %x/%X printouts that are in that range. As 
a debugging help, there could easily be false positives.

> I'm tempted to just say "leave OOPS alone", and if you want to preserve
> secrecy past an OOPS, you should be disabling dmesg access anyway.  But
> I'll think more about this.

It's definitely a good first-approximation answer. We only do perfect 
kernels anyway, so they wont oops.

Please convert RIPs in oops decoding nevertheless, so that it can be 
correlated with the symbol table...

Thanks,

	Ingo


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-25 14:03     ` Dan Rosenberg
  2011-05-25 14:14       ` Ingo Molnar
@ 2011-05-25 15:48       ` H. Peter Anvin
  2011-05-25 16:15         ` Dan Rosenberg
  1 sibling, 1 reply; 95+ messages in thread
From: H. Peter Anvin @ 2011-05-25 15:48 UTC (permalink / raw)
  To: Dan Rosenberg
  Cc: Ingo Molnar, Tony Luck, linux-kernel, davej, kees.cook, davem,
	eranian, torvalds, adobriyan, penberg, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, pageexec

On 05/25/2011 07:03 AM, Dan Rosenberg wrote:
> 
> My current idea is to use int 0x15, eax = 0xe801 (which seems to be
> nearly universally supported) and use bx/dx to determine the amount of
> contiguous, usable memory above 16 MB, which seems to be exactly what we
> want to know.  If the BIOS does not support this function I'll be sure
> to catch that and skip the randomization.  Likewise, if the amount of
> returned memory seems insufficient or otherwise confusing, I'll skip the
> randomization.
> 

No, sorry.  This has been wrong for over 10 years; there is no
substitute for the full (e820) memory map.  *Furthermore*, based on
where in the bootup sequence you are doing this, you also have to
consider any other memory structures that the kernel needs to be aware
of (initramfs, any chunks in the linked list, the command line, EFI
handover structures, etc.)  This is in fact an arbitrarily complex
operation... we have *finally* gotten the kernel to the point where (a)
the boot loader can actually do the right thing in all cases and (b) the
kernel will reserve or copy all the auxiliary memory chunks it needs at
a very early point.

Sorry, this cannot be short-circuited.

> Given this information, do you have a conservative guess for how close
> to the top of available memory we can put the kernel?  As in, let's say
> we have an XYZ MB chunk of contiguous, free memory, how should I
> calculate the highest, safe place to put the kernel in that region?
> 
> I'm going to continue to enforce the requirement that 16 MB is the
> lowest address we can safely load the kernel, and I'd still appreciate
> any information on why 2/4 MB default alignment might cause problems.

The problem with all of that was backwards compatibility with existing
relocating bootloaders.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-25 15:48       ` H. Peter Anvin
@ 2011-05-25 16:15         ` Dan Rosenberg
  2011-05-25 16:24           ` H. Peter Anvin
  0 siblings, 1 reply; 95+ messages in thread
From: Dan Rosenberg @ 2011-05-25 16:15 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Ingo Molnar, Tony Luck, linux-kernel, davej, kees.cook, davem,
	eranian, torvalds, adobriyan, penberg, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, pageexec

On Wed, 2011-05-25 at 08:48 -0700, H. Peter Anvin wrote:
> On 05/25/2011 07:03 AM, Dan Rosenberg wrote:
> > 
> > My current idea is to use int 0x15, eax = 0xe801 (which seems to be
> > nearly universally supported) and use bx/dx to determine the amount of
> > contiguous, usable memory above 16 MB, which seems to be exactly what we
> > want to know.  If the BIOS does not support this function I'll be sure
> > to catch that and skip the randomization.  Likewise, if the amount of
> > returned memory seems insufficient or otherwise confusing, I'll skip the
> > randomization.
> > 
> 
> No, sorry.  This has been wrong for over 10 years; there is no
> substitute for the full (e820) memory map.  *Furthermore*, based on
> where in the bootup sequence you are doing this, you also have to
> consider any other memory structures that the kernel needs to be aware
> of (initramfs, any chunks in the linked list, the command line, EFI
> handover structures, etc.)  This is in fact an arbitrarily complex
> operation... we have *finally* gotten the kernel to the point where (a)
> the boot loader can actually do the right thing in all cases and (b) the
> kernel will reserve or copy all the auxiliary memory chunks it needs at
> a very early point.
> 
> Sorry, this cannot be short-circuited.
> 

Ok, checking the e820 memory map seems like the way to go then.  As a
first attempt, I'd assume that if I find a contiguous free chunk that
begins before (or at) 16 MB and continues beyond 16 MB, then that
represents space where it's safe to load the kernel (up to a certain
point before the end of that chunk), assuming the chunk has enough space
and I do some degree of checking that I'm not decompressing on top of
something else (I'll start to gather a list of what to watch out for).
Is this a fair assumption?

> > Given this information, do you have a conservative guess for how close
> > to the top of available memory we can put the kernel?  As in, let's say
> > we have an XYZ MB chunk of contiguous, free memory, how should I
> > calculate the highest, safe place to put the kernel in that region?
> > 
> > I'm going to continue to enforce the requirement that 16 MB is the
> > lowest address we can safely load the kernel, and I'd still appreciate
> > any information on why 2/4 MB default alignment might cause problems.
> 
> The problem with all of that was backwards compatibility with existing
> relocating bootloaders.
> 

Do you have any alternatives that allow maintaining compatibility while
giving us finer-grained alignment?  It seems it should be possible,
since alignment was lower than 16 MB for years before this change was
introduced...

Thanks,
Dan

> 	-hpa
> 
> -- 
> H. Peter Anvin, Intel Open Source Technology Center
> I work for Intel.  I don't speak on their behalf.




* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-25 16:15         ` Dan Rosenberg
@ 2011-05-25 16:24           ` H. Peter Anvin
  0 siblings, 0 replies; 95+ messages in thread
From: H. Peter Anvin @ 2011-05-25 16:24 UTC (permalink / raw)
  To: Dan Rosenberg
  Cc: Ingo Molnar, Tony Luck, linux-kernel, davej, kees.cook, davem,
	eranian, torvalds, adobriyan, penberg, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, pageexec

On 05/25/2011 09:15 AM, Dan Rosenberg wrote:
> 
> Ok, checking the e820 memory map seems like the way to go then.  As a
> first attempt, I'd assume that if I find a contiguous free chunk that
> begins before (or at) 16 MB and continues beyond 16 MB, then that
> represents space where it's safe to load the kernel (up to a certain
> point before the end of that chunk), assuming the chunk has enough space
> and I do some degree of checking that I'm not decompressing on top of
> something else (I'll start to gather a list of what to watch out for).
> Is this a fair assumption?
> 

There is already code that calculates exactly how much space is needed,
so that part is good -- you should have a tight bound available to you.

The important and messy part, though, is that you get the "raw" e820 map
at that point (including not even having had the e801 and 88 fallback
information merged into it.)  This information has to be sanitized (to
deal with overlaps and broken-up chunks) and reserved areas merged in.
This is done in the kernel proper, and bootloaders have some equivalent
code, but you don't have it in that particular boot stage.
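
To give a feel for what "sanitized" has to mean here, the following is
a deliberately simplified sketch of a safety check over a raw map with
overlapping and broken-up entries.  It is not the kernel's own
sanitizing code: struct map_entry, the TYPE_* names and
range_is_safe() are hypothetical, address overflow is ignored, and the
other structures mentioned earlier (initramfs, command line, and so
on) would still have to be merged in as non-free entries.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Type values follow the usual E820 convention: 1 = usable, 2 = reserved. */
enum { TYPE_RAM = 1, TYPE_RESERVED = 2 };

struct map_entry {
	uint64_t addr;
	uint64_t size;
	uint32_t type;
};

/*
 * Return true if [start, start + size) is fully covered by usable RAM
 * entries and is not touched by any non-RAM entry.  Where entries
 * conflict, the non-free type wins.
 */
static bool range_is_safe(const struct map_entry *map, size_t n,
			  uint64_t start, uint64_t size)
{
	uint64_t end = start + size;
	uint64_t cursor = start;
	size_t i;

	/* Any overlapping non-RAM entry vetoes the whole range. */
	for (i = 0; i < n; i++) {
		uint64_t e_end = map[i].addr + map[i].size;

		if (map[i].type != TYPE_RAM &&
		    map[i].addr < end && e_end > start)
			return false;
	}

	/* Walk forward through RAM entries, which may be split or overlap. */
	while (cursor < end) {
		uint64_t best = cursor;

		for (i = 0; i < n; i++) {
			uint64_t e_end = map[i].addr + map[i].size;

			if (map[i].type == TYPE_RAM &&
			    map[i].addr <= cursor && e_end > best)
				best = e_end;
		}
		if (best == cursor)
			return false;	/* hole in the map */
		cursor = best;
	}
	return true;
}
```

A candidate load address would then be accepted only if
range_is_safe() holds for the full space the decompressed kernel
needs.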

> 
> Do you have any alternatives that allow maintaining compatibility while
> giving us finer-grained alignment?  It seems it should be possible,
> since alignment was lower than 16 MB for years before this change was
> introduced...
> 

Basically, you end up having to have a "real alignment" that is internal
to the kernel.  We already expose a "minimum alignment" field in the
header (the legacy field is now "recommended alignment"); however, the
"minimum alignment" is really too aggressive.

Since this can be buried in the kernel itself the key is to not change
the existing header fields.

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.



* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-24 20:31 [RFC][PATCH] Randomize kernel base address on boot Dan Rosenberg
                   ` (4 preceding siblings ...)
  2011-05-24 23:08 ` Dan Rosenberg
@ 2011-05-26 20:01 ` Vivek Goyal
  2011-05-26 20:06   ` Dan Rosenberg
  2011-05-26 20:16   ` Valdis.Kletnieks
  2011-05-26 20:35 ` Vivek Goyal
                   ` (2 subsequent siblings)
  8 siblings, 2 replies; 95+ messages in thread
From: Vivek Goyal @ 2011-05-26 20:01 UTC (permalink / raw)
  To: Dan Rosenberg
  Cc: Dan Rosenberg, Tony Luck, linux-kernel, davej, kees.cook, davem,
	eranian, torvalds, adobriyan, penberg, hpa, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, Ingo Molnar, pageexec

On Tue, May 24, 2011 at 04:31:45PM -0400, Dan Rosenberg wrote:

[..]
>  ==============================================================
>  
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 880fcb6..999ea82 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -1548,8 +1548,8 @@ config PHYSICAL_START
>  	  If kernel is a not relocatable (CONFIG_RELOCATABLE=n) then
>  	  bzImage will decompress itself to above physical address and
>  	  run from there. Otherwise, bzImage will run from the address where
> -	  it has been loaded by the boot loader and will ignore above physical
> -	  address.
> +	  it has been loaded by the boot loader, using the above physical
> +	  address as a lower bound.
>  
>  	  In normal kdump cases one does not have to set/change this option
>  	  as now bzImage can be compiled as a completely relocatable image
> @@ -1595,7 +1595,31 @@ config RELOCATABLE
>  
>  	  Note: If CONFIG_RELOCATABLE=y, then the kernel runs from the address
>  	  it has been loaded at and the compile time physical address
> -	  (CONFIG_PHYSICAL_START) is ignored.
> +	  (CONFIG_PHYSICAL_START) is solely used as a lower bound.
> +

This does not sound too good: it overloads the definition of PHYSICAL_START
with a minimum address. The very definition of a relocatable kernel is that
it should be able to run from the physical address it has been loaded
at (subject to alignment constraints).

So I don't think overloading the CONFIG_PHYSICAL_START definition is a good
idea. In fact there is no reason why kdump kernels should not run
and boot below 16MB, so preventing those kernels from loading and running
below 16MB does not sound like a good option to me.

Also, randomizing the kernel load address at run time will probably cause
some issues with the crashkernel=X@Y syntax. So far the user knew which
address the first kernel boots from and could specify where to
reserve memory. Now it might happen that the user asks to reserve some
memory and the kernel decides to occupy that space, resulting in a failed
memory reservation for the crash kernel.

Thanks
Vivek


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-26 20:01 ` Vivek Goyal
@ 2011-05-26 20:06   ` Dan Rosenberg
  2011-05-26 20:16   ` Valdis.Kletnieks
  1 sibling, 0 replies; 95+ messages in thread
From: Dan Rosenberg @ 2011-05-26 20:06 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Tony Luck, linux-kernel, davej, kees.cook, davem, eranian,
	torvalds, adobriyan, penberg, hpa, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, Ingo Molnar, pageexec

On Thu, 2011-05-26 at 16:01 -0400, Vivek Goyal wrote:
> On Tue, May 24, 2011 at 04:31:45PM -0400, Dan Rosenberg wrote:
> 
> [..]
> >  ==============================================================
> >  
> > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > index 880fcb6..999ea82 100644
> > --- a/arch/x86/Kconfig
> > +++ b/arch/x86/Kconfig
> > @@ -1548,8 +1548,8 @@ config PHYSICAL_START
> >  	  If kernel is a not relocatable (CONFIG_RELOCATABLE=n) then
> >  	  bzImage will decompress itself to above physical address and
> >  	  run from there. Otherwise, bzImage will run from the address where
> > -	  it has been loaded by the boot loader and will ignore above physical
> > -	  address.
> > +	  it has been loaded by the boot loader, using the above physical
> > +	  address as a lower bound.
> >  
> >  	  In normal kdump cases one does not have to set/change this option
> >  	  as now bzImage can be compiled as a completely relocatable image
> > @@ -1595,7 +1595,31 @@ config RELOCATABLE
> >  
> >  	  Note: If CONFIG_RELOCATABLE=y, then the kernel runs from the address
> >  	  it has been loaded at and the compile time physical address
> > -	  (CONFIG_PHYSICAL_START) is ignored.
> > +	  (CONFIG_PHYSICAL_START) is solely used as a lower bound.
> > +
> 
> This does not sound too good: it overloads the definition of PHYSICAL_START
> with a minimum address. The very definition of a relocatable kernel is that
> it should be able to run from the physical address it has been loaded
> at (subject to alignment constraints).
> 
> So I don't think overloading the CONFIG_PHYSICAL_START definition is a good
> idea. In fact there is no reason why kdump kernels should not run
> and boot below 16MB, so preventing those kernels from loading and running
> below 16MB does not sound like a good option to me.
> 

I'm going to revisit this part of the patch and think of a better way to
do this.

> Also, randomizing the kernel load address at run time will probably cause
> some issues with the crashkernel=X@Y syntax. So far the user knew which
> address the first kernel boots from and could specify where to
> reserve memory. Now it might happen that the user asks to reserve some
> memory and the kernel decides to occupy that space, resulting in a failed
> memory reservation for the crash kernel.
> 

Ok, added to the list of things to figure out.  Thanks.

> Thanks
> Vivek




* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-26 20:01 ` Vivek Goyal
  2011-05-26 20:06   ` Dan Rosenberg
@ 2011-05-26 20:16   ` Valdis.Kletnieks
  2011-05-26 20:31     ` Vivek Goyal
  1 sibling, 1 reply; 95+ messages in thread
From: Valdis.Kletnieks @ 2011-05-26 20:16 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Dan Rosenberg, Tony Luck, linux-kernel, davej, kees.cook, davem,
	eranian, torvalds, adobriyan, penberg, hpa, Arjan van de Ven,
	Andrew Morton, Ingo Molnar, pageexec


On Thu, 26 May 2011 16:01:21 EDT, Vivek Goyal said:

> Also, randomizing the kernel load address at run time will probably cause
> some issues with the crashkernel=X@Y syntax. So far the user knew which
> address the first kernel boots from and could specify where to
> reserve memory. Now it might happen that the user asks to reserve some
> memory and the kernel decides to occupy that space, resulting in a failed
> memory reservation for the crash kernel.

That is however fixable - the randomizer just needs to make sure it doesn't
overlay the crashkernel= space, and the crashkernel needs to be started with a
'norandomize' parameter.  If your threat model includes attacks on the
crashkernel that randomizing will help with, you got bigger problems. ;)
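
As a rough sketch of the first half of that (keeping the randomizer
out of the crashkernel= region), something along these lines would do.
All names here are hypothetical, only the simple crashkernel=X@Y form
is handled, and the suffix parsing is a user-space stand-in rather
than the kernel's own command line helpers.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Parse a number with an optional K/M/G suffix; *end is set past it. */
static uint64_t parse_size(const char *s, const char **end)
{
	char *p;
	uint64_t v = strtoull(s, &p, 0);

	switch (*p) {
	case 'G': case 'g':
		v <<= 10;
		/* fall through */
	case 'M': case 'm':
		v <<= 10;
		/* fall through */
	case 'K': case 'k':
		v <<= 10;
		p++;
		break;
	default:
		break;
	}
	*end = p;
	return v;
}

/* Extract size and base from "crashkernel=X@Y"; false if not present. */
static bool parse_crashkernel(const char *cmdline, uint64_t *size,
			      uint64_t *base)
{
	const char *key = "crashkernel=";
	const char *p = strstr(cmdline, key);

	if (!p)
		return false;
	p += strlen(key);
	*size = parse_size(p, &p);
	if (*size == 0 || *p != '@')
		return false;	/* no explicit base given */
	*base = parse_size(p + 1, &p);
	return true;
}

/* True if the candidate kernel range intersects the reservation. */
static bool overlaps_reservation(uint64_t kstart, uint64_t ksize,
				 uint64_t rbase, uint64_t rsize)
{
	return kstart < rbase + rsize && rbase < kstart + ksize;
}
```

The randomizer would reject (or re-draw) any candidate base for which
overlaps_reservation() returns true.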




* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-26 20:16   ` Valdis.Kletnieks
@ 2011-05-26 20:31     ` Vivek Goyal
  2011-05-27  9:36       ` Ingo Molnar
  0 siblings, 1 reply; 95+ messages in thread
From: Vivek Goyal @ 2011-05-26 20:31 UTC (permalink / raw)
  To: Valdis.Kletnieks
  Cc: Dan Rosenberg, Tony Luck, linux-kernel, davej, kees.cook, davem,
	eranian, torvalds, adobriyan, penberg, hpa, Arjan van de Ven,
	Andrew Morton, Ingo Molnar, pageexec

On Thu, May 26, 2011 at 04:16:05PM -0400, Valdis.Kletnieks@vt.edu wrote:
> On Thu, 26 May 2011 16:01:21 EDT, Vivek Goyal said:
> 
> > Also, randomizing the kernel load address at run time will probably cause
> > some issues with the crashkernel=X@Y syntax. So far the user knew which
> > address the first kernel boots from and could specify where to
> > reserve memory. Now it might happen that the user asks to reserve some
> > memory and the kernel decides to occupy that space, resulting in a failed
> > memory reservation for the crash kernel.
> 
> That is however fixable - the randomizer just needs to make sure it doesn't
> overlay the crashkernel= space, and the crashkernel needs to be started with a
> 'norandomize' parameter.

That can be done, but at the same time, if the kernel does not find any
suitable range to boot from, it should override the crashkernel=X@Y setting
and fail the crash memory reservation.

I guess with randomization a more suitable crash kernel command line
would be crashkernel=X, where the kernel decides the base address for the
second kernel depending on availability.

> If your threat model includes attacks on the
> crashkernel that randomizing will help with, you got bigger problems. ;)
> 

:-) I think norandomize for kdump kernel should be just fine.

Thanks
Vivek


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-24 20:31 [RFC][PATCH] Randomize kernel base address on boot Dan Rosenberg
                   ` (5 preceding siblings ...)
  2011-05-26 20:01 ` Vivek Goyal
@ 2011-05-26 20:35 ` Vivek Goyal
  2011-05-26 20:40   ` Vivek Goyal
  2011-05-26 20:39 ` Dan Rosenberg
  2011-05-26 22:18 ` Rafael J. Wysocki
  8 siblings, 1 reply; 95+ messages in thread
From: Vivek Goyal @ 2011-05-26 20:35 UTC (permalink / raw)
  To: Dan Rosenberg
  Cc: Dan Rosenberg, Tony Luck, linux-kernel, davej, kees.cook, davem,
	eranian, torvalds, adobriyan, penberg, hpa, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, Ingo Molnar, pageexec

On Tue, May 24, 2011 at 04:31:45PM -0400, Dan Rosenberg wrote:
> This introduces CONFIG_RANDOMIZE_BASE, which randomizes the address at
> which the kernel is decompressed at boot as a security feature that
> deters exploit attempts relying on knowledge of the location of kernel
> internals.  The default values of the kptr_restrict and dmesg_restrict
> sysctls are set to (1) when this is enabled, since hiding kernel
> pointers is necessary to preserve the secrecy of the randomized base
> address.

What happens to the /proc/iomem interface, which gives us the physical memory
location where the kernel is loaded? kexec-tools relies on that interface
heavily, so we cannot take it away. And if we cannot take it away, then
I think somebody should easily be able to calculate this randomized
base address.
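
To illustrate how little work that takes, here is a small sketch that
pulls the physical range out of a single /proc/iomem line of the form
"01000000-015fffff : Kernel code".  The function name is hypothetical
and the sample addresses are made up; a real tool would read the file
line by line and feed each line to it.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one /proc/iomem line.  Returns true and fills *start and *end
 * only if the line describes the "Kernel code" resource.
 */
static bool parse_kernel_code_line(const char *line,
				   unsigned long long *start,
				   unsigned long long *end)
{
	char name[64];

	if (sscanf(line, "%llx-%llx : %63[^\n]", start, end, name) != 3)
		return false;
	return strcmp(name, "Kernel code") == 0;
}
```

Once the start of that range is known, the physical load address is
no longer a secret.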

Thanks
Vivek


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-24 20:31 [RFC][PATCH] Randomize kernel base address on boot Dan Rosenberg
                   ` (6 preceding siblings ...)
  2011-05-26 20:35 ` Vivek Goyal
@ 2011-05-26 20:39 ` Dan Rosenberg
  2011-05-27  7:15   ` Ingo Molnar
  2011-05-31 16:52   ` Matthew Garrett
  2011-05-26 22:18 ` Rafael J. Wysocki
  8 siblings, 2 replies; 95+ messages in thread
From: Dan Rosenberg @ 2011-05-26 20:39 UTC (permalink / raw)
  To: Tony Luck, linux-kernel, kees.cook, davej, torvalds, adobriyan,
	eranian, penberg, davem, Arjan van de Ven, hpa, Valdis.Kletnieks,
	Andrew Morton, pageexec, Ingo Molnar, Vivek Goyal

On Tue, 2011-05-24 at 16:31 -0400, Dan Rosenberg wrote:
> This introduces CONFIG_RANDOMIZE_BASE, which randomizes the address at
> which the kernel is decompressed at boot as a security feature that
> deters exploit attempts relying on knowledge of the location of kernel
> internals.  The default values of the kptr_restrict and dmesg_restrict
> sysctls are set to (1) when this is enabled, since hiding kernel
> pointers is necessary to preserve the secrecy of the randomized base
> address.
> 
> This feature also uses a fixed mapping to move the IDT (if not already
> done as a fix for the F00F bug), to avoid exposing the location of
> kernel internals relative to the original IDT.  This has the additional
> security benefit of marking the new virtual address of the IDT
> read-only.
> 
> Entropy is generated using the RDRAND instruction if it is supported. If
> not, then RDTSC is used, if supported. If neither RDRAND nor RDTSC are
> supported, then no randomness is introduced. Support for the CPUID
> instruction is required to check for the availability of these two
> instructions.
> 
> Thanks to everyone who contributed helpful suggestions and feedback so
> far.
> 

I wanted to send out an update email that consolidated the feedback and
suggestions I've received so far:

1. I've nearly finished a first draft of code to parse the BIOS E820
memory map to determine where it's safe to place the randomized kernel.
This code accounts for overlapping regions, as well as potential
conflicts in region types (free vs. reserved, etc.), in favor of
non-free types.  The end result is, I'll have a reasonable upper bound.

2. I'll parse the kernel command line for crashkernel arguments and
avoid placing a randomized kernel in any regions marked as reserved.  A
new command line argument for kdump might be a good idea as well
(discussion on-going).

3. I'll be introducing a new format specifier (perhaps %pk) that
unconditionally de-randomizes kernel pointers, and switch callers where
appropriate.

4. The perf call chains that rely on kernel pointers will account for
the randomization.

5. I'll be switching to per-cpu IDTs, basing my work on the following
patch:

http://marc.info/?l=linux-kernel&m=112767117501231&w=2

Any review or comments on the above patch would be helpful.  I'm
considering submitting this portion separately, as it may provide
performance and scalability benefits regardless of randomization.

6. As per H. Peter Anvin's suggestion, it seems there's some demand for
a way to opt-out at the boot-loader level, possibly via a command-line
option and boot protocol flag.

7. Still need to figure out exactly what's ok and what's not regarding
altering alignment and PHYSICAL_START.  It seems there's some consensus
on "don't do it", but perhaps it's ok to partially ignore the alignment
config at runtime in favor of hard-coded, known-safe, finer-grained
alternatives.

8. Other pieces of feedback, such as comment suggestions, changes to
kptr_restrict/dmesg_restrict defaults, etc. have been incorporated.

9. x86-64 will present its own set of challenges.  One thing at a time.

Thanks for all the feedback and guidance so far.  Let me know if
anything above is objectionable, or if you have any more suggestions.
There's lots to do, but I haven't given up yet. :)

Regards,
Dan



* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-26 20:35 ` Vivek Goyal
@ 2011-05-26 20:40   ` Vivek Goyal
  2011-05-26 20:44     ` Dan Rosenberg
  0 siblings, 1 reply; 95+ messages in thread
From: Vivek Goyal @ 2011-05-26 20:40 UTC (permalink / raw)
  To: Dan Rosenberg
  Cc: Dan Rosenberg, Tony Luck, linux-kernel, davej, kees.cook, davem,
	eranian, torvalds, adobriyan, penberg, hpa, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, Ingo Molnar, pageexec

On Thu, May 26, 2011 at 04:35:02PM -0400, Vivek Goyal wrote:
> On Tue, May 24, 2011 at 04:31:45PM -0400, Dan Rosenberg wrote:
> > This introduces CONFIG_RANDOMIZE_BASE, which randomizes the address at
> > which the kernel is decompressed at boot as a security feature that
> > deters exploit attempts relying on knowledge of the location of kernel
> > internals.  The default values of the kptr_restrict and dmesg_restrict
> > sysctls are set to (1) when this is enabled, since hiding kernel
> > pointers is necessary to preserve the secrecy of the randomized base
> > address.
> 
> What happens to the /proc/iomem interface, which gives us the physical memory
> location where the kernel is loaded? kexec-tools relies on that interface
> heavily, so we cannot take it away. And if we cannot take it away, then
> I think somebody should easily be able to calculate this randomized
> base address.

Resending this mail: in the last message I got Dan's email address
wrong and the mail bounced. Sorry about that.

Thanks
Vivek


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-26 20:40   ` Vivek Goyal
@ 2011-05-26 20:44     ` Dan Rosenberg
  2011-05-26 20:55       ` Vivek Goyal
  2011-05-27 13:13       ` Vivek Goyal
  0 siblings, 2 replies; 95+ messages in thread
From: Dan Rosenberg @ 2011-05-26 20:44 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Tony Luck, linux-kernel, davej, kees.cook, davem, eranian,
	torvalds, adobriyan, penberg, hpa, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, Ingo Molnar, pageexec

On Thu, 2011-05-26 at 16:40 -0400, Vivek Goyal wrote:
> On Thu, May 26, 2011 at 04:35:02PM -0400, Vivek Goyal wrote:
> > On Tue, May 24, 2011 at 04:31:45PM -0400, Dan Rosenberg wrote:
> > > This introduces CONFIG_RANDOMIZE_BASE, which randomizes the address at
> > > which the kernel is decompressed at boot as a security feature that
> > > deters exploit attempts relying on knowledge of the location of kernel
> > > internals.  The default values of the kptr_restrict and dmesg_restrict
> > > sysctls are set to (1) when this is enabled, since hiding kernel
> > > pointers is necessary to preserve the secrecy of the randomized base
> > > address.
> > 
> > What happens to /proc/iomem interface which gives us the physical memory
> > location where kernel is loaded. kexec-tools relies on that interface
> > heavily so we can not take it away. And if we can not take it away then
> > I think somebody should easily be able to calculate this randomized
> > base address.

Is it common to run kexec-tools as non-root?  It may be necessary to
restrict this interface to root when randomization is used (keep in mind
nobody's going to force you to turn this on by default, at least for the
foreseeable future).

-Dan

> 
> Thanks
> Vivek



^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-26 20:44     ` Dan Rosenberg
@ 2011-05-26 20:55       ` Vivek Goyal
  2011-05-27  9:38         ` Ingo Molnar
  2011-05-27 13:13       ` Vivek Goyal
  1 sibling, 1 reply; 95+ messages in thread
From: Vivek Goyal @ 2011-05-26 20:55 UTC (permalink / raw)
  To: Dan Rosenberg
  Cc: Tony Luck, linux-kernel, davej, kees.cook, davem, eranian,
	torvalds, adobriyan, penberg, hpa, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, Ingo Molnar, pageexec

On Thu, May 26, 2011 at 04:44:34PM -0400, Dan Rosenberg wrote:
> On Thu, 2011-05-26 at 16:40 -0400, Vivek Goyal wrote:
> > On Thu, May 26, 2011 at 04:35:02PM -0400, Vivek Goyal wrote:
> > > On Tue, May 24, 2011 at 04:31:45PM -0400, Dan Rosenberg wrote:
> > > > This introduces CONFIG_RANDOMIZE_BASE, which randomizes the address at
> > > > which the kernel is decompressed at boot as a security feature that
> > > > deters exploit attempts relying on knowledge of the location of kernel
> > > > internals.  The default values of the kptr_restrict and dmesg_restrict
> > > > sysctls are set to (1) when this is enabled, since hiding kernel
> > > > pointers is necessary to preserve the secrecy of the randomized base
> > > > address.
> > > 
> > > What happens to /proc/iomem interface which gives us the physical memory
> > > location where kernel is loaded. kexec-tools relies on that interface
> > > heavily so we can not take it away. And if we can not take it away then
> > > I think somebody should easily be able to calculate this randomized
> > > base address.
> 
> Is it common to run kexec-tools as non-root?  It may be necessary to
> restrict this interface to root when randomization is used (keep in mind
> nobody's going to force you to turn this on by default, at least for the
> foreseeable future).

kexec-tools runs as root. And I see that /proc/iomem permissions are
also for root only. So it probably is a non-issue.
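
For reference, a minimal sketch of how a tool could pull the kernel's
physical load range out of /proc/iomem. This is not taken from
kexec-tools; the helper names iomem_match() and iomem_find() are
made up for illustration, and the only assumption about the file is
the usual "start-end : label" line layout with a "Kernel code" entry.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one /proc/iomem-style line, e.g.
 *   "  01000000-0156ffff : Kernel code"
 * Returns 1 and fills start/end if the line's label equals `label`,
 * otherwise returns 0 and leaves start/end untouched.
 * (Hypothetical helper, not part of any existing tool.)
 */
int iomem_match(const char *line, const char *label,
                uint64_t *start, uint64_t *end)
{
    char name[64];
    uint64_t s, e;

    if (sscanf(line, " %" SCNx64 "-%" SCNx64 " : %63[^\n]",
               &s, &e, name) != 3)
        return 0;
    if (strcmp(name, label) != 0)
        return 0;
    *start = s;
    *end = e;
    return 1;
}

/*
 * Scan a file in /proc/iomem format for the first entry named `label`.
 * Returns 1 on success, 0 if the file cannot be opened or has no match.
 */
int iomem_find(const char *path, const char *label,
               uint64_t *start, uint64_t *end)
{
    char line[256];
    int found = 0;
    FILE *f = fopen(path, "r");

    if (!f)
        return 0;
    while (!found && fgets(line, sizeof(line), f))
        found = iomem_match(line, label, start, end);
    fclose(f);
    return found;
}
```

Calling iomem_find("/proc/iomem", "Kernel code", &start, &end) from a
process that is allowed to read the file would give the physical range
in question, which is why the permissions on that file matter here.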

Thanks
Vivek

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-24 20:31 [RFC][PATCH] Randomize kernel base address on boot Dan Rosenberg
                   ` (7 preceding siblings ...)
  2011-05-26 20:39 ` Dan Rosenberg
@ 2011-05-26 22:18 ` Rafael J. Wysocki
  2011-05-26 22:32   ` H. Peter Anvin
  2011-05-27 15:42   ` Linus Torvalds
  8 siblings, 2 replies; 95+ messages in thread
From: Rafael J. Wysocki @ 2011-05-26 22:18 UTC (permalink / raw)
  To: Dan Rosenberg
  Cc: Tony Luck, linux-kernel, davej, kees.cook, davem, eranian,
	torvalds, adobriyan, penberg, hpa, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, Ingo Molnar, pageexec

On Tuesday, May 24, 2011, Dan Rosenberg wrote:
> This introduces CONFIG_RANDOMIZE_BASE, which randomizes the address at
> which the kernel is decompressed at boot as a security feature that
> deters exploit attempts relying on knowledge of the location of kernel
> internals.  The default values of the kptr_restrict and dmesg_restrict
> sysctls are set to (1) when this is enabled, since hiding kernel
> pointers is necessary to preserve the secrecy of the randomized base
> address.
> 
> This feature also uses a fixed mapping to move the IDT (if not already
> done as a fix for the F00F bug), to avoid exposing the location of
> kernel internals relative to the original IDT.  This has the additional
> security benefit of marking the new virtual address of the IDT
> read-only.
> 
> Entropy is generated using the RDRAND instruction if it is supported. If
> not, then RDTSC is used, if supported. If neither RDRAND nor RDTSC are
> supported, then no randomness is introduced. Support for the CPUID
> instruction is required to check for the availability of these two
> instructions.
> 
> Thanks to everyone who contributed helpful suggestions and feedback so
> far.
> 
> Comments/Questions:
> 
> * Since RDRAND is relatively new, only the most recent version of
> binutils supports assembling it.  To avoid breaking builds for people
> who use older toolchains but want this feature, I hardcoded the opcodes.
> If anyone has a better approach, please let me know.
> 
> * I chose to mimic the F00F bugfix behavior for moving the IDT, since it
> required very little code and has the additional benefit of making the
> IDT read-only. Ingo Molnar's suggestion of allocating per-cpu IDTs
> instead is still on the table, and I'd like to get feedback on this.
> 
> * In order to increase the entropy for the randomized base, I changed
> the default value of CONFIG_PHYSICAL_ALIGN back to 2mb.  It had
> previously been raised to 16mb as a hack so that relocatable kernels
> wouldn't load below that minimum.  I address this by changing the
> meaning of CONFIG_PHYSICAL_START such that it now represents a minimum
> address that relocatable kernels can be loaded at (rather than being
> ignored by relocatable kernels).  So, if a relocatable kernel determines
> it should be loaded at an address below CONFIG_PHYSICAL_START (which
> defaults to 16mb), I just bump it up.
> 
> * I would appreciate guidance on safe values for the highest addresses
> we can safely load the kernel at, on both 32-bit and 64-bit. This
> version uses 64mb (0x4000000) for 32-bit, and worked well in testing.
> 
> * CONFIG_RANDOMIZE_BASE automatically sets the default value of
> kptr_restrict and dmesg_restrict to 1, since it's nonsensical to use
> this without the other two.  I considered removing
> CONFIG_SECURITY_DMESG_RESTRICT altogether (it currently sets the default
> value for dmesg_restrict), but just in case distros want to keep the
> CONFIG as a toggle switch but don't want to use CONFIG_RANDOMIZE_BASE, I
> kept it around.  So, now CONFIG_RANDOMIZE_BASE sets the default value
> for CONFIG_SECURITY_DMESG_RESTRICT.
> 
> * x86-64 is still "to-do". Because it calculates the kernel text address
> twice, this may be a little trickier.
> 
> * Finding a middle ground instead of the current "all-or-nothing"
> behavior of kptr_restrict that allows perf users to use this feature is
> future work.
> 
> * Tested by repeatedly booting and observing kallsyms output on both
> i386.  Passed the "looks random to me" test, and saw no bad behavior.
> Tested that changing CONFIG_PHYSICAL_ALIGN to 2mb still boots and runs
> fine on amd64.
> 
> * Is it worth bothering to look for alternate sources of entropy if
> RDTSC isn't available?
> 
> * Could use testing of CPU hotplugging and suspend/resume.

Well, as far as I can tell, this feature is going to break hibernation on
both x86_32 and x86_64 at the moment, unless you can guarantee that the
randomized kernel location will be the same for both the boot and the target
kernels.

It may be worked around on x86_64 relatively easily, I think, but other
architectures (including the 32-bit x86) would require much more intrusive
modifications to work with that feature.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-26 22:18 ` Rafael J. Wysocki
@ 2011-05-26 22:32   ` H. Peter Anvin
  2011-05-27  0:26     ` Dan Rosenberg
                       ` (2 more replies)
  2011-05-27 15:42   ` Linus Torvalds
  1 sibling, 3 replies; 95+ messages in thread
From: H. Peter Anvin @ 2011-05-26 22:32 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Dan Rosenberg, Tony Luck, linux-kernel, davej, kees.cook, davem,
	eranian, torvalds, adobriyan, penberg, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, Ingo Molnar, pageexec

On 05/26/2011 03:18 PM, Rafael J. Wysocki wrote:
> 
> Well, as far as I can tell, this feature is going to break hibernation on
> both x86_32 and x86_64 at the moment, unless you can guarantee that the
> randomized kernel location will be the same for both the boot and the target
> kernels.
> 

Obviously we can't and we don't.  I'm a bit surprised at that
constraint... how can that constraint not break things like kernels of
slightly different size?

	-hpa

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-26 22:32   ` H. Peter Anvin
@ 2011-05-27  0:26     ` Dan Rosenberg
  2011-05-27 16:21       ` Rafael J. Wysocki
  2011-05-27  2:45     ` Dave Jones
  2011-05-27 16:07     ` Rafael J. Wysocki
  2 siblings, 1 reply; 95+ messages in thread
From: Dan Rosenberg @ 2011-05-27  0:26 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Rafael J. Wysocki, Tony Luck, linux-kernel, davej, kees.cook,
	davem, eranian, torvalds, adobriyan, penberg, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, Ingo Molnar, pageexec

On Thu, 2011-05-26 at 15:32 -0700, H. Peter Anvin wrote:
> On 05/26/2011 03:18 PM, Rafael J. Wysocki wrote:
> > 
> > Well, as far as I can tell, this feature is going to break hibernation on
> > both x86_32 and x86_64 at the moment, unless you can guarantee that the
> > randomized kernel location will be the same for both the boot and the target
> > kernels.
> > 
> 
> Obviously we can't and we don't.  I'm a bit surprised at that
> constraint... how can that constraint not break things like kernels of
> slightly different size?
> 
> 	-hpa

Am I understanding it correctly that hibernation is currently operating
under a possibly false assumption?  If it's the case that hibernation
should be saving the physical address at which the kernel was previously
loaded and restoring it there regardless of randomization, it would
certainly help me out if someone familiar with the code could take a
stab at that.

Otherwise, any thoughts on a potential solution?

Thanks,
Dan


^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-26 22:32   ` H. Peter Anvin
  2011-05-27  0:26     ` Dan Rosenberg
@ 2011-05-27  2:45     ` Dave Jones
  2011-05-27  9:40       ` Ingo Molnar
  2011-05-27 16:07     ` Rafael J. Wysocki
  2 siblings, 1 reply; 95+ messages in thread
From: Dave Jones @ 2011-05-27  2:45 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Rafael J. Wysocki, Dan Rosenberg, Tony Luck, linux-kernel,
	kees.cook, davem, eranian, torvalds, adobriyan, penberg,
	Arjan van de Ven, Andrew Morton, Valdis.Kletnieks, Ingo Molnar,
	pageexec

On Thu, May 26, 2011 at 03:32:13PM -0700, H. Peter Anvin wrote:
 > On 05/26/2011 03:18 PM, Rafael J. Wysocki wrote:
 > > 
 > > Well, as far as I can tell, this feature is going to break hibernation on
 > > both x86_32 and x86_64 at the moment, unless you can guarantee that the
 > > randomized kernel location will be the same for both the boot and the target
 > > kernels.
 > > 
 > 
 > Obviously we can't and we don't.  I'm a bit surprised at that
 > constraint... how can that constraint not break things like kernels of
 > slightly different size?

In Fedora at least, we make sure the kernel you thaw from is the same one
you booted by diddling with grub to force the right kernel to be booted.
By default, you won't see a bootmenu, so it'll just dtrt.  You can still
interrupt the boot process, force a boot menu and pick another kernel
of course, and we used to at least have safeguards in place that would
refuse to thaw an image from a different kernel. (This may or may not
be still true since we rewrote the initramfs tools)

	Dave



^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-26 20:39 ` Dan Rosenberg
@ 2011-05-27  7:15   ` Ingo Molnar
  2011-05-31 16:52   ` Matthew Garrett
  1 sibling, 0 replies; 95+ messages in thread
From: Ingo Molnar @ 2011-05-27  7:15 UTC (permalink / raw)
  To: Dan Rosenberg
  Cc: Tony Luck, linux-kernel, kees.cook, davej, torvalds, adobriyan,
	eranian, penberg, davem, Arjan van de Ven, hpa, Valdis.Kletnieks,
	Andrew Morton, pageexec, Vivek Goyal


* Dan Rosenberg <drosenberg@vsecurity.com> wrote:

> 5. I'll be switching to per-cpu IDTs, basing my work on the 
> following patch:
> 
> http://marc.info/?l=linux-kernel&m=112767117501231&w=2
> 
> Any review or comments on the above patch would be helpful.  I'm 
> considering submitting this portion separately, as it may provide 
> performance and scalability benefits regardless of randomization.

Yeah.

Note that you do not have to do the MSI thing in Zwane's patch, nor 
do i think you need to touch the boot IDT, but instead go for the 
easiest route:

There are two main places that set up the IDT:

  trap_init();
  init_IRQ()

The IDT is fully set up at this point and i don't think we change it 
later on. So all the fancy changes to set_intr_gate() et al in 
Zwane's patch seem unnecessary to me.

Most of the complexity Zwane's patch has comes from the fact that he 
tries to use per CPU IDTs to create *asymmetric* IDTs between CPUs - 
but we do not want nor need to do that with your patch, which 
simplifies things enormously.

Note that both of the above init functions execute only on the boot 
CPU, well before SMP is initialized. So it is an easy environment to 
work in from an IDT switcheroo POV and we should be able to switch to 
the percpu IDT there without much fuss.

Note that setup_per_cpu_areas() is called well before trap_init(), so 
at the end of init_IRQ() you can rely on percpu facilities such as 
percpu_alloc() as well.

I'd suggest these rough steps to implement it:

 - turn off CONFIG_SMP in the .config

 - first add the new init function call to the end of arch/x86/'s 
   init_IRQ(), put the percpu_alloc() into that function, copy
   the old IDT into the new IDT (but do not load it!) and boot test 
   the patch.

   At this point you won't have any change to the IDT yet, but you
   have tested all the boot CPU init order assumptions: is 
   percpu_alloc() really available, did you do the copying right, 
   etc. You might want to print-dump the new IDT in hexdump format
   and check whether it looks like an IDT you'd like the CPU to load.

 - then add the one extra line that loads the new IDT into the CPU.

   If the kernel does not crash then you will have a randomized UP
   kernel that does not leak the randomization secret to user-space 
   via the SIDT instruction. Test this in user-space, marvel at the
   non-kernel-image address you get! :-)

 - turn on CONFIG_SMP=y and boot the kernel.

   The kernel should not crash: you will have the boot CPU with
   the percpu IDT, and all secondary CPUs with the bootup IDT
   still referenced. Check via your user-space SIDT test-code
   and:

             taskset -c 0 ./test-sidt
             taskset -c 1 ./test-sidt
             taskset -c 2 ./test-sidt

   that indeed CPU#0 has a different IDT address from all the other
   CPUs. Marvel at the incomplete but still fully working IDT setup! 
   :-)

 - Figure out where a new secondary CPU loads the boot IDT. Figure 
   out where it sets up its percpu area. Find the spot where both
   facilities are available already and add the percpu_alloc()+copy 
   routine to it. Do the hex printout and boot the kernel - do the 
   dumped IDTs look sane visually?

 - If they looked fine then add the one extra line that loads the
   new IDT into the secondary CPU(s). Boot and check the IDTs:

             taskset -c 0 ./test-sidt
             taskset -c 1 ./test-sidt
             taskset -c 2 ./test-sidt

   Now you should have different results on all different CPUs!
   Marvel at having completed the patch!

 - Please check whether the IDT has alignment requirements: we could 
   actually benefit from coloring the percpu IDTs a bit, as each 
   hyperthread (and core) has a separate IDT so we can spread out any 
   cache and RAM accesses a bit better amongst the cache/memory 
   ports.

 - Please check how fast SIDT is, how many cycles does it take? If 
   it's faster than CPUID then you have also created another nice 
   scalability feature: a user-space instruction that emits the 
   current CPU ID! [we could encode the CPU ID in the address - this 
   will also give us the cache coloring.]
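
The user-space SIDT test used in the steps above could look roughly
like the sketch below. It is only an illustration: the names
decode_idtr() and read_idtr() are made up here, and it assumes a
GCC-style compiler on x86. SIDT stores a 16-bit limit followed by the
base address (4 bytes on 32-bit, 8 bytes on 64-bit). On later CPUs and
kernels the instruction may be intercepted for user space, in which
case the value read back says nothing about the real IDT.

```c
#include <stddef.h>
#include <stdint.h>

/* Decoded contents of the IDT register. */
struct idtr {
    uint16_t limit;   /* size of the IDT in bytes, minus one */
    uint64_t base;    /* linear address of the IDT */
};

/*
 * Decode the little-endian buffer that SIDT writes: 2 bytes of limit
 * followed by `base_bytes` bytes of base address.
 */
struct idtr decode_idtr(const unsigned char *raw, size_t base_bytes)
{
    struct idtr r = { 0, 0 };
    size_t i;

    r.limit = (uint16_t)(raw[0] | (raw[1] << 8));
    for (i = 0; i < base_bytes; i++)
        r.base |= (uint64_t)raw[2 + i] << (8 * i);
    return r;
}

#if defined(__x86_64__)
#define IDT_BASE_BYTES 8
#elif defined(__i386__)
#define IDT_BASE_BYTES 4
#endif

#ifdef IDT_BASE_BYTES
/* Execute SIDT on the current CPU and decode the result (x86 only). */
struct idtr read_idtr(void)
{
    /* "+m" keeps the zero-initialisation alive for the bytes SIDT
     * does not write on 32-bit. */
    struct { unsigned char bytes[10]; } raw = { { 0 } };

    __asm__ volatile ("sidt %0" : "+m"(raw));
    return decode_idtr(raw.bytes, IDT_BASE_BYTES);
}
#endif
```

A test-sidt binary would simply print read_idtr().base in hex and be
run once per CPU under taskset -c N, comparing the printed address
against the kernel image range and against the other CPUs.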

Note that using the percpu area will also avoid the 4K mapping TLB 
problem Linus referred to: the percpu area is mapped in a 2MB data 
TLB.

What this stage won't allow yet is a read-only IDT. That should be yet 
another patch on top of this: the percpu IDT will already allow the 
protection of the kernel image randomization secret.

The read-only IDT will bring in the 4K TLB cost but maybe that's 
acceptable (because the security advantages of a read-only IDT are 
real). It will be a relatively easy patch on top of the percpu IDT 
patch: where you load the percpu IDT into the CPU with the LIDT 
instruction, you'd first fixmap it into a readonly page:

	__set_fixmap(FIX_IDT, __pa(percpu_idt_ptr), PAGE_KERNEL_RO);

And use __fix_to_virt(FIX_IDT) as the load_IDT() address.

If you do it as two patches on top of each other i'll try to figure 
out a way to measure the performance impact of the readonly IDT via 
perf. It won't be easy as the expected effect is very, very small.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-26 20:31     ` Vivek Goyal
@ 2011-05-27  9:36       ` Ingo Molnar
  0 siblings, 0 replies; 95+ messages in thread
From: Ingo Molnar @ 2011-05-27  9:36 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Valdis.Kletnieks, Dan Rosenberg, Tony Luck, linux-kernel, davej,
	kees.cook, davem, eranian, torvalds, adobriyan, penberg, hpa,
	Arjan van de Ven, Andrew Morton, pageexec


* Vivek Goyal <vgoyal@redhat.com> wrote:

> On Thu, May 26, 2011 at 04:16:05PM -0400, Valdis.Kletnieks@vt.edu wrote:
> > On Thu, 26 May 2011 16:01:21 EDT, Vivek Goyal said:
> > 
> > > Also randomization of kernel load address at run time will probably have
> > > some issues with crashkernel=X@Y address syntax. So far user knew what
> > > address first kernel is booting from and user could specify where to 
> > > reserve memory. Now it might happen that user specified some memory
> > > to reserve and kernel decided to occupy that space resulting in failed
> > > memory reservation for crash kernel.
> > 
> > That is however fixable - the randomizer just needs to make sure it doesn't
> > overlay the crashkernel= space, and the crashkernel needs to be started with a
> > 'norandomize' parameter.
> 
> That can be done but at the same time if kernel does not find any suitable
> range to boot from, it should override crashkernel=X@Y settings and fail
> crash memory reservation.
> 
> I guess with randomize space thing a more suitable crash kernel command
> line will be crashkernel=X where kernel decides the base address for
> second kernel depending on availability.
> 
> > If your threat model includes attacks on the
> > crashkernel that randomizing will help with, you got bigger problems. ;)
> > 
> 
> :-) I think norandomize for kdump kernel should be just fine.

Dan, please always generate a very clear printk when randomization is 
off - if we implement everything correctly then it will be impossible 
for even the admin to determine whether there's kernel image 
randomization going on on a system! :-)

Btw., systems with signed modules and with an inability for even root 
to break into the kernel probably want to disable the pagetable 
dumper in debugfs, that will show the exact location of the kernel 
image.

(Btw., please also check that unprivileged users cannot read that 
file.)
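
On the crashkernel=X@Y point quoted above, the "don't overlay the
reserved space" check can be sketched as below. This is not code from
the patch; ranges_overlap() and pick_base_avoiding() are hypothetical
names, and the linear probe to the next aligned slot is just one
possible policy (it skews the distribution towards the slot right
after the reserved region).

```c
#include <stdint.h>

#define MB(x) ((uint64_t)(x) << 20)

/* A physical range [start, start + size). */
struct range {
    uint64_t start;
    uint64_t size;
};

/* Two half-open ranges overlap iff each starts before the other ends. */
int ranges_overlap(struct range a, struct range b)
{
    return a.start < b.start + b.size && b.start < a.start + a.size;
}

/*
 * Pick an `align`-aligned base in [min_addr, max_addr) such that the
 * whole image fits below max_addr and does not overlap `reserved`.
 * `rnd` selects the first candidate slot; later slots are probed in
 * order, wrapping around. Assumes min_addr is already aligned.
 * Returns 1 and stores the base on success, 0 if no slot fits
 * (the caller then has to decide which request loses).
 */
int pick_base_avoiding(uint64_t rnd, uint64_t min_addr, uint64_t max_addr,
                       uint64_t align, uint64_t image_size,
                       struct range reserved, uint64_t *base_out)
{
    uint64_t slots, i;

    if (align == 0 || max_addr < min_addr + image_size)
        return 0;
    slots = (max_addr - min_addr - image_size) / align + 1;

    for (i = 0; i < slots; i++) {
        uint64_t slot = (rnd % slots + i) % slots;
        struct range img = { min_addr + slot * align, image_size };

        if (!ranges_overlap(img, reserved)) {
            *base_out = img.start;
            return 1;
        }
    }
    return 0;
}
```

With a 16mb minimum, a 64mb ceiling, 2mb alignment and an 8mb image
there are 21 candidate slots; a crashkernel reservation of 16mb at
32mb rules out the bases from 26mb through 46mb.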

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-26 20:55       ` Vivek Goyal
@ 2011-05-27  9:38         ` Ingo Molnar
  2011-05-27 13:07           ` Vivek Goyal
  0 siblings, 1 reply; 95+ messages in thread
From: Ingo Molnar @ 2011-05-27  9:38 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Dan Rosenberg, Tony Luck, linux-kernel, davej, kees.cook, davem,
	eranian, torvalds, adobriyan, penberg, hpa, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, pageexec


* Vivek Goyal <vgoyal@redhat.com> wrote:

> > Is it common to run kexec-tools as non-root?  It may be necessary 
> > to restrict this interface to root when randomization is used 
> > (keep in mind nobody's going to force you to turn this on by 
> > default, at least for the foreseeable future).
> 
> kexec-tools runs as root. And I see that /proc/iomem permissions 
> are also for root only. So it probably is a non-issue.

it might be an issue to keep in mind for later projects that try to 
lock down root itself from being able to patch the kernel (other than 
rebooting the box), using signed modules, disabled direct-ioport 
access, and other hardened facilities.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27  2:45     ` Dave Jones
@ 2011-05-27  9:40       ` Ingo Molnar
  2011-05-27 16:11         ` Rafael J. Wysocki
  0 siblings, 1 reply; 95+ messages in thread
From: Ingo Molnar @ 2011-05-27  9:40 UTC (permalink / raw)
  To: Dave Jones, H. Peter Anvin, Rafael J. Wysocki, Dan Rosenberg,
	Tony Luck, linux-kernel, kees.cook, davem, eranian, torvalds,
	adobriyan, penberg, Arjan van de Ven, Andrew Morton,
	Valdis.Kletnieks, pageexec


* Dave Jones <davej@redhat.com> wrote:

> On Thu, May 26, 2011 at 03:32:13PM -0700, H. Peter Anvin wrote:
>  > On 05/26/2011 03:18 PM, Rafael J. Wysocki wrote:
>  > > 
>  > > Well, as far as I can tell, this feature is going to break hibernation on
>  > > both x86_32 and x86_64 at the moment, unless you can guarantee that the
>  > > randomized kernel location will be the same for both the boot and the target
>  > > kernels.
>  > > 
>  > 
>  > Obviously we can't and we don't.  I'm a bit surprised at that
>  > constraint... how can that constraint not break things like kernels of
>  > slightly different size?
> 
> In Fedora at least, we make sure the kernel you thaw from is the 
> same one you booted by diddling with grub to force the right kernel 
> to be booted.

Btw., the hibernation code should save a signature and make sure that 
the two kernels match! It's really broken if the code allows blind 
thawing ...
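
A rough sketch of what such a check could compare is below. It is
not the existing hibernation code: struct image_sig and
image_sig_matches() are hypothetical, and including the kernel text
base in the signature is an assumption made here so that a
differently-randomized boot kernel is refused instead of thawed
blindly.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical signature stored in the hibernation image header. */
struct image_sig {
    char release[65];    /* kernel release string, as in uname -r */
    char version[65];    /* build/version string, as in uname -v */
    uint64_t text_base;  /* address the kernel text was loaded at */
};

/*
 * Returns 1 only if the kernel that wrote the image and the kernel
 * about to thaw it agree on every field, 0 otherwise.
 */
int image_sig_matches(const struct image_sig *saved,
                      const struct image_sig *booted)
{
    if (strncmp(saved->release, booted->release,
                sizeof(saved->release)) != 0)
        return 0;
    if (strncmp(saved->version, booted->version,
                sizeof(saved->version)) != 0)
        return 0;
    return saved->text_base == booted->text_base;
}
```

A resume path built this way would bail out with a clear message on a
mismatch and fall back to a normal boot.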

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27  9:38         ` Ingo Molnar
@ 2011-05-27 13:07           ` Vivek Goyal
  2011-05-27 13:38             ` Ingo Molnar
  0 siblings, 1 reply; 95+ messages in thread
From: Vivek Goyal @ 2011-05-27 13:07 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Dan Rosenberg, Tony Luck, linux-kernel, davej, kees.cook, davem,
	eranian, torvalds, adobriyan, penberg, hpa, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, pageexec, Eric Paris

On Fri, May 27, 2011 at 11:38:53AM +0200, Ingo Molnar wrote:
> 
> * Vivek Goyal <vgoyal@redhat.com> wrote:
> 
> > > Is it common to run kexec-tools as non-root?  It may be necessary 
> > > to restrict this interface to root when randomization is used 
> > > (keep in mind nobody's going to force you to turn this on by 
> > > default, at least for the foreseeable future).
> > 
> > kexec-tools runs as root. And I see that /proc/iomem permissions 
> > are also for root only. So it probably is a non-issue.
> 
> it might be an issue to keep in mind for later projects that try to 
> lock down root itself from being able to patch the kernel (other than 
> rebooting the box), using signed modules, disabled direct-ioport 
> access, and other hardened facilities.

For such environments, Eric Paris had posted a patch to be able to 
disable loading of kexec/kdump kernel, similar to disabling module loading.

https://lkml.org/lkml/2011/1/19/412

I don't see that in Linus's tree. So looks like it never got committed.

Thanks
Vivek

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-26 20:44     ` Dan Rosenberg
  2011-05-26 20:55       ` Vivek Goyal
@ 2011-05-27 13:13       ` Vivek Goyal
  2011-05-27 13:21         ` Dan Rosenberg
  1 sibling, 1 reply; 95+ messages in thread
From: Vivek Goyal @ 2011-05-27 13:13 UTC (permalink / raw)
  To: Dan Rosenberg
  Cc: Tony Luck, linux-kernel, davej, kees.cook, davem, eranian,
	torvalds, adobriyan, penberg, hpa, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, Ingo Molnar, pageexec

On Thu, May 26, 2011 at 04:44:34PM -0400, Dan Rosenberg wrote:
> On Thu, 2011-05-26 at 16:40 -0400, Vivek Goyal wrote:
> > On Thu, May 26, 2011 at 04:35:02PM -0400, Vivek Goyal wrote:
> > > On Tue, May 24, 2011 at 04:31:45PM -0400, Dan Rosenberg wrote:
> > > > This introduces CONFIG_RANDOMIZE_BASE, which randomizes the address at
> > > > which the kernel is decompressed at boot as a security feature that
> > > > deters exploit attempts relying on knowledge of the location of kernel
> > > > internals.  The default values of the kptr_restrict and dmesg_restrict
> > > > sysctls are set to (1) when this is enabled, since hiding kernel
> > > > pointers is necessary to preserve the secrecy of the randomized base
> > > > address.
> > > 
> > > What happens to /proc/iomem interface which gives us the physical memory
> > > location where kernel is loaded. kexec-tools relies on that interface
> > > heavily so we can not take it away. And if we can not take it away then
> > > I think somebody should easily be able to calculate this randomized
> > > base address.
> 
> Is it common to run kexec-tools as non-root?  It may be necessary to
> restrict this interface to root when randomization is used (keep in mind
> nobody's going to force you to turn this on by default, at least for the
> foreseeable future).

Dan, 

I had a stupid question. /proc/kallsyms is also readable by root only. So
if we are doing this so that non-root user can not know kernel virtual and
physical address that should be already covered as non-root users can't
read /proc/kallsyms or /boot/System.map.

And if this randomization is also to protect information from root user
then /proc/iomem exporting the physical address of kernel is still a
valid question in that context.

Thanks
Vivek

^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27 13:13       ` Vivek Goyal
@ 2011-05-27 13:21         ` Dan Rosenberg
  2011-05-27 13:46           ` Ingo Molnar
  2011-05-27 13:50           ` Vivek Goyal
  0 siblings, 2 replies; 95+ messages in thread
From: Dan Rosenberg @ 2011-05-27 13:21 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Tony Luck, linux-kernel, davej, kees.cook, davem, eranian,
	torvalds, adobriyan, penberg, hpa, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, Ingo Molnar, pageexec

On Fri, 2011-05-27 at 09:13 -0400, Vivek Goyal wrote:
> On Thu, May 26, 2011 at 04:44:34PM -0400, Dan Rosenberg wrote:
> > On Thu, 2011-05-26 at 16:40 -0400, Vivek Goyal wrote:
> > > On Thu, May 26, 2011 at 04:35:02PM -0400, Vivek Goyal wrote:
> > > > On Tue, May 24, 2011 at 04:31:45PM -0400, Dan Rosenberg wrote:
> > > > > This introduces CONFIG_RANDOMIZE_BASE, which randomizes the address at
> > > > > which the kernel is decompressed at boot as a security feature that
> > > > > deters exploit attempts relying on knowledge of the location of kernel
> > > > > internals.  The default values of the kptr_restrict and dmesg_restrict
> > > > > sysctls are set to (1) when this is enabled, since hiding kernel
> > > > > pointers is necessary to preserve the secrecy of the randomized base
> > > > > address.
> > > > 
> > > > What happens to /proc/iomem interface which gives us the physical memory
> > > > location where kernel is loaded. kexec-tools relies on that interface
> > > > heavily so we can not take it away. And if we can not take it away then
> > > > I think somebody should easily be able to calculate this randomized
> > > > base address.
> > 
> > Is it common to run kexec-tools as non-root?  It may be necessary to
> > restrict this interface to root when randomization is used (keep in mind
> > nobody's going to force you to turn this on by default, at least for the
> > foreseeable future).
> 
> Dan, 
> 
> I had a stupid question. /proc/kallsyms is also readable by root only. So
> if we are doing this so that non-root user can not know kernel virtual and
> physical address that should be already covered as non-root users can't
> read /proc/kallsyms or /boot/System.map.
> 

Not sure what system you're running, but /proc/kallsyms is 0444 on my
machine (and in mainline, afaik).  Likewise for /proc/iomem.

The problem is mainly with distribution kernels - it's trivial to just
grab an identical vmlinux to a target machine and then you instantly
know exactly where everything is.

> And if this randomization is also to protect information from root user
> then /proc/iomem exporting the physical address of kernel is still a
> valid question in that context.
> 

I think we can deal with unprivileged users first, and if we want to
truly prevent root from finding this out, we can introduce a separate
toggle that locks things down further.

-Dan

> Thanks
> Vivek



^ permalink raw reply	[flat|nested] 95+ messages in thread

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27 13:07           ` Vivek Goyal
@ 2011-05-27 13:38             ` Ingo Molnar
  0 siblings, 0 replies; 95+ messages in thread
From: Ingo Molnar @ 2011-05-27 13:38 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Dan Rosenberg, Tony Luck, linux-kernel, davej, kees.cook, davem,
	eranian, torvalds, adobriyan, penberg, hpa, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, pageexec, Eric Paris


* Vivek Goyal <vgoyal@redhat.com> wrote:

> On Fri, May 27, 2011 at 11:38:53AM +0200, Ingo Molnar wrote:
> > 
> > * Vivek Goyal <vgoyal@redhat.com> wrote:
> > 
> > > > Is it common to run kexec-tools as non-root?  It may be necessary 
> > > > to restrict this interface to root when randomization is used 
> > > > (keep in mind nobody's going to force you to turn this on by 
> > > > default, at least for the foreseeable future).
> > > 
> > > kexec-tools runs as root. And I see that /proc/iomem permissions 
> > > are also for root only. So it probably is a non-issue.
> > 
> > it might be an issue to keep in mind for later projects that try to 
> > lock down root itself from being able to patch the kernel (other than 
> > rebooting the box), using signed modules, disabled direct-ioport 
> > access, and other hardened facilities.
> 
> For such environments, Eric Paris had posted a patch to be able to 
> disable loading of kexec/kdump kernel, similar to disabling module 
> loading.
> 
>    https://lkml.org/lkml/2011/1/19/412
> 
> I don't see that in Linus's tree. So looks like it never got 
> committed.

That patch looks sane enough. Ping akpm about it please?

Thanks,

	Ingo


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27 13:21         ` Dan Rosenberg
@ 2011-05-27 13:46           ` Ingo Molnar
  2011-05-27 13:50           ` Vivek Goyal
  1 sibling, 0 replies; 95+ messages in thread
From: Ingo Molnar @ 2011-05-27 13:46 UTC (permalink / raw)
  To: Dan Rosenberg
  Cc: Vivek Goyal, Tony Luck, linux-kernel, davej, kees.cook, davem,
	eranian, torvalds, adobriyan, penberg, hpa, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, pageexec


* Dan Rosenberg <drosenberg@vsecurity.com> wrote:

> > And if this randomization is also to protect information from 
> > root user then /proc/iomem exporting the physical address of 
> > kernel is still a valid question in that context.
> 
> I think we can deal with unprivileged users first, and if we want 
> to truly prevent root from finding this out, we can introduce a 
> separate toggle that locks things down further.

Correct, the case of unprivileged users should be handled first and 
it should be handled separately from any root-restrictions.

I only raised this to have a rough record of what would have to 
happen there.

Once all is said, done, committed and tested (the last two not 
necessarily in that order), we can look at any open root-restrict 
questions. It's a lot less clear-cut from a system usability POV.

If we do it we probably want one central one-shot 'restrict root from 
now on' toggle, not the separate switches that kill kexec and module 
loading separately.

Some shops might even want to disable root from being able to reboot 
the system and restrict reboots to physically performed (and 
crash/panic/hang induced) reboots only.

Some shops might want to make reboots dependent on the provision of a 
secret key. That key would not be stored on that system.

So there's lots of details to sort out in the "keep root from being 
able to break into the kernel and hide a rootkit out and disappear" 
area.

Thanks,

	Ingo


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27 13:21         ` Dan Rosenberg
  2011-05-27 13:46           ` Ingo Molnar
@ 2011-05-27 13:50           ` Vivek Goyal
  1 sibling, 0 replies; 95+ messages in thread
From: Vivek Goyal @ 2011-05-27 13:50 UTC (permalink / raw)
  To: Dan Rosenberg
  Cc: Tony Luck, linux-kernel, davej, kees.cook, davem, eranian,
	torvalds, adobriyan, penberg, hpa, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, Ingo Molnar, pageexec

On Fri, May 27, 2011 at 09:21:32AM -0400, Dan Rosenberg wrote:
> On Fri, 2011-05-27 at 09:13 -0400, Vivek Goyal wrote:
> > On Thu, May 26, 2011 at 04:44:34PM -0400, Dan Rosenberg wrote:
> > > On Thu, 2011-05-26 at 16:40 -0400, Vivek Goyal wrote:
> > > > On Thu, May 26, 2011 at 04:35:02PM -0400, Vivek Goyal wrote:
> > > > > On Tue, May 24, 2011 at 04:31:45PM -0400, Dan Rosenberg wrote:
> > > > > > This introduces CONFIG_RANDOMIZE_BASE, which randomizes the address at
> > > > > > which the kernel is decompressed at boot as a security feature that
> > > > > > deters exploit attempts relying on knowledge of the location of kernel
> > > > > > internals.  The default values of the kptr_restrict and dmesg_restrict
> > > > > > sysctls are set to (1) when this is enabled, since hiding kernel
> > > > > > pointers is necessary to preserve the secrecy of the randomized base
> > > > > > address.
> > > > > 
> > > > > What happens to /proc/iomem interface which gives us the physical memory
> > > > > location where kernel is loaded. kexec-tools relies on that interface
> > > > > heavily so we can not take it away. And if we can not take it away then
> > > > > > I think somebody should easily be able to calculate this randomized
> > > > > base address.
> > > 
> > > Is it common to run kexec-tools as non-root?  It may be necessary to
> > > restrict this interface to root when randomization is used (keep in mind
> > > nobody's going to force you to turn this on by default, at least for the
> > > foreseeable future).
> > 
> > Dan, 
> > 
> > I had a stupid question. /proc/kallsyms is also readable by root only. So
> > > if we are doing this so that non-root users cannot learn the kernel's virtual and
> > > physical addresses, that should already be covered, as non-root users can't
> > > read /proc/kallsyms or /boot/System.map.
> > 
> 
> Not sure what system you're running, but /proc/kallsyms is 0444 on my
> machine (and in mainline, afaik).  Likewise for /proc/iomem.

Sorry. I read it wrong. Yes /proc/iomem and /proc/kallsyms are 0444.

> 
> The problem is mainly with distribution kernels - it's trivial to just
> grab a vmlinux identical to the target machine's, and then you instantly
> know exactly where everything is.
> 
> > And if this randomization is also to protect information from root user
> > then /proc/iomem exporting the physical address of kernel is still a
> > valid question in that context.
> > 
> 
> I think we can deal with unprivileged users first, and if we want to
> truly prevent root from finding this out, we can introduce a separate
> toggle that locks things down further.

Ok, given the fact that /proc/iomem is 0444 and it carries the physical
address of the kernel, I think it should be easy to calculate the randomized
offset.  So I guess we shall have to do something about that too.

Thanks
Vivek
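
A minimal sketch of the calculation described above, assuming a
/proc/iomem-style listing with a "Kernel code" entry. The sample text,
the helper names and the 16 MiB default load address are illustrative
assumptions, not values taken from the patch.

```python
import re

# Assumed default physical load address, for illustration only.
DEFAULT_PHYS_START = 0x1000000


def kernel_code_start(iomem_text):
    """Return the physical start of the "Kernel code" range, or None."""
    for line in iomem_text.splitlines():
        m = re.match(r"\s*([0-9a-fA-F]+)-([0-9a-fA-F]+) : Kernel code\s*$", line)
        if m:
            return int(m.group(1), 16)
    return None


def load_offset(iomem_text, default_start=DEFAULT_PHYS_START):
    """Return how far the kernel sits from the assumed default address."""
    start = kernel_code_start(iomem_text)
    return None if start is None else start - default_start


# Made-up sample in the layout that /proc/iomem uses.
SAMPLE = """\
00100000-bffdffff : System RAM
  03000000-035fffff : Kernel code
  03600000-039fffff : Kernel data
"""
```

Calling load_offset(SAMPLE) on the made-up listing gives the distance
between the reported "Kernel code" start and the assumed default, which
is all an unprivileged reader would need to undo a fixed shift.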


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-26 22:18 ` Rafael J. Wysocki
  2011-05-26 22:32   ` H. Peter Anvin
@ 2011-05-27 15:42   ` Linus Torvalds
  2011-05-27 16:11     ` Dan Rosenberg
  2011-05-27 17:00     ` Ingo Molnar
  1 sibling, 2 replies; 95+ messages in thread
From: Linus Torvalds @ 2011-05-27 15:42 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Dan Rosenberg, Tony Luck, linux-kernel, davej, kees.cook, davem,
	eranian, adobriyan, penberg, hpa, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, Ingo Molnar, pageexec

On Thu, May 26, 2011 at 3:18 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>
> Well, as far as I can tell, this feature is going to break hibernation on
> both x86_32 and x86_64 at the moment, unless you can guarantee that the
> randomized kernel location will be the same for both the boot and the target
> kernels.

You know what? Maybe that guarantee is actually the *right* thing to do..

In other words, maybe we really really shouldn't randomize the kernel
load address at boot time at all.

Instead, what would be much better, is if we just had some way to
re-link distro kernels with some random text offset. Sure, the load
address wouldn't be "random" in any local sense any more, but I think
the real effort here was to avoid having the common distro kernels
having known text addresses.

If you compile your own kernel version, you're already home free, and
load-time randomization is pointless.

And load-time randomization has all these nasty problems with memory
maps etc, because we obviously have to shift the whole kernel around
by some fixed offset. But if there was some way to just re-link the
distro kernel easily, then it could be done by the kernel install
scripts, and it could potentially do more than just "shift up load
address by some random number".

Hmm?

                          Linus
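
A rough sketch of what an install-time script might do to pick the
random text offset mentioned above. Everything here is hypothetical:
the function name, the 2 MiB alignment and the 512 MiB bound are
assumptions chosen for illustration, and the re-link step itself is
not shown.

```python
import secrets

ALIGN = 2 * 1024 * 1024          # hypothetical alignment of the text offset
MAX_OFFSET = 512 * 1024 * 1024   # hypothetical upper bound on the offset


def pick_text_offset(align=ALIGN, max_offset=MAX_OFFSET):
    """Return a random multiple of align in the range [0, max_offset)."""
    slots = max_offset // align
    # secrets draws from the OS CSPRNG, which suits a security-relevant offset.
    return secrets.randbelow(slots) * align
```

With these assumed values there are 256 possible offsets, so the choice
carries 8 bits of entropy; a wider range or finer alignment would raise
that.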


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-26 22:32   ` H. Peter Anvin
  2011-05-27  0:26     ` Dan Rosenberg
  2011-05-27  2:45     ` Dave Jones
@ 2011-05-27 16:07     ` Rafael J. Wysocki
  2 siblings, 0 replies; 95+ messages in thread
From: Rafael J. Wysocki @ 2011-05-27 16:07 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Dan Rosenberg, Tony Luck, linux-kernel, davej, kees.cook, davem,
	eranian, torvalds, adobriyan, penberg, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, Ingo Molnar, pageexec

On Friday, May 27, 2011, H. Peter Anvin wrote:
> On 05/26/2011 03:18 PM, Rafael J. Wysocki wrote:
> > 
> > Well, as far as I can tell, this feature is going to break hibernation on
> > both x86_32 and x86_64 at the moment, unless you can guarantee that the
> > randomized kernel location will be the same for both the boot and the target
> > kernels.
> > 
> 
> Obviously we can't and we don't.  I'm a bit surprised at that
> constraint... how can that constraint not break things like kernels of
> slightly different size?

Our hibernation code generally requires that the kernel used for loading
the image be the same as the hibernated one.  This requirement is slightly
relaxed for x86_64, but we still don't have a mechanism for passing the
jump address into the hibernated kernel in the image header.

I planned to add that, but then didn't have the time to work on it.

Thanks,
Rafael


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27 15:42   ` Linus Torvalds
@ 2011-05-27 16:11     ` Dan Rosenberg
  2011-05-27 17:00     ` Ingo Molnar
  1 sibling, 0 replies; 95+ messages in thread
From: Dan Rosenberg @ 2011-05-27 16:11 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Rafael J. Wysocki, Tony Luck, linux-kernel, davej, kees.cook,
	davem, eranian, adobriyan, penberg, hpa, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, Ingo Molnar, pageexec

On Fri, 2011-05-27 at 08:42 -0700, Linus Torvalds wrote:
> On Thu, May 26, 2011 at 3:18 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >
> > Well, as far as I can tell, this feature is going to break hibernation on
> > both x86_32 and x86_64 at the moment, unless you can guarantee that the
> > randomized kernel location will be the same for both the boot and the target
> > kernels.
> 
> You know what? Maybe that guarantee is actually the *right* thing to do..
> 
> In other words, maybe we really really shouldn't randomize the kernel
> load address at boot time at all.
> 
> Instead, what would be much better, is if we just had some way to
> re-link distro kernels with some random text offset. Sure, the load
> address wouldn't be "random" in any local sense any more, but I think
> the real effort here was to avoid having the common distro kernels
> having known text addresses.
> 
> If you compile your own kernel version, you're already home free, and
> load-time randomization is pointless.
> 
> And load-time randomization has all these nasty problems with memory
> maps etc, because we obviously have to shift the whole kernel around
> by some fixed offset. But if there was some way to just re-link the
> distro kernel easily, then it could be done by the kernel install
> scripts, and it could potentially do more than just "shift up load
> address by some random number".
> 
> Hmm?
> 
>                           Linus

You know what...I'm surprised that I'm saying this, but given the number
of non-trivial challenges that still need to be solved in order to
implement load-time randomization, maybe this would be a better way
forward.

We'd still need to go through the same effort to hide information about
kernel text offsets, and we'd still need to do per-cpu IDTs, but neither
of those items is as challenging as some of the other problems.

I'm not ready to take load-time randomization off the table, but I'd
certainly like to hear more discussion on this.  There are clearly
advantages to load-time randomization that this new option wouldn't
have, but the question is really "is what we gain worth the effort?".

Thanks,
Dan



* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27  9:40       ` Ingo Molnar
@ 2011-05-27 16:11         ` Rafael J. Wysocki
  0 siblings, 0 replies; 95+ messages in thread
From: Rafael J. Wysocki @ 2011-05-27 16:11 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Dave Jones, H. Peter Anvin, Dan Rosenberg, Tony Luck,
	linux-kernel, kees.cook, davem, eranian, torvalds, adobriyan,
	penberg, Arjan van de Ven, Andrew Morton, Valdis.Kletnieks,
	pageexec

On Friday, May 27, 2011, Ingo Molnar wrote:
> 
> * Dave Jones <davej@redhat.com> wrote:
> 
> > On Thu, May 26, 2011 at 03:32:13PM -0700, H. Peter Anvin wrote:
> >  > On 05/26/2011 03:18 PM, Rafael J. Wysocki wrote:
> >  > > 
> >  > > Well, as far as I can tell, this feature is going to break hibernation on
> >  > > both x86_32 and x86_64 at the moment, unless you can guarantee that the
> >  > > randomized kernel location will be the same for both the boot and the target
> >  > > kernels.
> >  > > 
> >  > 
> >  > Obviously we can't and we don't.  I'm a bit surprised at that
> >  > constraint... how can that constraint not break things like kernels of
> >  > slightly different size?
> > 
> > In Fedora at least, we make sure the kernel you thaw from is the 
> > same one you booted by diddling with grub to force the right kernel 
> > to be booted.
> 
> Btw., the hibernation code should save a signature and make sure that 
> the two kernels match! It's really broken if the code allows blind 
> thawing ...

It uses signatures, but on x86_64 you actually can use a different kernel
for loading the image, with some limitations.

I'd like to add a mechanism for passing the jump address into the hibernated
kernel in the kernel image, but that part is still missing.

Thanks,
Rafael


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27  0:26     ` Dan Rosenberg
@ 2011-05-27 16:21       ` Rafael J. Wysocki
  0 siblings, 0 replies; 95+ messages in thread
From: Rafael J. Wysocki @ 2011-05-27 16:21 UTC (permalink / raw)
  To: Dan Rosenberg
  Cc: H. Peter Anvin, Tony Luck, linux-kernel, davej, kees.cook, davem,
	eranian, torvalds, adobriyan, penberg, Arjan van de Ven,
	Andrew Morton, Valdis.Kletnieks, Ingo Molnar, pageexec

On Friday, May 27, 2011, Dan Rosenberg wrote:
> On Thu, 2011-05-26 at 15:32 -0700, H. Peter Anvin wrote:
> > On 05/26/2011 03:18 PM, Rafael J. Wysocki wrote:
> > > 
> > > Well, as far as I can tell, this feature is going to break hibernation on
> > > both x86_32 and x86_64 at the moment, unless you can guarantee that the
> > > randomized kernel location will be the same for both the boot and the target
> > > kernels.
> > > 
> > 
> > Obviously we can't and we don't.  I'm a bit surprised at that
> > constraint... how can that constraint not break things like kernels of
> > slightly different size?
> > 
> > 	-hpa
> 
> Am I understanding it correctly that hibernation is currently operating
> under a possibly false assumption?  If it's the case that hibernation
> should be saving the physical address at which the kernel was previously
> loaded and restoring it there regardless of randomization, it would
> certainly help me out if someone familiar with the code could take a
> stab at that.

Rather, it has to save the address at which the boot kernel should jump
into the image kernel, but ISTR that's not straightforward.  I thought about
implementing something like this some time ago, but in the end I didn't have
the time to finish that work.

At the moment I'm preparing for a trip to Japan, so I'll be able to work on
this with you when I get back home (some time next weekend).  In the
meantime, please have a look at arch/x86/power/hibernate_64.c and
arch/x86/power/hibernate_asm_64.S.

Thanks,
Rafael


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27 15:42   ` Linus Torvalds
  2011-05-27 16:11     ` Dan Rosenberg
@ 2011-05-27 17:00     ` Ingo Molnar
  2011-05-27 17:06       ` H. Peter Anvin
  2011-05-27 17:10       ` Dan Rosenberg
  1 sibling, 2 replies; 95+ messages in thread
From: Ingo Molnar @ 2011-05-27 17:00 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Rafael J. Wysocki, Dan Rosenberg, Tony Luck, linux-kernel, davej,
	kees.cook, davem, eranian, adobriyan, penberg, hpa,
	Arjan van de Ven, Andrew Morton, Valdis.Kletnieks, pageexec


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> If you compile your own kernel version, you're already home free, 
> and load-time randomization is pointless.

Most successful exploits work in two steps: first a local exploit 
(weak password with a user, stupid script escaping bug, or a buffer 
overflow somewhere), then a local kernel exploit to gain root and 
kernel access. (for a rootkit and what not)

Straight remote root exploits are pretty rare - and per system 
relinking only protects against that.

The problem with your relinking solution is that a local attacker can 
easily figure out where the kernel is. So this does not protect 
against the more common break-in scenario.

Kernel image randomization makes this last step really 
indeterministic and thus dangerous to attackers.

Thanks,

	Ingo


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27 17:00     ` Ingo Molnar
@ 2011-05-27 17:06       ` H. Peter Anvin
  2011-05-27 17:10       ` Dan Rosenberg
  1 sibling, 0 replies; 95+ messages in thread
From: H. Peter Anvin @ 2011-05-27 17:06 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Rafael J. Wysocki, Dan Rosenberg, Tony Luck,
	linux-kernel, davej, kees.cook, davem, eranian, adobriyan,
	penberg, Arjan van de Ven, Andrew Morton, Valdis.Kletnieks,
	pageexec

On 05/27/2011 10:00 AM, Ingo Molnar wrote:
> 
> The problem with your relinking solution is that a local attacker can 
> easily figure out where the kernel is. So this does not protect 
> against the more common break-in scenario.
> 

There is another issue with it: it doesn't actually solve the real
problem other than suspend/resume, which is that the relocation agent
needs to understand what the memory space looks like at the time of boot.

I think something else we will need for this to be possible is initramfs
decoding directly from highmem, since the hack we're currently using to
deal with an initramfs/initrd located partly in highmem will break.

	-hpa


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27 17:00     ` Ingo Molnar
  2011-05-27 17:06       ` H. Peter Anvin
@ 2011-05-27 17:10       ` Dan Rosenberg
  2011-05-27 17:13         ` H. Peter Anvin
  2011-05-27 17:16         ` Ingo Molnar
  1 sibling, 2 replies; 95+ messages in thread
From: Dan Rosenberg @ 2011-05-27 17:10 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Rafael J. Wysocki, Tony Luck, linux-kernel,
	davej, kees.cook, davem, eranian, adobriyan, penberg, hpa,
	Arjan van de Ven, Andrew Morton, Valdis.Kletnieks, pageexec

On Fri, 2011-05-27 at 19:00 +0200, Ingo Molnar wrote:
> * Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
> > If you compile your own kernel version, you're already home free, 
> > and load-time randomization is pointless.

> The problem with your relinking solution is that a local attacker can 
> easily figure out where the kernel is. So this does not protect 
> against the more common break-in scenario.
> 
> Kernel image randomization makes this last step really 
> indeterministic and thus dangerous to attackers.
> 

Just to play devil's advocate, how is it easier for a local attacker to
figure out where kernel internals are if it's been relinked vs.
randomized at load time, assuming we follow through on fixing the info
leaks?

It seems to me that the only functional difference is that subsequent
reboots will yield the same memory layout, which is a real drawback
worth considering.

-Dan

> Thanks,
> 
> 	Ingo




* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27 17:10       ` Dan Rosenberg
@ 2011-05-27 17:13         ` H. Peter Anvin
  2011-05-27 17:16           ` Linus Torvalds
  2011-05-27 17:20           ` Kees Cook
  2011-05-27 17:16         ` Ingo Molnar
  1 sibling, 2 replies; 95+ messages in thread
From: H. Peter Anvin @ 2011-05-27 17:13 UTC (permalink / raw)
  To: Dan Rosenberg
  Cc: Ingo Molnar, Linus Torvalds, Rafael J. Wysocki, Tony Luck,
	linux-kernel, davej, kees.cook, davem, eranian, adobriyan,
	penberg, Arjan van de Ven, Andrew Morton, Valdis.Kletnieks,
	pageexec

On 05/27/2011 10:10 AM, Dan Rosenberg wrote:
> 
> Just to play devil's advocate, how is it easier for a local attacker to
> figure out where kernel internals are if it's been relinked vs.
> randomized at load time, assuming we follow through on fixing the info
> leaks?
> 

You can read the on-disk kernel file and find out.

	-hpa


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27 17:10       ` Dan Rosenberg
  2011-05-27 17:13         ` H. Peter Anvin
@ 2011-05-27 17:16         ` Ingo Molnar
  2011-05-27 17:21           ` Linus Torvalds
  1 sibling, 1 reply; 95+ messages in thread
From: Ingo Molnar @ 2011-05-27 17:16 UTC (permalink / raw)
  To: Dan Rosenberg
  Cc: Linus Torvalds, Rafael J. Wysocki, Tony Luck, linux-kernel,
	davej, kees.cook, davem, eranian, adobriyan, penberg, hpa,
	Arjan van de Ven, Andrew Morton, Valdis.Kletnieks, pageexec


* Dan Rosenberg <drosenberg@vsecurity.com> wrote:

> Just to play devil's advocate, how is it easier for a local 
> attacker to figure out where kernel internals are if it's been 
> relinked vs. randomized at load time, assuming we follow through on 
> fixing the info leaks?

Well, 'fixing the info leaks' will obfuscate previously useful files 
such as /proc/kallsyms ...

That's one of the advantages of randomization: it allows us to expose 
RIPs without them being an instant information leak.

Thanks,

	Ingo


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27 17:13         ` H. Peter Anvin
@ 2011-05-27 17:16           ` Linus Torvalds
  2011-05-27 17:38             ` Ingo Molnar
  2011-05-27 17:20           ` Kees Cook
  1 sibling, 1 reply; 95+ messages in thread
From: Linus Torvalds @ 2011-05-27 17:16 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Dan Rosenberg, Ingo Molnar, Rafael J. Wysocki, Tony Luck,
	linux-kernel, davej, kees.cook, davem, eranian, adobriyan,
	penberg, Arjan van de Ven, Andrew Morton, Valdis.Kletnieks,
	pageexec

On Fri, May 27, 2011 at 10:13 AM, H. Peter Anvin <hpa@zytor.com> wrote:
>
> You can read the on-disk kernel file and find out.

So? Make it root-readable-only. Problem solved.

That's the _only_ difference, and it's trivial and irrelevant. Come up
with something more real, please.

                 Linus


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27 17:13         ` H. Peter Anvin
  2011-05-27 17:16           ` Linus Torvalds
@ 2011-05-27 17:20           ` Kees Cook
  1 sibling, 0 replies; 95+ messages in thread
From: Kees Cook @ 2011-05-27 17:20 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Dan Rosenberg, Ingo Molnar, Linus Torvalds, Rafael J. Wysocki,
	Tony Luck, linux-kernel, davej, davem, eranian, adobriyan,
	penberg, Arjan van de Ven, Andrew Morton, Valdis.Kletnieks,
	pageexec

On Fri, May 27, 2011 at 10:13:54AM -0700, H. Peter Anvin wrote:
> On 05/27/2011 10:10 AM, Dan Rosenberg wrote:
> > 
> > Just to play devil's advocate, how is it easier for a local attacker to
> > figure out where kernel internals are if it's been relinked vs.
> > randomized at load time, assuming we follow through on fixing the info
> > leaks?
> > 
> 
> You can read the on-disk kernel file and find out.

If we're still operating under the assumption of "defend against non-root",
distros can trivially make the on-disk kernels 0400.

-Kees

-- 
Kees Cook
Ubuntu Security Team
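
A small sketch of the 0400 idea above: check whether a file is
world-readable and restrict it to its owner. The helper names are
made up for illustration, and which on-disk files a distro would
actually lock down (kernel image, System.map, and so on) is left as
an assumption.

```python
import os
import stat


def world_readable(path):
    """Return True if the 'other' read bit is set on path."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)


def restrict_to_owner(path):
    """Set mode 0400 so that only the file's owner can read it."""
    os.chmod(path, 0o400)
```

This only inspects and changes permission bits; it does nothing about
an attacker fetching an identical kernel package from a distro mirror,
which is the case that re-linking or randomization is meant to cover.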


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27 17:16         ` Ingo Molnar
@ 2011-05-27 17:21           ` Linus Torvalds
  2011-05-27 17:46             ` Ingo Molnar
  0 siblings, 1 reply; 95+ messages in thread
From: Linus Torvalds @ 2011-05-27 17:21 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Dan Rosenberg, Rafael J. Wysocki, Tony Luck, linux-kernel, davej,
	kees.cook, davem, eranian, adobriyan, penberg, hpa,
	Arjan van de Ven, Andrew Morton, Valdis.Kletnieks, pageexec

On Fri, May 27, 2011 at 10:16 AM, Ingo Molnar <mingo@elte.hu> wrote:
>
> Well, 'fixing the info leaks' will obfuscate previously useful files
> such as /proc/kallsyms ...

Guys, stop with the crazy already.

YOU HAVE TO DO THAT FOR THE LINK-TIME-OBFUSCATION TOO!

> That's one of the advantages of randomization: it allows us to expose
> RIPs without them being an instant information leak.

Except you clearly aren't thinking that through AT ALL.

The obfuscation of things like /proc/kallsyms is *exactly*the*same*
whether you do the randomization at boot-time or install-time.

For chrissake - you're doing the same thing. The only question is
"when" (and the fact that if you do it at install-time, you can do a
fancier job of it)

Stop wasting people's time with idiocies, please.

                    Linus


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27 17:16           ` Linus Torvalds
@ 2011-05-27 17:38             ` Ingo Molnar
  0 siblings, 0 replies; 95+ messages in thread
From: Ingo Molnar @ 2011-05-27 17:38 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: H. Peter Anvin, Dan Rosenberg, Rafael J. Wysocki, Tony Luck,
	linux-kernel, davej, kees.cook, davem, eranian, adobriyan,
	penberg, Arjan van de Ven, Andrew Morton, Valdis.Kletnieks,
	pageexec


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> That's the _only_ difference, and it's trivial and irrelevant. Come 
> up with something more real, please.

The advantages of dynamic per boot kernel randomization, over static 
per system randomization, as i see them, in order of descending 
importance:

 - A root exploit will still not give away the location of the
   kernel (assuming module loading has been disabled after bootup),
   so a rootkit cannot be installed 'silently' on the system, into
   RAM only, evading most offline-storage-checking tools.

   With static linking this is not possible: reading the kernel image
   as root trivially exposes the kernel's location.

 - We can expose RIPs to unprivileged tools. Certain users could
   still kernel-profile a busy server box while neither being root,
   nor having access to the real location of the kernel.

   With static linking this is not possible.

 - Crash & reboot & retry brute force exploits get harder: if one
   attempt at an exploit causes a crash and a reboot, the kernel
   addresses are different after the reboot so the attempt has to be
   retried without the advantage of any prior history.

   With static linking this kind of exploit is somewhat easier: every
   crash gives a permanent proof that the guessed RIP offset was
   wrong, so history can be used on subsequent retries.

 - It gives a way to go one step further in secure server lockdown:
   where even root with full access to all storage has no way to
   break into the kernel. Reboots, module loading and kexec can be
   controlled, ioperm() and iopl() can be restricted. If those are
   taken away then even if a root exploit allows the attacker to
   overwrite the kernel image, a reboot has to be waited for and if
   reboots do sanity checks [based on immutable storage] of the
   system then the exploit can be found.

   With static linking this is not possible: reading the kernel image
   as root trivially exposes the kernel's location.

It's in order of importance: you probably stopped caring at item 2 or 
3 but there's definitely people who'd like to go all the way to 4. So 
if we can do dynamic randomization sanely then why not offer it as an 
option?

Thanks,

	Ingo
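
A worked version of the crash & reboot & retry point above, as a
hedged sketch. It assumes (none of this is stated in the message) that
the kernel base is one of N equally likely slots, that the attacker
guesses uniformly, and that every wrong guess crashes the box. The
function names are illustrative.

```python
from fractions import Fraction


def expected_attempts_static(slots):
    """Layout fixed across reboots: each crash rules out one slot.

    Guessing without repeats, the right slot is equally likely to be
    hit on any of the attempts 1..slots, so the mean is (slots + 1) / 2.
    """
    return Fraction(slots + 1, 2)


def expected_attempts_per_boot(slots):
    """Layout re-drawn on every boot: past crashes carry no information.

    Each attempt succeeds independently with probability 1 / slots,
    so the number of attempts is geometric with mean slots.
    """
    return Fraction(slots, 1)
```

Under these assumptions, per-boot randomization roughly doubles the
expected number of crashes an attacker has to cause, and it removes
the hard upper bound of N attempts that a fixed layout gives them.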


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27 17:21           ` Linus Torvalds
@ 2011-05-27 17:46             ` Ingo Molnar
  2011-05-27 17:53               ` H. Peter Anvin
  2011-05-27 17:57               ` Linus Torvalds
  0 siblings, 2 replies; 95+ messages in thread
From: Ingo Molnar @ 2011-05-27 17:46 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dan Rosenberg, Rafael J. Wysocki, Tony Luck, linux-kernel, davej,
	kees.cook, davem, eranian, adobriyan, penberg, hpa,
	Arjan van de Ven, Andrew Morton, Valdis.Kletnieks, pageexec


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Fri, May 27, 2011 at 10:16 AM, Ingo Molnar <mingo@elte.hu> wrote:
> >
> > Well, 'fixing the info leaks' will obfuscate previously useful files
> > such as /proc/kallsyms ...
> 
> Guys, stop with the crazy already.
> 
> YOU HAVE TO DO THAT FOR THE LINK-TIME-OBFUSCATION TOO!
>
> > That's one of the advantages of randomization: it allows us to 
> > expose RIPs without them being an instant information leak.
> 
> Except you clearly aren't thinking that through AT ALL.
> 
> The obfuscation of things like /proc/kallsyms is *exactly*the*same* 
> whether you do the randomization at boot-time or install-time.

Well, but two mails ago you said:

> And load-time randomization has all these nasty problems with 
> memory maps etc, because we obviously have to shift the whole 
> kernel around by some fixed offset. But if there was some way to 
> just re-link the distro kernel easily, then it could be done by the 
> kernel install scripts, and it could potentially do more than just 
> "shift up load address by some random number".

If i understood you correctly you suggest randomizing the image by 
shifting the symbols in it around. The boot loader would still load 
an 'image' where it always loads it - just that image itself is 
randomized internally somewhat, right?

( because that's the only way we can avoid the problems with e820 
  memory maps which you referred to, if we don't actually change the 
  load address. )

Have i understood you correctly?

Thanks,

	Ingo


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27 17:46             ` Ingo Molnar
@ 2011-05-27 17:53               ` H. Peter Anvin
  2011-05-27 18:05                 ` Linus Torvalds
  2011-05-27 17:57               ` Linus Torvalds
  1 sibling, 1 reply; 95+ messages in thread
From: H. Peter Anvin @ 2011-05-27 17:53 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Dan Rosenberg, Rafael J. Wysocki, Tony Luck,
	linux-kernel, davej, kees.cook, davem, eranian, adobriyan,
	penberg, Arjan van de Ven, Andrew Morton, Valdis.Kletnieks,
	pageexec

On 05/27/2011 10:46 AM, Ingo Molnar wrote:
> 
> If i understood you correctly you suggest randomizing the image by 
> shifting the symbols in it around. The boot loader would still load 
> an 'image' where it always loads it - just that image itself is 
> randomized internally somewhat, right?
> 
> ( because that's the only way we can avoid the problems with e820 
>   memory maps which you referred to, if we don't actually change the 
>   load address. )
> 

That doesn't solve any problems with the memory map.
	
	-hpa


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27 17:46             ` Ingo Molnar
  2011-05-27 17:53               ` H. Peter Anvin
@ 2011-05-27 17:57               ` Linus Torvalds
  2011-05-27 18:17                 ` Ingo Molnar
  1 sibling, 1 reply; 95+ messages in thread
From: Linus Torvalds @ 2011-05-27 17:57 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Dan Rosenberg, Rafael J. Wysocki, Tony Luck, linux-kernel, davej,
	kees.cook, davem, eranian, adobriyan, penberg, hpa,
	Arjan van de Ven, Andrew Morton, Valdis.Kletnieks, pageexec

On Fri, May 27, 2011 at 10:46 AM, Ingo Molnar <mingo@elte.hu> wrote:
>
> If i understood you correctly you suggest randomizing the image by
> shifting the symbols in it around. The boot loader would still load
> an 'image' where it always loads it - just that image itself is
> randomized internally somewhat, right?

You snipped the other part of my email you responded to:

  For chrissake - you're doing the same thing. The only question is
  "when" (and the fact that if you do it at install-time, you can do a
  fancier job of it)

ie the fact that if you do it at install-time, you have the option of
being much more fancy about it.

So sure, the install time option *can* do more. It doesn't *have* to do more.

But being able to do a better job of randomization is *better*. Ok? It
doesn't mean you have to, but you have more options to do things if
you want to.

IOW, there is absolutely zero difference between doing it at
install-time or run-time, but the install-time one is (a) likely
easier and (b) certainly more flexible. But both of them do the exact
same thing, and require the exact same support in things like
/proc/kallsyms.

Of course, if we end up doing something really fancy (which the
install-time option allows), that obviously does mean that the
remapping by %pK thing for kallsyms needs to be much smarter too.

But at %pK time, you can *afford* to do that kind of things. At
boot-time, before you're even loaded and have a hard time even parsing
the e820 maps? Yeah, you're not going to do anything smart there, I
can tell you.

                   Linus


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27 17:53               ` H. Peter Anvin
@ 2011-05-27 18:05                 ` Linus Torvalds
  2011-05-27 19:15                   ` Vivek Goyal
                                     ` (2 more replies)
  0 siblings, 3 replies; 95+ messages in thread
From: Linus Torvalds @ 2011-05-27 18:05 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Ingo Molnar, Dan Rosenberg, Rafael J. Wysocki, Tony Luck,
	linux-kernel, davej, kees.cook, davem, eranian, adobriyan,
	penberg, Arjan van de Ven, Andrew Morton, Valdis.Kletnieks,
	pageexec

On Fri, May 27, 2011 at 10:53 AM, H. Peter Anvin <hpa@zytor.com> wrote:
>
> That doesn't solve any problems with the memory map.

Actually, it does.

You can load the kernel at the same virtual address we always load it,
and/or perhaps shift it up by just small amounts (ie "single pages"
rather than "ten bits worth of pages")

And then rely on the fact that you mixed up symbols in other ways.

"Look ma, no need to worry about memory map". At least no more than we do now.

Put another way: think about our /proc/iomem right now:

  00100000-bdc6ffff : System RAM
    01000000-016bdced : Kernel code
    016bdcee-01ca8b7f : Kernel data
    01d36000-01de2fff : Kernel bss

with the "shift kernel up at load-time", the above information is
suddenly very scary, because the "Kernel code" part is magically
important.
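
To make the leak concrete, here is a minimal sketch (not part of the
proposed patch) that pulls the "Kernel code" range out of /proc/iomem
style text. The sample is the listing quoted above; the helper name
kernel_regions() is made up for illustration.

```python
import re

# Sample text copied from the /proc/iomem listing in this message.
SAMPLE_IOMEM = """\
00100000-bdc6ffff : System RAM
  01000000-016bdced : Kernel code
  016bdcee-01ca8b7f : Kernel data
  01d36000-01de2fff : Kernel bss
"""

LINE_RE = re.compile(r"\s*([0-9a-f]+)-([0-9a-f]+) : (Kernel \w+)\s*$")


def kernel_regions(iomem_text):
    """Return {label: (start, end)} for every 'Kernel ...' line."""
    regions = {}
    for line in iomem_text.splitlines():
        match = LINE_RE.match(line)
        if match:
            start, end, label = match.groups()
            regions[label] = (int(start, 16), int(end, 16))
    return regions


regions = kernel_regions(SAMPLE_IOMEM)
code_start, code_end = regions["Kernel code"]
print(f"kernel text loaded at {code_start:#x}..{code_end:#x}")
```

With a load-time shift, code_start is the randomized physical base,
so anyone who can read this file has recovered the secret.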

In contrast, if your randomization depends on just relinking things a
bit differently, you don't really give out any of the random
information in /proc/iomem. Nor does it affect the load address and
the e820 memory map.

And, in fact, it does give you way more bits of randomness to play
around with the text addresses.

With something like function-sections, it should be possible to do
quite a serious job of relinking (and then keep some "function section
to actual relinked address" mapping around so that you can do the
/proc/kallsyms mappings).

But that's actually the "fancy" model. I don't think we should aim at
that to begin with. Start off with something much less ambitious, like
just shifting the kernel by a few pages. People have argued that even
just a 50% chance of an oops is preferable to nothing. So we can start
small and stupid.

See?

                     Linus


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27 17:57               ` Linus Torvalds
@ 2011-05-27 18:17                 ` Ingo Molnar
  2011-05-27 18:43                   ` Kees Cook
                                     ` (2 more replies)
  0 siblings, 3 replies; 95+ messages in thread
From: Ingo Molnar @ 2011-05-27 18:17 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dan Rosenberg, Rafael J. Wysocki, Tony Luck, linux-kernel, davej,
	kees.cook, davem, eranian, adobriyan, penberg, hpa,
	Arjan van de Ven, Andrew Morton, Valdis.Kletnieks, pageexec


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Fri, May 27, 2011 at 10:46 AM, Ingo Molnar <mingo@elte.hu> wrote:
> >
> > If i understood you correctly you suggest randomizing the image 
> > by shifting the symbols in it around. The boot loader would still 
> > load an 'image' where it always loads it - just that image itself 
> > is randomized internally somewhat, right?
> 
> You snipped the other part of my email you responded to:
> 
>   For chrissake - you're doing the same thing. The only question is
>   "when" (and the fact that if you do it at install-time, you can do a
>   fancier job of it)
> 
> ie the fact that if you do it at install-time, you have the option of
> being much more fancy about it.
> 
> So sure, the install time option *can* do more. It doesn't *have* 
> to do more.
> 
> But being able to do a better job of randomization is *better*. Ok? 
> It doesn't mean you have to, but you have more options to do things 
> if you want to.
> 
> IOW, there is absolutely zero difference between doing it at 
> install-time or run-time, but the install-time one is (a) likely 
> easier and (b) certainly more flexible. But both of them do the 
> exact same thing, and require the exact same support in things like 
> /proc/kallsyms.
> 
> Of course, if we end up doing something really fancy (which the 
> install-time option allows), that obviously does mean that the 
> remapping by %pK thing for kallsyms needs to be much smarter too.
> 
> But at %pK time, you can *afford* to do that kind of things. At 
> boot-time, before you're even loaded and have a hard time even 
> parsing the e820 maps? Yeah, you're not going to do anything smart 
> there, I can tell you.

Ok, you are right, we could patch in all the things into the image at 
install time to be able to 'derandomize' symbols and still be able to 
provide them.

[ One worry i have is that distro logic is to go for the simplest 
  route: which is to randomize the symbols by padding the beginning 
  or the end of the kernel image a bit, but don't bother making %pK 
  smart or fancy. This means that /proc/kallsyms will be restricted 
  (maybe even turned off completely, because it's now broken) and a 
  'real' System.map installed, readable only by root. This still 
  'allows' tooling, with full SystemTap and Oprofile usability. ]

Anyway, this strikes off the second item from my list. Meanwhile i 
also found two other usecases which i added to the head of the list:

 - Boot time dynamic randomization allows randomization of 'mass 
   install' systems, where the same image is used, to still be 
   randomized: for example a million phones all with the same Flash 
   ROM image and no 'install' performed at all on them.

   With static randomization these systems will all have the same
   kernel addresses.

 - Boot time dynamic randomization allows read-only systems to still 
   be randomized: for example internet cafes that use some popular 
   pre-packaged kiosk-mode live-DVD. They probably won't bother 
   randomizing and relinking the ISOs per machine and burning per 
   machine DVDs ...

 - A root exploit will still not give away the location of the
   kernel (assuming module loading has been disabled after bootup),
   so a rootkit cannot be installed 'silently' on the system, into
   RAM only, evading most offline-storage-checking tools.

   With static linking this is not possible: reading the kernel image
   as root trivially exposes the kernel's location.

 - Crash & reboot & retry brute force exploits get harder: if one
   attempt at an exploit causes a crash and a reboot, the kernel
   addresses are different after the reboot so the attempt has to be
   retried without the advantage of any prior history.

   With static linking this kind of exploit is somewhat easier: every
   crash gives permanent proof that the guessed RIP offset was
   wrong, so history can be used on subsequent retries.

 - It gives a way to go one step further in secure server lockdown:
   where even root with full access to all storage has no way to
   break into the kernel. Reboots, module loading and kexec can be
   controlled, ioperm() and iopl() can be restricted. If those are
   taken away then even if a root exploit allows the attacker to
   overwrite the kernel image, a reboot has to be waited for and if
   reboots do sanity checks [based on immutable storage] of the
   system then the exploit can be found.

   With static linking this is not possible: reading the kernel image
   as root trivially exposes the kernel's location.
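
The crash & reboot & retry item above can be put in numbers. This is
a rough sketch under stated assumptions: the kernel sits in one of
`slots` equally likely positions, every wrong guess crashes the box,
and the attacker has no other information leak. The function names
are illustrative only.

```python
from fractions import Fraction


def expected_attempts_static(slots):
    """Layout is fixed across reboots, so wrong guesses are never
    repeated. The k-th attempt is the successful one with probability
    1/slots, which gives a mean of (slots + 1) / 2."""
    return sum(Fraction(k, slots) for k in range(1, slots + 1))


def expected_attempts_rerandomized(slots):
    """Layout changes on every reboot, so each attempt succeeds with
    probability 1/slots independently (geometric distribution)."""
    return Fraction(slots)


slots = 128  # assumed number of possible positions
print(float(expected_attempts_static(slots)))        # 64.5
print(float(expected_attempts_rerandomized(slots)))  # 128.0
```

Under these assumptions boot-time randomization roughly doubles the
expected number of crashes an attacker has to cause.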

Thanks,

	Ingo



* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27 18:17                 ` Ingo Molnar
@ 2011-05-27 18:43                   ` Kees Cook
  2011-05-27 18:48                   ` david
  2011-05-27 21:51                   ` Olivier Galibert
  2 siblings, 0 replies; 95+ messages in thread
From: Kees Cook @ 2011-05-27 18:43 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Dan Rosenberg, Rafael J. Wysocki, Tony Luck,
	linux-kernel, davej, davem, eranian, adobriyan, penberg, hpa,
	Arjan van de Ven, Andrew Morton, Valdis.Kletnieks, pageexec

On Fri, May 27, 2011 at 08:17:24PM +0200, Ingo Molnar wrote:
>  - Boot time dynamic randomization allows randomization of 'mass 
>    install' systems, where the same image is used, to still be 
>    randomized: for example a million phones all with the same Flash 
>    ROM image and no 'install' performed at all on them.
> 
>    With static randomization these systems will all have the same
>    kernel addresses.
> 
>  - Boot time dynamic randomization allows read-only systems to still 
>    be randomized: for example internet cafes that use some popular 
>    pre-packaged kiosk-mode live-DVD. They probably won't bother 
>    randomizing and relinking the ISOs per machine and burning per 
>    machine DVDs ...

These 2 points are pretty significant, IMO.

And frankly, distros almost fall into these categories already. IIUC,
a distro would need to ship all of the .o files from each config of the
kernel they ship so each system could do the relinking. That's not a
small foot print to suddenly add to base installs.

-Kees

-- 
Kees Cook
Ubuntu Security Team


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27 18:17                 ` Ingo Molnar
  2011-05-27 18:43                   ` Kees Cook
@ 2011-05-27 18:48                   ` david
  2011-05-27 21:51                   ` Olivier Galibert
  2 siblings, 0 replies; 95+ messages in thread
From: david @ 2011-05-27 18:48 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Dan Rosenberg, Rafael J. Wysocki, Tony Luck,
	linux-kernel, davej, kees.cook, davem, eranian, adobriyan,
	penberg, hpa, Arjan van de Ven, Andrew Morton, Valdis.Kletnieks,
	pageexec

On Fri, 27 May 2011, Ingo Molnar wrote:

I don't think these two new images are as important as you are tagging 
them. I would put them down with the 'protect the system from root' type 
of issues.

> - Boot time dynamic randomization allows randomization of 'mass
>   install' systems, where the same image is used, to still be
>   randomized: for example a million phones all with the same Flash
>   ROM image and no 'install' performed at all on them.
>
>   With static randomization these systems will all have the same
>   kernel addresses.

there is already a need to be able to customize these systems on an 
individual system basis (think SSL certs or ssh keys for example)

yes, this makes it a little more difficult than just 'drop this image bit 
for bit on the system', but it's not that hard to set up a 'the first time 
you boot do this stuff then reboot' step, and that step can do the 
'install time' stuff.

> - Boot time dynamic randomization allows read-only systems to still
>   be randomized: for example internet cafes that use some popular
>   pre-packaged kiosk-mode live-DVD. They probably won't bother
>   randomizing and relinking the ISOs per machine and burning per
>   machine DVDs ...

this matters a little bit more because a script to create a custom DVD 
image on the fly is more difficult.

however, I think this is a significantly less important target, 
specifically because these are read-only system images.

but if someone really cares about this, they just need to create a stack 
of slightly different DVDs. if this can be batched up and automated it's 
not that big a deal. the DVDs don't really need to be per-machine, just a 
variety of them.

David Lang


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27 18:05                 ` Linus Torvalds
@ 2011-05-27 19:15                   ` Vivek Goyal
  2011-05-27 21:37                   ` H. Peter Anvin
  2011-05-28 12:18                   ` Ingo Molnar
  2 siblings, 0 replies; 95+ messages in thread
From: Vivek Goyal @ 2011-05-27 19:15 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: H. Peter Anvin, Ingo Molnar, Dan Rosenberg, Rafael J. Wysocki,
	Tony Luck, linux-kernel, davej, kees.cook, davem, eranian,
	adobriyan, penberg, Arjan van de Ven, Andrew Morton,
	Valdis.Kletnieks, pageexec

On Fri, May 27, 2011 at 11:05:07AM -0700, Linus Torvalds wrote:
> On Fri, May 27, 2011 at 10:53 AM, H. Peter Anvin <hpa@zytor.com> wrote:
> >
> > That doesn't solve any problems with the memory map.
> 
> Actually, it does.
> 
> You can load the kernel at the same virtual address we always load it,
> and/or perhaps shift it up by just small amounts (ie "single pages"
> rather than "ten bits worth of pages")
> 
> And then rely on the fact that you mixed up symbols in other ways.
> 
> "Look ma, no need to worry about memory map". At least no more than we do now.
> 
> Put another way: think about our /proc/iomem right now:
> 
>   00100000-bdc6ffff : System RAM
>     01000000-016bdced : Kernel code
>     016bdcee-01ca8b7f : Kernel data
>     01d36000-01de2fff : Kernel bss
> 
> with the "shift kernel up at load-time", the above information is
> suddenly very scary, because the "Kernel code" part is magically
> important.
> 
> In contrast, if your randomization depends on just relinking things a
> bit differently, you don't really give out any of the random
> information in /proc/iomem. Nor does it affect the load address and
> the e820 memory map.
> 
> And, in fact, it does give you way more bits of randomness to play
> around with the text addresses.

I am wondering what happens to crash analysis tools if per system 
virtual addresses are shifted by some offset. I guess tools like
"crash" can adjust to this by looking at vmcore ELF headers but
I think gdb does not expect change of virtual addresses.

That would essentially mean that, apart from the vmcore, one would also
have to store the vmlinux file from the crashed system. Currently we don't
have to save vmlinux. In fact for analysis we can install distro provided
debug compiled vmlinux later and just need to get the vmcore file from
crashed system and do the analysis.

So IIUC, with above model, I guess "crash" should be able to adjust
to it quickly but gdb will have issues.
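
For the simple "uniform shift" case, the adjustment a tool would need
is small. The sketch below assumes the shift is known (for example
recorded in the dump) and that a System.map style symbol list from
the unshifted build is available. The symbol names and addresses are
invented for illustration.

```python
import bisect

# Hypothetical System.map excerpt from the unshifted (link-time) build.
SAMPLE_MAP = """\
ffffffff81000000 T example_start
ffffffff81000100 T example_read
ffffffff81000200 T example_write
"""


def load_symbols(system_map_text):
    """Parse 'address type name' lines into a sorted (addr, name) list."""
    symbols = []
    for line in system_map_text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(runtime_addr, symbols, shift):
    """Map an address seen in the crashed kernel back to a symbol.

    Returns (name, offset_into_symbol), or None if the address lies
    below the first symbol."""
    link_addr = runtime_addr - shift
    addrs = [addr for addr, _ in symbols]
    index = bisect.bisect_right(addrs, link_addr) - 1
    if index < 0:
        return None
    base, name = symbols[index]
    return name, link_addr - base


symbols = load_symbols(SAMPLE_MAP)
shift = 0x400000  # assumed per-boot offset
print(resolve(0xFFFFFFFF81000120 + shift, symbols, shift))
```

This only covers a single offset applied to the whole image. If
functions are reordered at relink time, a per-symbol table from the
crashed system would be needed instead.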

Thanks
Vivek


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27 18:05                 ` Linus Torvalds
  2011-05-27 19:15                   ` Vivek Goyal
@ 2011-05-27 21:37                   ` H. Peter Anvin
  2011-05-27 23:51                     ` H. Peter Anvin
  2011-05-28 12:18                   ` Ingo Molnar
  2 siblings, 1 reply; 95+ messages in thread
From: H. Peter Anvin @ 2011-05-27 21:37 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ingo Molnar, Dan Rosenberg, Rafael J. Wysocki, Tony Luck,
	linux-kernel, davej, kees.cook, davem, eranian, adobriyan,
	penberg, Arjan van de Ven, Andrew Morton, Valdis.Kletnieks,
	pageexec

On 05/27/2011 11:05 AM, Linus Torvalds wrote:
> 
> You can load the kernel at the same virtual address we always load it,
> and/or perhaps shift it up by just small amounts (ie "single pages"
> rather than "ten bits worth of pages")
> 
> And then rely on the fact that you mixed up symbols in other ways.
> 

OK, here is a bat-shit-crazy idea... an all-module kernel where nothing
except init code is prelinked at all.

If we could modularize the core code we could have init code load the
modules at all kinds of random addresses; they wouldn't even need to be
contiguous in memory, and since we'd have full access to the memory
layout at that point, we can randomize the **** out of *everything*.

	-hpa


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27 18:17                 ` Ingo Molnar
  2011-05-27 18:43                   ` Kees Cook
  2011-05-27 18:48                   ` david
@ 2011-05-27 21:51                   ` Olivier Galibert
  2011-05-27 22:11                     ` Valdis.Kletnieks
                                       ` (2 more replies)
  2 siblings, 3 replies; 95+ messages in thread
From: Olivier Galibert @ 2011-05-27 21:51 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Dan Rosenberg, Rafael J. Wysocki, Tony Luck,
	linux-kernel, davej, kees.cook, davem, eranian, adobriyan,
	penberg, hpa, Arjan van de Ven, Andrew Morton, Valdis.Kletnieks,
	pageexec

On Fri, May 27, 2011 at 08:17:24PM +0200, Ingo Molnar wrote:
>  - A root exploit will still not give away the location of the
>    kernel (assuming module loading has been disabled after bootup),
>    so a rootkit cannot be installed 'silently' on the system, into
>    RAM only, evading most offline-storage-checking tools.
> 
>    With static linking this is not possible: reading the kernel image
>    as root trivially exposes the kernel's location.

There's something I don't get there.  If you managed to escalate your
privileges enough that you have physical ram access, there's a
billion things you can do to find the kernel, including vector
tracing, pattern matching, looking at the page tables, etc.

What am I missing?

  OG.


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27 21:51                   ` Olivier Galibert
@ 2011-05-27 22:11                     ` Valdis.Kletnieks
  2011-05-28  0:50                     ` H. Peter Anvin
  2011-05-28  6:32                     ` Ingo Molnar
  2 siblings, 0 replies; 95+ messages in thread
From: Valdis.Kletnieks @ 2011-05-27 22:11 UTC (permalink / raw)
  To: Olivier Galibert
  Cc: Ingo Molnar, Linus Torvalds, Dan Rosenberg, Rafael J. Wysocki,
	Tony Luck, linux-kernel, davej, kees.cook, davem, eranian,
	adobriyan, penberg, hpa, Arjan van de Ven, Andrew Morton,
	pageexec


On Fri, 27 May 2011 23:51:23 +0200, Olivier Galibert said:
> On Fri, May 27, 2011 at 08:17:24PM +0200, Ingo Molnar wrote:
> >  - A root exploit will still not give away the location of the
> >    kernel (assuming module loading has been disabled after bootup),
> >    so a rootkit cannot be installed 'silently' on the system, into
> >    RAM only, evading most offline-storage-checking tools.
> > 
> >    With static linking this is not possible: reading the kernel image
> >    as root trivially exposes the kernel's location.
> 
> There's something I don't get there.  If you managed to escalate your
> privileges enough that you have physical ram access, there's a
> billion things you can do to find the kernel, including vector
> tracing, pattern matching, looking at the page tables, etc.

Oh, you mean all the tricks that people do now to patch the syscall table
once we hid it so they couldn't patch it? :)



* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27 21:37                   ` H. Peter Anvin
@ 2011-05-27 23:51                     ` H. Peter Anvin
  0 siblings, 0 replies; 95+ messages in thread
From: H. Peter Anvin @ 2011-05-27 23:51 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ingo Molnar, Dan Rosenberg, Rafael J. Wysocki, Tony Luck,
	linux-kernel, davej, kees.cook, davem, eranian, adobriyan,
	penberg, Arjan van de Ven, Andrew Morton, Valdis.Kletnieks,
	pageexec

On 05/27/2011 02:37 PM, H. Peter Anvin wrote:
> On 05/27/2011 11:05 AM, Linus Torvalds wrote:
>>
>> You can load the kernel at the same virtual address we always load it,
>> and/or perhaps shift it up by just small amounts (ie "single pages"
>> rather than "ten bits worth of pages")
>>
>> And then rely on the fact that you mixed up symbols in other ways.
>>
> 
> OK, here is a bat-shit-crazy idea... an all-module kernel where nothing
> except init code is prelinked at all.
> 
> If we could modularize the core code we could have init code load the
> modules at all kinds of random addresses; they wouldn't even need to be
> contiguous in memory, and since we'd have full access to the memory
> layout at that point, we can randomize the **** out of *everything*.
> 

Thinking about it some more, it might not be that crazy.  Consider the
following notion: the kernel payload, as delivered by the decompressor,
contains the init code, plus a set of modules, which can be ELF modules,
but don't have to be (but since we already have code to load and link
ELF modules it is probably the best choice.)

After we initialize the system enough to have a memory map, we can pick
a random place for each module, copy it in place, fix up the
relocations, and free the original location.

If we are exceptionally clever, which of course we are, we could even
have these modules linked to their initial location and fix up
references in running code, that way init code could still call module
code, as long as it doesn't stash away pointers to module data.
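
A toy model of the "pick a random place, copy, fix up relocations"
step, as a sketch only. It assumes a single free region with plenty
of room (the retry loop would spin if memory were nearly full), page
granularity, and relocations that are plain absolute pointers. None
of the names correspond to existing kernel code.

```python
import random

PAGE = 4096


def round_up_pages(size):
    """Round a byte count up to a whole number of pages."""
    return -(-size // PAGE) * PAGE


def place_modules(sizes, free_start, free_end, rng):
    """Pick a random, page-aligned, non-overlapping base per module.

    sizes maps module name -> size in bytes; free_start must be
    page-aligned. Returns {name: base}."""
    placed = []  # (base, end) of modules placed so far
    bases = {}
    for name, size in sizes.items():
        span = round_up_pages(size)
        slots = (free_end - free_start - span) // PAGE
        while True:
            base = free_start + rng.randrange(slots + 1) * PAGE
            if all(base + span <= b or base >= e for b, e in placed):
                break  # no overlap with anything placed earlier
        placed.append((base, base + span))
        bases[name] = base
    return bases


def relocate(pointers, old_base, new_base):
    """Fix up absolute pointers after moving a module."""
    delta = new_base - old_base
    return [p + delta for p in pointers]


rng = random.Random(1)  # fixed seed only so the demo is repeatable
layout = place_modules({"sched": 10000, "vfs": 50000, "net": 4096},
                       0x1000000, 0x2000000, rng)
for name, base in layout.items():
    print(f"{name}: {base:#x}")
```

A real implementation would draw from a hardware or boot-time entropy
source rather than a seeded generator.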

	-hpa


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27 21:51                   ` Olivier Galibert
  2011-05-27 22:11                     ` Valdis.Kletnieks
@ 2011-05-28  0:50                     ` H. Peter Anvin
  2011-05-28  6:32                     ` Ingo Molnar
  2 siblings, 0 replies; 95+ messages in thread
From: H. Peter Anvin @ 2011-05-28  0:50 UTC (permalink / raw)
  To: Olivier Galibert
  Cc: Ingo Molnar, Linus Torvalds, Dan Rosenberg, Rafael J. Wysocki,
	Tony Luck, linux-kernel, davej, kees.cook, davem, eranian,
	adobriyan, penberg, Arjan van de Ven, Andrew Morton,
	Valdis.Kletnieks, pageexec

On 05/27/2011 02:51 PM, Olivier Galibert wrote:
> On Fri, May 27, 2011 at 08:17:24PM +0200, Ingo Molnar wrote:
>>  - A root exploit will still not give away the location of the
>>    kernel (assuming module loading has been disabled after bootup),
>>    so a rootkit cannot be installed 'silently' on the system, into
>>    RAM only, evading most offline-storage-checking tools.
>>
>>    With static linking this is not possible: reading the kernel image
>>    as root trivially exposes the kernel's location.
> 
> There's something I don't get there.  If you managed to escalate your
> privileges enough that you have physical ram access, there's a
> billion things you can do to find the kernel, including vector
> tracing, pattern matching, looking at the page tables, etc.
> 
> What am I missing?
> 

Just makes it harder to automate an attack, and more likely that it will
fail.  It's an arms race, of course.

	-hpa


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27 21:51                   ` Olivier Galibert
  2011-05-27 22:11                     ` Valdis.Kletnieks
  2011-05-28  0:50                     ` H. Peter Anvin
@ 2011-05-28  6:32                     ` Ingo Molnar
  2 siblings, 0 replies; 95+ messages in thread
From: Ingo Molnar @ 2011-05-28  6:32 UTC (permalink / raw)
  To: Olivier Galibert
  Cc: Linus Torvalds, Dan Rosenberg, Rafael J. Wysocki, Tony Luck,
	linux-kernel, davej, kees.cook, davem, eranian, adobriyan,
	penberg, hpa, Arjan van de Ven, Andrew Morton, Valdis.Kletnieks,
	pageexec


* Olivier Galibert <galibert@pobox.com> wrote:

> On Fri, May 27, 2011 at 08:17:24PM +0200, Ingo Molnar wrote:
> >  - A root exploit will still not give away the location of the
> >    kernel (assuming module loading has been disabled after bootup),
> >    so a rootkit cannot be installed 'silently' on the system, into
> >    RAM only, evading most offline-storage-checking tools.
> > 
> >    With static linking this is not possible: reading the kernel image
> >    as root trivially exposes the kernel's location.
> 
> There's something I don't get there.  If you managed to escalate your
> privileges enough that you have physical ram access, there's a
> billion things you can do to find the kernel, including vector
> tracing, pattern matching, looking at the page tables, etc.
>
> What am I missing?

You are missing that it's not unrealistic to make the
"root does not have physical RAM access" condition true
on a system.

CONFIG_STRICT_DEVMEM=y will go a long way already, enabled
on most distros these days:

 $ grep DEVMEM $(rpm -ql kernel-2.6.38-0.rc7.git2.3.fc16.x86_64 | grep boot/config)
 CONFIG_STRICT_DEVMEM=y

Combined with:

 echo 1 > /proc/sys/kernel/modules_disabled

( Which cannot be turned back on once turned off after essential 
  modules have loaded. )
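
The two conditions above can be checked mechanically. This is a small
sketch that works on text already read from the kernel config file
and from /proc/sys/kernel/modules_disabled; where the config file
lives varies by distro, so the caller supplies the contents. The
function names are illustrative.

```python
def config_enabled(config_text, option):
    """True if the kernel config text contains 'OPTION=y'."""
    return any(line.strip() == f"{option}=y"
               for line in config_text.splitlines())


def modules_disabled(sysctl_text):
    """True if the modules_disabled sysctl reads back as 1."""
    return sysctl_text.strip() == "1"


def ram_access_restricted(config_text, sysctl_text):
    """Both conditions from the message: strict /dev/mem and no
    further module loading. Necessary, not sufficient."""
    return (config_enabled(config_text, "CONFIG_STRICT_DEVMEM")
            and modules_disabled(sysctl_text))


sample_config = "CONFIG_X86_64=y\nCONFIG_STRICT_DEVMEM=y\n"
print(ram_access_restricted(sample_config, "1\n"))  # True
print(ram_access_restricted(sample_config, "0\n"))  # False
```

A True result only covers these two vectors; the remaining ways to
reach physical RAM still have to be closed separately.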

Admins do not actually need access to physical RAM, nor do they need
the ability to binary patch kernel code, so it's not unrealistic to
do this in distros.

There can be a few more vectors to access physical RAM but they can 
be controlled as well.

This will already force a reboot (or a wait for a regular reboot) by 
the attacker to install rootkit level code.

But yes, if root controls RAM then it's obviously game over: even 
with randomization RAM can be scanned for kernel image signatures, 
kernel code can be inserted or system call table patched - q.e.d.

Thanks,

	Ingo


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-27 18:05                 ` Linus Torvalds
  2011-05-27 19:15                   ` Vivek Goyal
  2011-05-27 21:37                   ` H. Peter Anvin
@ 2011-05-28 12:18                   ` Ingo Molnar
  2011-05-29  1:13                     ` H. Peter Anvin
  2 siblings, 1 reply; 95+ messages in thread
From: Ingo Molnar @ 2011-05-28 12:18 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: H. Peter Anvin, Dan Rosenberg, Rafael J. Wysocki, Tony Luck,
	linux-kernel, davej, kees.cook, davem, eranian, adobriyan,
	penberg, Arjan van de Ven, Andrew Morton, Valdis.Kletnieks,
	pageexec


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Fri, May 27, 2011 at 10:53 AM, H. Peter Anvin <hpa@zytor.com> wrote:
> >
> > That doesn't solve any problems with the memory map.
> 
> Actually, it does.
> 
> You can load the kernel at the same virtual address we always load 
> it, and/or perhaps shift it up by just small amounts (ie "single 
> pages" rather than "ten bits worth of pages")

Note that if we do not limit it to just 'a few pages' then padding 
the randomization space into the kernel image:

 *also solves the memory map problem in the dynamic randomization case*

Having half a megabyte of '__init buffer' at the beginning or end of 
the kernel image is no big deal, it's more than enough for good 
randomization and makes the whole thing image-loader invariant: we 
can freely shift the 'real' kernel image within this larger boundary 
without consulting RAM maps.
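
As a sketch of what shifting within such a buffer could look like:
assuming page-granular placement (the granularity is my assumption,
it is not specified above), half a megabyte of padding gives 128
possible positions. Python's `secrets` module stands in for the
RDRAND/RDTSC entropy source of the original patch.

```python
import secrets

PAGE_SIZE = 4096
PAD_BYTES = 512 * 1024  # "half a megabyte of '__init buffer'"


def slot_count(pad_bytes=PAD_BYTES, align=PAGE_SIZE):
    """Number of distinct aligned positions inside the padding."""
    return pad_bytes // align


def pick_offset(pad_bytes=PAD_BYTES, align=PAGE_SIZE):
    """Choose a random aligned shift in [0, pad_bytes)."""
    return secrets.randbelow(slot_count(pad_bytes, align)) * align


print(slot_count())          # 128
print(hex(pick_offset()))    # differs on every run
```

Because the shift never leaves the padding that is already part of
the image, no memory map lookup is needed to know the target is RAM.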

And yes, you are right that smarter randomization like reordering of 
functions is probably more feasible with a static method - but i'm 
not sure we'd like to reorder functions: they are often ordered by 
importance within .c files, hence they are often ordered by cache 
hotness, so keeping them together makes sense to optimize icache 
footprint.

Further note that should anyone want to randomize the kernel position 
within a larger range, memory maps can still be consulted - but 
that's an optional enhancement, not a design requirement.

Note that such a larger range of randomization is not possible with 
the static install-time randomization method, as it needs to consult 
the memory maps on bootup.

So while i agree with you that install-time randomization has unique 
properties, i do not agree that all of those unique properties are 
advantages and thus i do not think that the case for static 
randomization is nearly as clear-cut as you made it appear.

Furthermore, the two main complications of dynamic randomizations 
that you highlighted are not really fundamental complications IMO:

 - the memory map consultation complexity can be completely eliminated
   in the dynamic randomization case as well

 - the hibernation complication is overstated i think: if on
   hibernation we save the randomization offset then the thawed 
   kernel can load at the very same address. [ We have no other
   choice anyway, pointers to the kernel image are stored all
   over the frozen image. ]

   This skips re-randomization across hibernation but that's ok:
   it's the functional equivalent of suspend-to-RAM.

Btw., there's another advantage of kernel image randomization in 
general that i have not mentioned before:

 - in addition to randomizing the kernel load physical image address, 
   on 64-bit x86 we could independently randomize the *virtual* 
   address of the kernel as well: within a rather large, 2GB address 
   space.

   This makes the very first step of buffer overflow (and pointer 
   overwrite) attacks very hard: they'd have to find the right 
   executable needle within a 2GB haystack.
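
How big the haystack is depends on the alignment the kernel text has
to keep, which is not stated above. A quick calculation for two
assumed granularities (4KB pages and 2MB large pages):

```python
import math

RANGE_BYTES = 2 * 1024 ** 3  # the 2GB virtual window


def positions(range_bytes, align):
    """Number of aligned start addresses in the window."""
    return range_bytes // align


def entropy_bits(range_bytes, align):
    """log2 of the number of positions."""
    return math.log2(positions(range_bytes, align))


for label, align in (("4KB", 4096), ("2MB", 2 * 1024 ** 2)):
    n = positions(RANGE_BYTES, align)
    bits = entropy_bits(RANGE_BYTES, align)
    print(f"{label}: {n} positions, {bits:.0f} bits, "
          f"single-guess odds 1/{n}")
```

These figures ignore the size of the kernel image itself, which
slightly reduces the number of usable start addresses.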

Combined with SMEP this needle is the *only* place where a kernel 
mode exploit can execute. [*]

This kind of large-scale virtual address randomization could be 
performed both dynamically (boot time) and statically (install time).

Thanks,

	Ingo

[*] Assuming we get around to sorting out the first 1MB compatibility
    constraints that force us to turn off NX there currently, and 
    review the pagetables for all remaining system mode mapped 
    executable pages.


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-28 12:18                   ` Ingo Molnar
@ 2011-05-29  1:13                     ` H. Peter Anvin
  2011-05-29 12:47                       ` Ingo Molnar
  0 siblings, 1 reply; 95+ messages in thread
From: H. Peter Anvin @ 2011-05-29  1:13 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Dan Rosenberg, Rafael J. Wysocki, Tony Luck,
	linux-kernel, davej, kees.cook, davem, eranian, adobriyan,
	penberg, Arjan van de Ven, Andrew Morton, Valdis.Kletnieks,
	pageexec

On 05/28/2011 05:18 AM, Ingo Molnar wrote:
> 
> Having half a megabyte of '__init buffer' at the beginning or end of 
> the kernel image is no big deal, it's more than enough for good 
> randomization and makes the whole thing image-loader invariant: we 
> can freely shift the 'real' kernel image within this larger boundary 
> without consulting RAM maps.
> 

Sure, but you're also blowing any attempt at PMD alignment to kingdom come.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-29  1:13                     ` H. Peter Anvin
@ 2011-05-29 12:47                       ` Ingo Molnar
  2011-05-29 18:19                         ` H. Peter Anvin
  0 siblings, 1 reply; 95+ messages in thread
From: Ingo Molnar @ 2011-05-29 12:47 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Linus Torvalds, Dan Rosenberg, Rafael J. Wysocki, Tony Luck,
	linux-kernel, davej, kees.cook, davem, eranian, adobriyan,
	penberg, Arjan van de Ven, Andrew Morton, Valdis.Kletnieks,
	pageexec


* H. Peter Anvin <hpa@zytor.com> wrote:

> On 05/28/2011 05:18 AM, Ingo Molnar wrote:
> > 
> > Having half a megabyte of '__init buffer' at the beginning or end 
> > of the kernel image is no big deal, it's more than enough for 
> > good randomization and makes the whole thing image-loader 
> > invariant: we can freely shift the 'real' kernel image within 
> > this larger boundary without consulting RAM maps.
> 
> Sure, but you're also blowing any attempt at PMD alignment to 
> kingdom come.

Do you mean we'd not start at a 2MB boundary and thus would waste, on 
average, about 0.125 of a huge-TLB cache entry?

That does not look like a very big issue to me - but maybe i'm 
missing something and you mean something else.

Thanks,

	Ingo

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-29 12:47                       ` Ingo Molnar
@ 2011-05-29 18:19                         ` H. Peter Anvin
  2011-05-29 18:44                           ` Ingo Molnar
  0 siblings, 1 reply; 95+ messages in thread
From: H. Peter Anvin @ 2011-05-29 18:19 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Dan Rosenberg, Rafael J. Wysocki, Tony Luck,
	linux-kernel, davej, kees.cook, davem, eranian, adobriyan,
	penberg, Arjan van de Ven, Andrew Morton, Valdis.Kletnieks,
	pageexec

On 05/29/2011 05:47 AM, Ingo Molnar wrote:
> 
> Do you mean we'd not start at a 2MB boundary and thus would waste, on 
> average, about 0.125 of a huge-TLB cache entry?
> 
> That does not look like a very big issue to me - but maybe i'm 
> missing something and you mean something else.
> 

The problem is that because of the misalignment, and whatever falls on
the other side of that memory boundary we might end up having to
fracture the 2 MB page into 4K pages.
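
A small sketch of the effect, assuming a simplified model in which a 
region with uniform attributes may use a 2MB mapping only for the 2MB 
frames it covers completely, and 4K pages for any partial head or tail 
(because the neighbour on the other side of the boundary needs 
different attributes). The names are hypothetical and this is not 
kernel code.

```c
#include <assert.h>
#include <stdint.h>

#define HUGE_BYTES  (2ULL << 20)   /* one 2MB mapping */
#define SMALL_BYTES 4096ULL        /* one 4K mapping  */

struct mapping_cost {
    uint64_t huge_pages;    /* 2MB mappings used */
    uint64_t small_pages;   /* 4K mappings used  */
};

/* Cost of mapping [start, end); both bounds are assumed 4K-aligned. */
static struct mapping_cost map_region(uint64_t start, uint64_t end)
{
    struct mapping_cost c = { 0, 0 };
    uint64_t huge_start, huge_end;

    assert(start <= end);

    /* First and last 2MB boundaries inside the region. */
    huge_start = (start + HUGE_BYTES - 1) & ~(HUGE_BYTES - 1);
    huge_end = end & ~(HUGE_BYTES - 1);

    if (huge_start >= huge_end) {
        /* No 2MB frame is fully covered: everything is 4K pages. */
        c.small_pages = (end - start) / SMALL_BYTES;
        return c;
    }

    c.huge_pages = (huge_end - huge_start) / HUGE_BYTES;
    c.small_pages = ((huge_start - start) + (end - huge_end)) / SMALL_BYTES;
    return c;
}
```

In this model a 4MB region starting on a 2MB boundary costs two 2MB 
mappings, while the same region shifted up by 512K costs one 2MB 
mapping plus 512 4K pages.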

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-29 18:19                         ` H. Peter Anvin
@ 2011-05-29 18:44                           ` Ingo Molnar
  2011-05-29 18:52                             ` H. Peter Anvin
  0 siblings, 1 reply; 95+ messages in thread
From: Ingo Molnar @ 2011-05-29 18:44 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Linus Torvalds, Dan Rosenberg, Rafael J. Wysocki, Tony Luck,
	linux-kernel, davej, kees.cook, davem, eranian, adobriyan,
	penberg, Arjan van de Ven, Andrew Morton, Valdis.Kletnieks,
	pageexec


* H. Peter Anvin <hpa@zytor.com> wrote:

> On 05/29/2011 05:47 AM, Ingo Molnar wrote:
> > 
> > Do you mean we'd not start at a 2MB boundary and thus would waste, on 
> > average, about 0.125 of a huge-TLB cache entry?
> > 
> > That does not look like a very big issue to me - but maybe i'm 
> > missing something and you mean something else.
> > 
> 
> The problem is that because of the misalignment, and whatever falls 
> on the other side of that memory boundary we might end up having to 
> fracture the 2 MB page into 4K pages.

We already have that kind of fragmentation anyway, due to NX and due 
to the readonly area. Randomization does not really make that 
situation much worse.

But the thing is, we could fully eliminate all those disadvantages on 
64-bit x86:

We could put a 2MB hole between end of text (end of X) and start of 
readonly data (start of NX), and another 2MB hole between end of 
readonly and start of data.

That way we'd have:

 - the low alias is mapped NX as well, so the whole area and 
   surrounding pages are 2MB aligned. The 'holes' are freed up as 
   __initmem so not wasted.

 - the high alias will have three areas:

    - the    text area, which is 2MB mapped as X
    - the ro-data area, which is 2MB mapped as NX-RO
    - the    data area, which is 2MB mapped as NX-RW

because there's at least 2MB of distance between end of text and 
start of data there's a guarantee that both will be fully 2MB mapped.
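
One way to read the layout above, as a rough sketch: pad each section 
so that the next one starts on a 2MB boundary. Treating the "hole" as 
padding up to the next boundary is my interpretation, and the struct 
and function names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define ALIGN_2M (2ULL << 20)

struct image_layout {
    uint64_t text_start;    /* mapped X     */
    uint64_t rodata_start;  /* mapped NX-RO */
    uint64_t data_start;    /* mapped NX-RW */
    uint64_t end;           /* first byte past the padded image */
};

/* Round x up to the next 2MB boundary. */
static uint64_t align_up_2m(uint64_t x)
{
    return (x + ALIGN_2M - 1) & ~(ALIGN_2M - 1);
}

/* Place text, ro-data and data so that each starts 2MB-aligned. */
static struct image_layout lay_out(uint64_t base, uint64_t text_size,
                                   uint64_t rodata_size, uint64_t data_size)
{
    struct image_layout l;

    l.text_start = align_up_2m(base);
    l.rodata_start = align_up_2m(l.text_start + text_size);
    l.data_start = align_up_2m(l.rodata_start + rodata_size);
    l.end = align_up_2m(l.data_start + data_size);

    assert(l.end >= base);  /* catch address wrap-around */
    return l;
}
```

With every section boundary on a 2MB line, no 2MB frame holds two 
different permission sets, so no mapping has to be split for that 
reason. The cost is at most just under 2MB of padding per section.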

Btw., we might want to do this regardless of randomization, for 
performance reasons: right now the NX and readonly area fragments the 
2MB mapping around the kernel text already, into 4K mappings.

Hm?

Thanks,

	Ingo

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-29 18:44                           ` Ingo Molnar
@ 2011-05-29 18:52                             ` H. Peter Anvin
  2011-05-29 19:56                               ` Ingo Molnar
  0 siblings, 1 reply; 95+ messages in thread
From: H. Peter Anvin @ 2011-05-29 18:52 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Dan Rosenberg, Rafael J. Wysocki, Tony Luck,
	linux-kernel, davej, kees.cook, davem, eranian, adobriyan,
	penberg, Arjan van de Ven, Andrew Morton, Valdis.Kletnieks,
	pageexec

On 05/29/2011 11:44 AM, Ingo Molnar wrote:
> 
> We could put a 2MB hole between end of text (end of X) and start of 
> readonly data (start of NX), and another 2MB hole between end of 
> readonly and start of data.
> 

It still means you have memory which is X-mapped when it doesn't need to
be, since there will be RAM in that region.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-29 18:52                             ` H. Peter Anvin
@ 2011-05-29 19:56                               ` Ingo Molnar
  0 siblings, 0 replies; 95+ messages in thread
From: Ingo Molnar @ 2011-05-29 19:56 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Linus Torvalds, Dan Rosenberg, Rafael J. Wysocki, Tony Luck,
	linux-kernel, davej, kees.cook, davem, eranian, adobriyan,
	penberg, Arjan van de Ven, Andrew Morton, Valdis.Kletnieks,
	pageexec


* H. Peter Anvin <hpa@zytor.com> wrote:

> On 05/29/2011 11:44 AM, Ingo Molnar wrote:
> > 
> > We could put a 2MB hole between end of text (end of X) and start of 
> > readonly data (start of NX), and another 2MB hole between end of 
> > readonly and start of data.
> > 
> 
> It still means you have memory which is X-mapped when it doesn't need to
> be, since there will be RAM in that region.

But it ought to be rather harmless in this particular case, because 
the high alias addresses are all randomized!

Thanks,

	Ingo

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-26 20:39 ` Dan Rosenberg
  2011-05-27  7:15   ` Ingo Molnar
@ 2011-05-31 16:52   ` Matthew Garrett
  2011-05-31 18:40     ` H. Peter Anvin
  1 sibling, 1 reply; 95+ messages in thread
From: Matthew Garrett @ 2011-05-31 16:52 UTC (permalink / raw)
  To: Dan Rosenberg
  Cc: Tony Luck, linux-kernel, kees.cook, davej, torvalds, adobriyan,
	eranian, penberg, davem, Arjan van de Ven, hpa, Valdis.Kletnieks,
	Andrew Morton, pageexec, Ingo Molnar, Vivek Goyal

On Thu, May 26, 2011 at 04:39:27PM -0400, Dan Rosenberg wrote:

> 1. I'm nearly finished a first draft of code to parse the BIOS E820
> memory map to determine where it's safe to place the randomized kernel.
> This code accounts for overlapping regions, as well as potential
> conflicts in region types (free vs. reserved, etc.), in favor of
> non-free types.  The end result is, I'll have a reasonable upper bound.
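
For readers following along, the "in favor of non-free types" rule 
quoted above can be sketched roughly as below. This is not Dan's code 
(which was not posted in this message); the types and names are 
hypothetical, and the check is deliberately simplified: a range counts 
as safe only if a single free entry contains it and no non-free entry 
overlaps it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified memory-map entry (not the kernel's struct). */
enum region_type { REGION_FREE, REGION_NOT_FREE };

struct region {
    uint64_t start;
    uint64_t size;
    enum region_type type;
};

/* Does entry r overlap the half-open range [start, end)? */
static int overlaps(const struct region *r, uint64_t start, uint64_t end)
{
    return r->start < end && start < r->start + r->size;
}

/* Does entry r fully contain the half-open range [start, end)? */
static int contains(const struct region *r, uint64_t start, uint64_t end)
{
    return r->start <= start && end <= r->start + r->size;
}

/*
 * A range is safe if one free entry contains it and no non-free
 * entry overlaps it, so any conflict is resolved against "free".
 */
static int range_is_safe(const struct region *map, size_t n,
                         uint64_t start, uint64_t size)
{
    uint64_t end = start + size;
    int covered = 0;
    size_t i;

    assert(map != NULL || n == 0);

    for (i = 0; i < n; i++) {
        if (map[i].type == REGION_FREE) {
            if (contains(&map[i], start, end))
                covered = 1;
        } else if (overlaps(&map[i], start, end)) {
            return 0;   /* non-free wins every conflict */
        }
    }
    return covered;
}
```

Note that this only looks at the map it is given; it cannot know about 
regions that the map reports as available but that must not be touched, 
which is the problem raised in the reply below.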

The BIOS E820 map, or the kernel representation? In either case, this 
isn't going to work well with EFI. There are regions that will be marked 
as available in the E820 map that we *mustn't* touch until we've entered 
EFI virtual mode.

(This is, clearly, insane).

One other thing is that when we've entered EFI virtual mode we'll have 
remapped various parts of the EFI memory map into virtual address space. 
There's no way to update these mappings later. If we want kexec to work 
then there has to be a mechanism for ensuring that these mappings can be 
provided to the second kernel and for it to preserve them.

-- 
Matthew Garrett | mjg59@srcf.ucam.org

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-31 16:52   ` Matthew Garrett
@ 2011-05-31 18:40     ` H. Peter Anvin
  2011-05-31 18:51       ` Matthew Garrett
  0 siblings, 1 reply; 95+ messages in thread
From: H. Peter Anvin @ 2011-05-31 18:40 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: Dan Rosenberg, Tony Luck, linux-kernel, kees.cook, davej,
	torvalds, adobriyan, eranian, penberg, davem, Arjan van de Ven,
	Valdis.Kletnieks, Andrew Morton, pageexec, Ingo Molnar,
	Vivek Goyal

On 05/31/2011 09:52 AM, Matthew Garrett wrote:
> On Thu, May 26, 2011 at 04:39:27PM -0400, Dan Rosenberg wrote:
> 
>> 1. I'm nearly finished a first draft of code to parse the BIOS E820
>> memory map to determine where it's safe to place the randomized kernel.
>> This code accounts for overlapping regions, as well as potential
>> conflicts in region types (free vs. reserved, etc.), in favor of
>> non-free types.  The end result is, I'll have a reasonable upper bound.
> 
> The BIOS E820 map, or the kernel representation? In either case, this 
> isn't going to work well with EFI. There are regions that will be marked 
> as available in the E820 map that we *mustn't* touch until we've entered 
> EFI virtual mode.
> 
> (This is, clearly, insane).
> 

I believe we could (should!) mark them reserved, not available, in the
E820 map and free them later.

	-hpa

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-31 18:40     ` H. Peter Anvin
@ 2011-05-31 18:51       ` Matthew Garrett
  2011-05-31 19:03         ` Dan Rosenberg
  0 siblings, 1 reply; 95+ messages in thread
From: Matthew Garrett @ 2011-05-31 18:51 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Dan Rosenberg, Tony Luck, linux-kernel, kees.cook, davej,
	torvalds, adobriyan, eranian, penberg, davem, Arjan van de Ven,
	Valdis.Kletnieks, Andrew Morton, pageexec, Ingo Molnar,
	Vivek Goyal

On Tue, May 31, 2011 at 11:40:13AM -0700, H. Peter Anvin wrote:
> On 05/31/2011 09:52 AM, Matthew Garrett wrote:
> > The BIOS E820 map, or the kernel representation? In either case, this 
> > isn't going to work well with EFI. There are regions that will be marked 
> > as available in the E820 map that we *mustn't* touch until we've entered 
> > EFI virtual mode.
> > 
> > (This is, clearly, insane).
> > 
> 
> I believe we could (should!) mark them reserved, not available, in the
> E820 map and free them later.

That was my original approach, but it requires that the bootloader be 
modified and it turns out that it's a lot harder to hand reserved 
regions back to the OS than it is to just reserve it in-kernel. The 
complete inflexibility of e820 is massively unhelpful here. It's just 
not possible to represent all of the EFI memory map data in it.

-- 
Matthew Garrett | mjg59@srcf.ucam.org

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-31 18:51       ` Matthew Garrett
@ 2011-05-31 19:03         ` Dan Rosenberg
  2011-05-31 19:07           ` H. Peter Anvin
                             ` (2 more replies)
  0 siblings, 3 replies; 95+ messages in thread
From: Dan Rosenberg @ 2011-05-31 19:03 UTC (permalink / raw)
  To: Matthew Garrett
  Cc: H. Peter Anvin, Tony Luck, linux-kernel, kees.cook, davej,
	torvalds, adobriyan, eranian, penberg, davem, Arjan van de Ven,
	Valdis.Kletnieks, Andrew Morton, pageexec, Ingo Molnar,
	Vivek Goyal

On Tue, 2011-05-31 at 19:51 +0100, Matthew Garrett wrote:
> On Tue, May 31, 2011 at 11:40:13AM -0700, H. Peter Anvin wrote:
> > On 05/31/2011 09:52 AM, Matthew Garrett wrote:
> > > The BIOS E820 map, or the kernel representation? In either case, this 
> > > isn't going to work well with EFI. There are regions that will be marked 
> > > as available in the E820 map that we *mustn't* touch until we've entered 
> > > EFI virtual mode.
> > > 
> > > (This is, clearly, insane).
> > > 
> > 
> > I believe we could (should!) mark them reserved, not available, in the
> > E820 map and free them later.
> 
> That was my original approach, but it requires that the bootloader be 
> modified and it turns out that it's a lot harder to hand reserved 
> regions back to the OS than it is to just reserve it in-kernel. The 
> complete inflexibility of e820 is massively unhelpful here. It's just 
> not possible to represent all of the EFI memory map data in it.
> 

Just for the record, I've put this patch on hold until there's some more
consensus about whether boot-time randomization of the physical kernel
address is the best approach.  There are some other potential issues
that haven't been brought up yet publicly, such as the possibility of
local attackers performing cache timing attacks to find the kernel image
location at runtime, which may make traditional ASLR somewhat pointless
regardless (except in the case of remote attackers, I suppose).  Perhaps
HPA's suggestion of further modularizing the kernel would have some
advantages in this regard.

-Dan

> -- 
> Matthew Garrett | mjg59@srcf.ucam.org



* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-31 19:03         ` Dan Rosenberg
@ 2011-05-31 19:07           ` H. Peter Anvin
  2011-05-31 19:50           ` Ingo Molnar
  2011-05-31 19:55           ` Ingo Molnar
  2 siblings, 0 replies; 95+ messages in thread
From: H. Peter Anvin @ 2011-05-31 19:07 UTC (permalink / raw)
  To: Dan Rosenberg
  Cc: Matthew Garrett, Tony Luck, linux-kernel, kees.cook, davej,
	torvalds, adobriyan, eranian, penberg, davem, Arjan van de Ven,
	Valdis.Kletnieks, Andrew Morton, pageexec, Ingo Molnar,
	Vivek Goyal

On 05/31/2011 12:03 PM, Dan Rosenberg wrote:
> 
> Just for the record, I've put this patch on hold until there's some more
> consensus about whether boot-time randomization of the physical kernel
> address is the best approach.  There are some other potential issues
> that haven't been brought up yet publicly, such as the possibility of
> local attackers performing cache timing attacks to find the kernel image
> location at runtime, which may make traditional ASLR somewhat pointless
> regardless (except in the case of remote attackers, I suppose).  Perhaps
> HPA's suggestion of further modularizing the kernel would have some
> advantages in this regard.
> 

I'm probably going to implement the whole-image randomization as an
option in the Syslinux bootloader; it is a *lot* easier to do this
correctly in the bootloader.

	-hpa

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-31 19:03         ` Dan Rosenberg
  2011-05-31 19:07           ` H. Peter Anvin
@ 2011-05-31 19:50           ` Ingo Molnar
  2011-05-31 19:55           ` Ingo Molnar
  2 siblings, 0 replies; 95+ messages in thread
From: Ingo Molnar @ 2011-05-31 19:50 UTC (permalink / raw)
  To: Dan Rosenberg
  Cc: Matthew Garrett, H. Peter Anvin, Tony Luck, linux-kernel,
	kees.cook, davej, torvalds, adobriyan, eranian, penberg, davem,
	Arjan van de Ven, Valdis.Kletnieks, Andrew Morton, pageexec,
	Vivek Goyal


* Dan Rosenberg <drosenberg@vsecurity.com> wrote:

> [...] the possibility of local attackers performing cache timing 
> attacks to find the kernel image location at runtime, [...]

How would these work, roughly?

Thanks,

	Ingo

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-31 19:03         ` Dan Rosenberg
  2011-05-31 19:07           ` H. Peter Anvin
  2011-05-31 19:50           ` Ingo Molnar
@ 2011-05-31 19:55           ` Ingo Molnar
  2011-05-31 20:15             ` H. Peter Anvin
  2011-05-31 20:17             ` Dan Rosenberg
  2 siblings, 2 replies; 95+ messages in thread
From: Ingo Molnar @ 2011-05-31 19:55 UTC (permalink / raw)
  To: Dan Rosenberg
  Cc: Matthew Garrett, H. Peter Anvin, Tony Luck, linux-kernel,
	kees.cook, davej, torvalds, adobriyan, eranian, penberg, davem,
	Arjan van de Ven, Valdis.Kletnieks, Andrew Morton, pageexec,
	Vivek Goyal


* Dan Rosenberg <drosenberg@vsecurity.com> wrote:

> Just for the record, I've put this patch on hold until there's some 
> more consensus about whether boot-time randomization of the 
> physical kernel address is the best approach. [...]

Well, if you use the suggestion i made: to skip the e820 map fiddling 
altogether and just allocate half a megabyte of 'hole' at the end of 
the kernel image - which would allow the kernel to be randomized 
freely upwards by 0-128 pages - then the 'dynamic' versus 'static' 
solution could be used at once!
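
As a rough sketch of the arithmetic (hypothetical names, not taken 
from the patch): half a megabyte of hole at 4K granularity gives 
shifts of 0 to 128 pages, i.e. 129 possible offsets. Where the random 
value comes from is left open here.

```c
#include <assert.h>
#include <stdint.h>

#define HOLE_BYTES (512u * 1024u)  /* the half-megabyte tail hole */
#define PAGE_BYTES 4096u           /* 4K page granularity         */

/*
 * Map a raw random value to a page-aligned upward shift that fits in
 * the hole.  The plain modulo introduces a slight bias, which is
 * ignored in this sketch.
 */
static uint32_t hole_offset(uint32_t random_value)
{
    uint32_t slots;

    assert(HOLE_BYTES % PAGE_BYTES == 0);

    slots = HOLE_BYTES / PAGE_BYTES + 1;   /* shifts of 0..128 pages */
    return (random_value % slots) * PAGE_BYTES;
}
```

Because the offset never exceeds the size of the hole, the shifted 
image always stays inside the range the linker map already reserved, 
which is why no RAM map needs to be consulted.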

The 'static' method would use the same hole, just at install time, 
while the 'dynamic' method would use it during bootup.

Also, if this method is used then most of the controversy about the 
dynamic approach goes away (which was the memory maps interpretation 
fragility).

Your last patch would need only minor modifications to get the hole 
added: you'd need to add the tail-hole in the linker map:

   arch/x86/kernel/vmlinux.lds.S

So ... could you *please* not shelve this idea just because people 
used lkml for what it was invented for: arguing with each other rather 
forcefully? :-)

Thanks,

	Ingo

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-31 19:55           ` Ingo Molnar
@ 2011-05-31 20:15             ` H. Peter Anvin
  2011-05-31 20:27               ` Ingo Molnar
  2011-05-31 20:17             ` Dan Rosenberg
  1 sibling, 1 reply; 95+ messages in thread
From: H. Peter Anvin @ 2011-05-31 20:15 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Dan Rosenberg, Matthew Garrett, Tony Luck, linux-kernel,
	kees.cook, davej, torvalds, adobriyan, eranian, penberg, davem,
	Arjan van de Ven, Valdis.Kletnieks, Andrew Morton, pageexec,
	Vivek Goyal

On 05/31/2011 12:55 PM, Ingo Molnar wrote:
> 
> So ... could you *please* not shelve this idea just because people 
> used lkml for what it was invented for: arguing with each other rather 
> forcefully? :-)
> 

The real issue is that if it can be (semi)trivially bypassed, then there
may not be much reason to do it.

Other than that, Ingo's idea at least has the merit that it would break
only older bootloaders doing things wrong.

	-hpa

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-31 19:55           ` Ingo Molnar
  2011-05-31 20:15             ` H. Peter Anvin
@ 2011-05-31 20:17             ` Dan Rosenberg
  1 sibling, 0 replies; 95+ messages in thread
From: Dan Rosenberg @ 2011-05-31 20:17 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Matthew Garrett, H. Peter Anvin, Tony Luck, linux-kernel,
	kees.cook, davej, torvalds, adobriyan, eranian, penberg, davem,
	Arjan van de Ven, Valdis.Kletnieks, Andrew Morton, pageexec,
	Vivek Goyal

On Tue, 2011-05-31 at 21:55 +0200, Ingo Molnar wrote:
> * Dan Rosenberg <drosenberg@vsecurity.com> wrote:
> 
> > Just for the record, I've put this patch on hold until there's some 
> > more consensus about whether boot-time randomization of the 
> > physical kernel address is the best approach. [...]
> 
> Well, if you use the suggestion i made: to skip the e820 map fiddling 
> altogether and just allocate half a megabyte of 'hole' at the end of 
> the kernel image - which would allow the kernel to be randomized 
> freely upwards by 0-128 pages - then the 'dynamic' versus 'static' 
> solution could be used at once!
> 
> The 'static' method would use the same hole, just at install time, 
> while the 'dynamic' method would use it during bootup.
> 
> Also, if this method is used then most of the controversy about the 
> dynamic approach goes away (which was the memory maps interpretation 
> fragility).
> 
> Your last patch would need only minor modifications to get the hole 
> added: you'd need to add the tail-hole in the linker map:
> 
>    arch/x86/kernel/vmlinux.lds.S
> 
> So ... could you *please* not shelve this idea just because people 
> used lkml for what it was invented for: arguing with each other rather 
> forcefully? :-)
> 

Don't worry, I haven't shelved the idea...I just wanted to see more of
the on-going conversation before investing a substantial amount of time
on a potentially infeasible solution.  I'll give this approach a shot.

-Dan

> Thanks,
> 
> 	Ingo



* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-31 20:15             ` H. Peter Anvin
@ 2011-05-31 20:27               ` Ingo Molnar
  2011-05-31 20:30                 ` H. Peter Anvin
  0 siblings, 1 reply; 95+ messages in thread
From: Ingo Molnar @ 2011-05-31 20:27 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Dan Rosenberg, Matthew Garrett, Tony Luck, linux-kernel,
	kees.cook, davej, torvalds, adobriyan, eranian, penberg, davem,
	Arjan van de Ven, Valdis.Kletnieks, Andrew Morton, pageexec,
	Vivek Goyal


* H. Peter Anvin <hpa@zytor.com> wrote:

> On 05/31/2011 12:55 PM, Ingo Molnar wrote:
> > 
> > So ... could you *please* not shelve this idea just because people 
> > used lkml for what it was invented for: arguing with each other rather 
> > forcefully? :-)
> 
> The real issue is that if it can be (semi)trivially bypassed, then 
> there may not be much reason to do it.

Sure.

> Other than that, Ingo's idea at least has the merit that it would 
> break only older bootloaders doing things wrong.

I'm wondering, why would it break older bootloaders? It's just a 
slightly larger than usual kernel image, nothing is visible to the 
bootloader.

Thanks,

	Ingo

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-31 20:27               ` Ingo Molnar
@ 2011-05-31 20:30                 ` H. Peter Anvin
  2011-06-01  6:18                   ` Ingo Molnar
  0 siblings, 1 reply; 95+ messages in thread
From: H. Peter Anvin @ 2011-05-31 20:30 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Dan Rosenberg, Matthew Garrett, Tony Luck, linux-kernel,
	kees.cook, davej, torvalds, adobriyan, eranian, penberg, davem,
	Arjan van de Ven, Valdis.Kletnieks, Andrew Morton, pageexec,
	Vivek Goyal

On 05/31/2011 01:27 PM, Ingo Molnar wrote:
> 
>> Other than that, Ingo's idea at least has the merit that it would 
>> break only older bootloaders doing things wrong.
> 
> I'm wondering, why would it break older bootloaders? It's just a 
> slightly larger than usual kernel image, nothing is visible to the 
> bootloader.
> 

Older boot loaders did not know how big the kernel image was, therefore
had no way to avoid memory space collision.  That is fixed in boot
protocol 2.10.
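
To illustrate: protocol 2.10 adds an init_size field to the setup 
header, which tells the loader how much memory the kernel needs during 
initialization. A loader that reads it can do a check along these 
lines. This is only a sketch; the function name and the single 
"limit" parameter (the start of the next region the loader must not 
overwrite) are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns 1 if a kernel loaded at load_addr, needing init_size bytes
 * while it initializes, stays below 'limit'; returns 0 otherwise.
 */
static int footprint_fits(uint64_t load_addr, uint32_t init_size,
                          uint64_t limit)
{
    /* A zero size would mean the header did not provide the field. */
    assert(init_size != 0);

    /* Written this way to avoid overflow in load_addr + init_size. */
    return load_addr <= limit && init_size <= limit - load_addr;
}
```

An older loader has no such size to check against, so whether it 
collides with something depends on the memory layout of the machine 
it happens to run on.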

	-hpa

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-05-31 20:30                 ` H. Peter Anvin
@ 2011-06-01  6:18                   ` Ingo Molnar
  2011-06-01 15:44                     ` H. Peter Anvin
  0 siblings, 1 reply; 95+ messages in thread
From: Ingo Molnar @ 2011-06-01  6:18 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Dan Rosenberg, Matthew Garrett, Tony Luck, linux-kernel,
	kees.cook, davej, torvalds, adobriyan, eranian, penberg, davem,
	Arjan van de Ven, Valdis.Kletnieks, Andrew Morton, pageexec,
	Vivek Goyal


* H. Peter Anvin <hpa@zytor.com> wrote:

> On 05/31/2011 01:27 PM, Ingo Molnar wrote:
> > 
> >> Other than that, Ingo's idea at least has the merit that it would 
> >> break only older bootloaders doing things wrong.
> > 
> > I'm wondering, why would it break older bootloaders? It's just a 
> > slightly larger than usual kernel image, nothing is visible to the 
> > bootloader.
> > 
> 
> Older boot loaders did not know how big the kernel image was, 
> therefore had no way to avoid memory space collision.  That is 
> fixed in boot protocol 2.10.

But i loaded really large kernel images way back 10 years ago on 
various systems and never had any problems until the default 
allyesconfig hit a ~40 MB kernel image size limit ;-)

(which limit was in the kernel, not in the bootloader)

So yes, a large kernel image "can" be an issue with old bootloaders 
in some situations on weird machines but we don't really "break" them 
via randomization, they were broken and fragile in some situations to 
begin with.

It's fixed in any distro that cares and which would use our (not even 
released) kernel that might one day have randomization.

Is that a fair summary of the bootloader situation?

Thanks,

	Ingo

* Re: [RFC][PATCH] Randomize kernel base address on boot
  2011-06-01  6:18                   ` Ingo Molnar
@ 2011-06-01 15:44                     ` H. Peter Anvin
  0 siblings, 0 replies; 95+ messages in thread
From: H. Peter Anvin @ 2011-06-01 15:44 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Dan Rosenberg, Matthew Garrett, Tony Luck, linux-kernel,
	kees.cook, davej, torvalds, adobriyan, eranian, penberg, davem,
	Arjan van de Ven, Valdis.Kletnieks, Andrew Morton, pageexec,
	Vivek Goyal

On 05/31/2011 11:18 PM, Ingo Molnar wrote:
>>
>> Older boot loaders did not know how big the kernel image was, 
>> therefore had no way to avoid memory space collision.  That is 
>> fixed in boot protocol 2.10.
> 
> But i loaded really large kernel images way back 10 years ago on 
> various systems and never had any problems until the default 
> allyesconfig hit a ~40 MB kernel image size limit ;-)
> 
> (which limit was in the kernel, not in the bootloader)

But it would have depended on the target hardware!  That's the problem.

> 
> So yes, a large kernel image "can" be an issue with old bootloaders 
> in some situations on weird machines but we don't really "break" them 
> via randomization, they were broken and fragile in some situations to 
> begin with.
> 

Well, yes; and I don't think the randomization is a significant problem.

> It's fixed in any distro that cares and which would use our (not even 
> released) kernel that might one day have randomization.
> 
> Is that a fair summary of the bootloader situation?

No, because I don't think Grub is fixed in any of its flavors.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


Thread overview: 95+ messages
2011-05-24 20:31 [RFC][PATCH] Randomize kernel base address on boot Dan Rosenberg
2011-05-24 21:02 ` Ingo Molnar
2011-05-24 22:55   ` Dan Rosenberg
2011-05-24 21:16 ` Ingo Molnar
2011-05-24 23:00   ` Dan Rosenberg
2011-05-25 11:23     ` Ingo Molnar
2011-05-25 14:20       ` Dan Rosenberg
2011-05-25 14:29         ` Ingo Molnar
2011-05-24 23:06   ` H. Peter Anvin
2011-05-25 14:03     ` Dan Rosenberg
2011-05-25 14:14       ` Ingo Molnar
2011-05-25 15:48       ` H. Peter Anvin
2011-05-25 16:15         ` Dan Rosenberg
2011-05-25 16:24           ` H. Peter Anvin
2011-05-24 21:46 ` Brian Gerst
2011-05-24 23:01   ` Dan Rosenberg
2011-05-24 22:31 ` H. Peter Anvin
2011-05-24 23:04   ` Dan Rosenberg
2011-05-24 23:07     ` H. Peter Anvin
2011-05-24 23:34       ` Dan Rosenberg
2011-05-24 23:36         ` H. Peter Anvin
2011-05-24 23:14   ` H. Peter Anvin
2011-05-24 23:08 ` Dan Rosenberg
2011-05-25  2:05   ` Dan Rosenberg
2011-05-26 20:01 ` Vivek Goyal
2011-05-26 20:06   ` Dan Rosenberg
2011-05-26 20:16   ` Valdis.Kletnieks
2011-05-26 20:31     ` Vivek Goyal
2011-05-27  9:36       ` Ingo Molnar
2011-05-26 20:35 ` Vivek Goyal
2011-05-26 20:40   ` Vivek Goyal
2011-05-26 20:44     ` Dan Rosenberg
2011-05-26 20:55       ` Vivek Goyal
2011-05-27  9:38         ` Ingo Molnar
2011-05-27 13:07           ` Vivek Goyal
2011-05-27 13:38             ` Ingo Molnar
2011-05-27 13:13       ` Vivek Goyal
2011-05-27 13:21         ` Dan Rosenberg
2011-05-27 13:46           ` Ingo Molnar
2011-05-27 13:50           ` Vivek Goyal
2011-05-26 20:39 ` Dan Rosenberg
2011-05-27  7:15   ` Ingo Molnar
2011-05-31 16:52   ` Matthew Garrett
2011-05-31 18:40     ` H. Peter Anvin
2011-05-31 18:51       ` Matthew Garrett
2011-05-31 19:03         ` Dan Rosenberg
2011-05-31 19:07           ` H. Peter Anvin
2011-05-31 19:50           ` Ingo Molnar
2011-05-31 19:55           ` Ingo Molnar
2011-05-31 20:15             ` H. Peter Anvin
2011-05-31 20:27               ` Ingo Molnar
2011-05-31 20:30                 ` H. Peter Anvin
2011-06-01  6:18                   ` Ingo Molnar
2011-06-01 15:44                     ` H. Peter Anvin
2011-05-31 20:17             ` Dan Rosenberg
2011-05-26 22:18 ` Rafael J. Wysocki
2011-05-26 22:32   ` H. Peter Anvin
2011-05-27  0:26     ` Dan Rosenberg
2011-05-27 16:21       ` Rafael J. Wysocki
2011-05-27  2:45     ` Dave Jones
2011-05-27  9:40       ` Ingo Molnar
2011-05-27 16:11         ` Rafael J. Wysocki
2011-05-27 16:07     ` Rafael J. Wysocki
2011-05-27 15:42   ` Linus Torvalds
2011-05-27 16:11     ` Dan Rosenberg
2011-05-27 17:00     ` Ingo Molnar
2011-05-27 17:06       ` H. Peter Anvin
2011-05-27 17:10       ` Dan Rosenberg
2011-05-27 17:13         ` H. Peter Anvin
2011-05-27 17:16           ` Linus Torvalds
2011-05-27 17:38             ` Ingo Molnar
2011-05-27 17:20           ` Kees Cook
2011-05-27 17:16         ` Ingo Molnar
2011-05-27 17:21           ` Linus Torvalds
2011-05-27 17:46             ` Ingo Molnar
2011-05-27 17:53               ` H. Peter Anvin
2011-05-27 18:05                 ` Linus Torvalds
2011-05-27 19:15                   ` Vivek Goyal
2011-05-27 21:37                   ` H. Peter Anvin
2011-05-27 23:51                     ` H. Peter Anvin
2011-05-28 12:18                   ` Ingo Molnar
2011-05-29  1:13                     ` H. Peter Anvin
2011-05-29 12:47                       ` Ingo Molnar
2011-05-29 18:19                         ` H. Peter Anvin
2011-05-29 18:44                           ` Ingo Molnar
2011-05-29 18:52                             ` H. Peter Anvin
2011-05-29 19:56                               ` Ingo Molnar
2011-05-27 17:57               ` Linus Torvalds
2011-05-27 18:17                 ` Ingo Molnar
2011-05-27 18:43                   ` Kees Cook
2011-05-27 18:48                   ` david
2011-05-27 21:51                   ` Olivier Galibert
2011-05-27 22:11                     ` Valdis.Kletnieks
2011-05-28  0:50                     ` H. Peter Anvin
2011-05-28  6:32                     ` Ingo Molnar