All of lore.kernel.org
 help / color / mirror / Atom feed
* [kernel-hardening] [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the kernel stack at the end of syscalls
@ 2017-06-09 14:30 Alexander Popov
  2017-06-09 17:28 ` [kernel-hardening] " Kees Cook
                   ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Alexander Popov @ 2017-06-09 14:30 UTC (permalink / raw)
  To: kernel-hardening, keescook, pageexec, spender, tycho, alex.popov

Hello,

My employer Positive Technologies kindly allowed me to spend a part of my
working time on helping KSPP. So I join the initiative of porting STACKLEAK
to the mainline, which was started by Tycho Andersen.

STACKLEAK is a security feature developed by Grsecurity/PaX (kudos to them!),
which can mitigate the damage from kernel stack leak bugs (see CVE-2010-2963,
CVE-2016-4569 and CVE-2016-4578).

I carefully extracted STACKLEAK from the last public patch of Grsecurity/PaX
and do my best to understand it. So I added some comments describing that
understanding. You are welcome to discuss it.

Here are the results of a brief performance test on x86_64:

Hardware: Intel Core i7-4770, 16 GB RAM

Test #1: stress-ng --cpu 4 --io 2 --vm 2 --vm-bytes 2G --timeout 300s
 --metrics-brief
Result on 4.11-rc8:
  cpu bogo ops 269955
  iosync bogo ops 9809985
  vm bogo ops 17093608
Result on 4.11-rc8+stackleak:
  cpu bogo ops 270106 (+0.6%)
  iosync bogo ops 9474535 (-3.42%)
  vm bogo ops 17093608 (the same)

Test #2: hackbench -s 4096 -l 2000 -g 15 -f 25 -P
Average result on 4.11-rc8: 8.71s
Average result on 4.11-rc8+stackleak: 9.08s (+4.29%)

Test #3: building the Linux kernel with Ubuntu config (time make -j9)
Result on 4.11-rc8:
  real	32m14.893s
  user	237m30.886s
  sys	11m12.273s
Result on 4.11-rc8+stackleak:
  real	32m26.881s (+0.62%)
  user	238m38.926s (+0.48%)
  sys	11m36.426s (+3.59%)

There is an important aspect of STACKLEAK on i386. PaX Team might know
about it, I can't say for sure. Dear Grsecurity, is it fine to talk with
you about such things in that mailing list? Should I do it differently?

The STACKLEAK gcc plugin does not instrument the kernel code for i386. I've
checked that for the last public patch of Grsecurity/PaX and see the same
behaviour on my machines. The reason: the ix86_cmodel for the Linux kernel
on i386 is not CM_KERNEL. So the STACKLEAK plugin seems to skip the
instrumentation for that platform. As a result, on i386 erase_kstack()
always starts to search for the bottom of the stack from the top minus 128.

See the results of the same performance tests on i386:

Test #1: stress-ng --cpu 4 --io 2 --vm 2 --vm-bytes 2G --timeout 300s
 --metrics-brief
Result on 4.11-rc8:
  cpu bogo ops 207754
  iosync bogo ops 9442815
  vm bogo ops 8546804
Result on 4.11-rc8+stackleak:
  cpu bogo ops 206061 (-0.81%)
  iosync bogo ops 9435139 (-0.08%)
  vm bogo ops 8546804 (the same)

Test #2: hackbench -s 4096 -l 2000 -g 15 -f 25 -P
Average result on 4.11-rc8: 8.197s
Average result on 4.11-rc8+stackleak: 9.134s (+11.43%)

Test #3: building the Linux kernel with Ubuntu config (time make -j9)
Result on 4.11-rc8:
  real	18m15.372s
  user	129m58.169s
  sys	8m27.884s
Result on 4.11-rc8+stackleak:
  real	18m34.244s (+1.72%)
  user	132m33.843s (+2.00%)
  sys	8m37.658s (+1.92%)

More things to be done:
 - understand how the STACKLEAK gcc plugin works;
 - develop tests for STACKLEAK.

Best regards,
Alexander

-- >8 --

>From d22af45233b2f6d657a29dcb1815b35a5a45c539 Mon Sep 17 00:00:00 2001
From: Alexander Popov <alex.popov@linux.com>
Date: Fri, 9 Jun 2017 15:21:16 +0300
Subject: [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the
 kernel stack at the end of syscalls

The stackleak feature erases the kernel stack before returning from syscalls.
That reduces the information which a kernel stack leak bug can reveal.

This feature consists of:
  - the architecture-specific code filling the used part of the kernel stack
  with a poison value before returning to the userspace (currently only
  for i386 and x86_64);
  - the gcc plugin for tracking the lowest border of the kernel stack. It
  instruments the kernel code inserting the track_stack() function call if
  a stack frame size is over a specified limit.

The stackleak feature is ported from grsecurity/PaX. For more information see:
  https://grsecurity.net/
  https://pax.grsecurity.net/

This code is verbatim from Brad Spengler/PaX Team's code in the last public
patch of grsecurity/PaX based on our understanding of the code. Changes or
omissions from the original code are ours and don't reflect the original
grsecurity/PaX code.

Signed-off-by: Alexander Popov <alex.popov@linux.com>
Signed-off-by: Tycho Andersen <tycho@docker.com>
---
 arch/Kconfig                           |  20 ++
 arch/x86/Kconfig                       |   1 +
 arch/x86/entry/common.c                |  17 +-
 arch/x86/entry/entry_32.S              |  71 +++++++
 arch/x86/entry/entry_64.S              |  99 ++++++++++
 arch/x86/entry/entry_64_compat.S       |   8 +
 arch/x86/include/asm/processor.h       |   4 +
 arch/x86/kernel/asm-offsets.c          |   9 +
 arch/x86/kernel/dumpstack_32.c         |  12 ++
 arch/x86/kernel/dumpstack_64.c         |  33 ++++
 arch/x86/kernel/process_32.c           |   5 +
 arch/x86/kernel/process_64.c           |   5 +
 fs/exec.c                              |  17 ++
 scripts/Makefile.gcc-plugins           |   3 +
 scripts/gcc-plugins/stackleak_plugin.c | 342 +++++++++++++++++++++++++++++++++
 15 files changed, 643 insertions(+), 3 deletions(-)
 create mode 100644 scripts/gcc-plugins/stackleak_plugin.c

diff --git a/arch/Kconfig b/arch/Kconfig
index cd211a1..a209bd5 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -436,6 +436,26 @@ config GCC_PLUGIN_STRUCTLEAK_VERBOSE
 	  initialized. Since not all existing initializers are detected
 	  by the plugin, this can produce false positive warnings.
 
+config ARCH_HAS_STACKLEAK
+	def_bool n
+	help
+	  An architecture should select this if it has the code which
+	  fills the used part of the kernel stack with a poison value
+	  before returning from system calls.
+
+config STACKLEAK
+	bool "Erase the kernel stack before returning from syscalls"
+	depends on GCC_PLUGINS
+	depends on ARCH_HAS_STACKLEAK
+	help
+	  This option makes the kernel erase the kernel stack before it
+	  returns from a system call. That reduces the information which
+	  a kernel stack leak bug can reveal.
+
+	  This plugin was ported from grsecurity/PaX. More information at:
+	   * https://grsecurity.net/
+	   * https://pax.grsecurity.net/
+
 config HAVE_CC_STACKPROTECTOR
 	bool
 	help
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index cc98d5a..5988a5f 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -56,6 +56,7 @@ config X86
 	select ARCH_HAS_PMEM_API		if X86_64
 	select ARCH_HAS_SET_MEMORY
 	select ARCH_HAS_SG_CHAIN
+	select ARCH_HAS_STACKLEAK
 	select ARCH_HAS_STRICT_KERNEL_RWX
 	select ARCH_HAS_STRICT_MODULE_RWX
 	select ARCH_HAS_UBSAN_SANITIZE_ALL
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 370c42c..8ff0a84 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -43,6 +43,12 @@ __visible inline void enter_from_user_mode(void)
 static inline void enter_from_user_mode(void) {}
 #endif
 
+#ifdef CONFIG_STACKLEAK
+asmlinkage void erase_kstack(void);
+#else
+static void erase_kstack(void) {}
+#endif
+
 static void do_audit_syscall_entry(struct pt_regs *regs, u32 arch)
 {
 #ifdef CONFIG_X86_64
@@ -79,8 +85,10 @@ static long syscall_trace_enter(struct pt_regs *regs)
 		emulated = true;
 
 	if ((emulated || (work & _TIF_SYSCALL_TRACE)) &&
-	    tracehook_report_syscall_entry(regs))
+	    tracehook_report_syscall_entry(regs)) {
+		erase_kstack();
 		return -1L;
+	}
 
 	if (emulated)
 		return -1L;
@@ -114,9 +122,11 @@ static long syscall_trace_enter(struct pt_regs *regs)
 			sd.args[5] = regs->bp;
 		}
 
-		ret = __secure_computing(&sd);
-		if (ret == -1)
+		ret = secure_computing(&sd);
+		if (ret == -1) {
+			erase_kstack();
 			return ret;
+		}
 	}
 #endif
 
@@ -125,6 +135,7 @@ static long syscall_trace_enter(struct pt_regs *regs)
 
 	do_audit_syscall_entry(regs, arch);
 
+	erase_kstack();
 	return ret ?: regs->orig_ax;
 }
 
diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index 57f7ec3..d8610e9 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -78,6 +78,73 @@
 #endif
 .endm
 
+.macro erase_kstack
+#ifdef CONFIG_STACKLEAK
+	call erase_kstack
+#endif
+.endm
+
+#ifdef CONFIG_STACKLEAK
+/*
+ * ebp: thread_info
+ */
+ENTRY(erase_kstack)
+	pushl	%edi
+	pushl	%ecx
+	pushl	%eax
+	pushl	%ebp
+
+	movl	PER_CPU_VAR(current_task), %ebp
+	mov	TASK_lowest_stack(%ebp), %edi
+	mov	$-0xBEEF, %eax
+	std
+
+1:
+	mov	%edi, %ecx
+	and	$THREAD_SIZE_asm - 1, %ecx
+	shr	$2, %ecx
+	repne	scasl
+	jecxz	2f
+
+	cmp	$2*16, %ecx
+	jc	2f
+
+	mov	$2*16, %ecx
+	repe	scasl
+	jecxz	2f
+	jne	1b
+
+2:
+	cld
+	or	$2*4, %edi
+	mov	%esp, %ecx
+	sub	%edi, %ecx
+
+	cmp	$THREAD_SIZE_asm, %ecx
+	jb	3f
+	ud2
+
+3:
+	shr	$2, %ecx
+	rep	stosl
+
+	/*
+	 * TODO: sp0 on x86_32 is not reliable, right?
+	 * Doubt because of the definition of cpu_current_top_of_stack
+	 * in arch/x86/kernel/cpu/common.c.
+	 */
+	mov	TASK_thread_sp0(%ebp), %edi
+	sub	$128, %edi
+	mov	%edi, TASK_lowest_stack(%ebp)
+
+	popl	%ebp
+	popl	%eax
+	popl	%ecx
+	popl	%edi
+	ret
+ENDPROC(erase_kstack)
+#endif
+
 /*
  * User gs save/restore
  *
@@ -440,6 +507,8 @@ ENTRY(entry_SYSENTER_32)
 	ALTERNATIVE "testl %eax, %eax; jz .Lsyscall_32_done", \
 		    "jmp .Lsyscall_32_done", X86_FEATURE_XENPV
 
+	erase_kstack
+
 /* Opportunistic SYSEXIT */
 	TRACE_IRQS_ON			/* User mode traces as IRQs on. */
 	movl	PT_EIP(%esp), %edx	/* pt_regs->ip */
@@ -526,6 +595,8 @@ ENTRY(entry_INT80_32)
 	call	do_int80_syscall_32
 .Lsyscall_32_done:
 
+	erase_kstack
+
 restore_all:
 	TRACE_IRQS_IRET
 .Lrestore_all_notrace:
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 044d18e..84829d4 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -58,6 +58,94 @@ ENDPROC(native_usergs_sysret64)
 #endif
 .endm
 
+.macro erase_kstack
+#ifdef CONFIG_STACKLEAK
+	call erase_kstack
+#endif
+.endm
+
+#ifdef CONFIG_STACKLEAK
+ENTRY(erase_kstack)
+	pushq	%rdi
+	pushq	%rcx
+	pushq	%rax
+	pushq	%r11
+
+	movq	PER_CPU_VAR(current_task), %r11
+	mov	TASK_lowest_stack(%r11), %rdi
+	mov	$-0xBEEF, %rax		/* -0xBEEF is a stack poison. */
+	std
+
+1:
+	/*
+	 * Let's search for the poison value in the stack.
+	 * Start from the lowest_stack and go to the bottom (see std above).
+	 */
+	mov	%edi, %ecx
+	and	$THREAD_SIZE_asm - 1, %ecx
+	shr	$3, %ecx
+	repne	scasq
+	jecxz	2f	/* Didn't find it. Go to poisoning. */
+
+	/*
+	 * Found the poison value in the stack. Go to poisoning if there are
+	 * less than 16 qwords left.
+	 */
+	cmp	$2*8, %ecx
+	jc	2f
+
+	/*
+	 * Check that 16 further qwords contain poison (avoid false positives).
+	 * If so, the part of the stack below the address in %rdi is likely
+	 * to be poisoned. Otherwise we need to search deeper.
+	 */
+	mov	$2*8, %ecx
+	repe	scasq
+	jecxz	2f	/* Poison the upper part of the stack. */
+	jne	1b	/* Search deeper. */
+
+2:
+	/*
+	 * Prepare the counter for poisoning the kernel stack between
+	 * %rdi and %rsp.
+	 *
+	 * TODO: Sorry, don't understand why the following OR instruction is
+	 * needed. That may be connected to the thread.lowest_stack
+	 * initialization in arch/x86/kernel/process_64.c, where it is set
+	 * to the task_stack_page address + 2 * sizeof(unsigned long).
+	 */
+	cld
+	or	$2*8, %rdi
+	mov	%esp, %ecx
+	sub	%edi, %ecx
+
+	/* Check that the counter value is sane. */
+	cmp	$THREAD_SIZE_asm, %rcx
+	jb	3f
+	ud2
+
+3:
+	/*
+	 * So let's write the poison value to the kernel stack. Start from the
+	 * address in %rdi and move up (see cld above) to the address in %rsp
+	 * (not included, used memory).
+	 */
+	shr	$3, %ecx
+	rep	stosq
+
+	/* Set the lowest_stack value to the top_of_stack - 256. */
+	mov	TASK_thread_sp0(%r11), %rdi
+	sub	$256, %rdi
+	mov	%rdi, TASK_lowest_stack(%r11)
+
+	popq	%r11
+	popq	%rax
+	popq	%rcx
+	popq	%rdi
+	ret
+ENDPROC(erase_kstack)
+#endif
+
 /*
  * When dynamic function tracer is enabled it will add a breakpoint
  * to all locations that it is about to modify, sync CPUs, update
@@ -218,6 +306,8 @@ entry_SYSCALL_64_fastpath:
 	testl	$_TIF_ALLWORK_MASK, TASK_TI_flags(%r11)
 	jnz	1f
 
+	erase_kstack
+
 	LOCKDEP_SYS_EXIT
 	TRACE_IRQS_ON		/* user mode is traced as IRQs on */
 	movq	RIP(%rsp), %rcx
@@ -246,6 +336,8 @@ entry_SYSCALL64_slow_path:
 	call	do_syscall_64		/* returns with IRQs disabled */
 
 return_from_SYSCALL_64:
+	erase_kstack
+
 	RESTORE_EXTRA_REGS
 	TRACE_IRQS_IRETQ		/* we're about to change IF */
 
@@ -419,6 +511,7 @@ ENTRY(ret_from_fork)
 2:
 	leaq	FRAME_OFFSET(%rsp),%rdi	/* pt_regs pointer */
 	call	syscall_return_slowpath	/* returns with IRQs disabled */
+	erase_kstack
 	TRACE_IRQS_ON			/* user mode is traced as IRQS on */
 	SWAPGS
 	FRAME_END
@@ -532,6 +625,12 @@ ret_from_intr:
 GLOBAL(retint_user)
 	mov	%rsp,%rdi
 	call	prepare_exit_to_usermode
+
+	/*
+	 * TODO: Do we need to call erase_kstack here?
+	 * The PaX patch has it here commented out.
+	 */
+
 	TRACE_IRQS_IRETQ
 	SWAPGS
 	jmp	restore_regs_and_iret
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index e1721da..330516a 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -18,6 +18,12 @@
 
 	.section .entry.text, "ax"
 
+	.macro erase_kstack
+#ifdef CONFIG_STACKLEAK
+	call erase_kstack
+#endif
+	.endm
+
 /*
  * 32-bit SYSENTER entry.
  *
@@ -229,6 +235,7 @@ ENTRY(entry_SYSCALL_compat)
 
 	/* Opportunistic SYSRET */
 sysret32_from_system_call:
+	erase_kstack
 	TRACE_IRQS_ON			/* User mode traces as IRQs on. */
 	movq	RBX(%rsp), %rbx		/* pt_regs->rbx */
 	movq	RBP(%rsp), %rbp		/* pt_regs->rbp */
@@ -337,6 +344,7 @@ ENTRY(entry_INT80_compat)
 .Lsyscall_32_done:
 
 	/* Go back to user mode. */
+	erase_kstack
 	TRACE_IRQS_ON
 	SWAPGS
 	jmp	restore_regs_and_iret
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index f385eca..049abff 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -469,6 +469,10 @@ struct thread_struct {
 
 	mm_segment_t		addr_limit;
 
+#ifdef CONFIG_STACKLEAK
+	unsigned long		lowest_stack;
+#endif
+
 	unsigned int		sig_on_uaccess_err:1;
 	unsigned int		uaccess_err:1;	/* uaccess failed */
 
diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index de827d6..31ba8e9 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -37,6 +37,10 @@ void common(void) {
 	BLANK();
 	OFFSET(TASK_TI_flags, task_struct, thread_info.flags);
 	OFFSET(TASK_addr_limit, task_struct, thread.addr_limit);
+#ifdef CONFIG_STACKLEAK
+	OFFSET(TASK_lowest_stack, task_struct, thread.lowest_stack);
+	OFFSET(TASK_thread_sp0, task_struct, thread.sp0);
+#endif
 
 	BLANK();
 	OFFSET(crypto_tfm_ctx_offset, crypto_tfm, __crt_ctx);
@@ -73,6 +77,11 @@ void common(void) {
 	OFFSET(PV_MMU_read_cr2, pv_mmu_ops, read_cr2);
 #endif
 
+#ifdef CONFIG_STACKLEAK
+	BLANK();
+	DEFINE(THREAD_SIZE_asm, THREAD_SIZE);
+#endif
+
 #ifdef CONFIG_XEN
 	BLANK();
 	OFFSET(XEN_vcpu_info_mask, vcpu_info, evtchn_upcall_mask);
diff --git a/arch/x86/kernel/dumpstack_32.c b/arch/x86/kernel/dumpstack_32.c
index b0b3a3d..a223b4e 100644
--- a/arch/x86/kernel/dumpstack_32.c
+++ b/arch/x86/kernel/dumpstack_32.c
@@ -174,3 +174,15 @@ int is_valid_bugaddr(unsigned long ip)
 
 	return ud2 == 0x0b0f;
 }
+
+#ifdef CONFIG_STACKLEAK
+void __used check_alloca(unsigned long size)
+{
+	unsigned long sp = (unsigned long)&sp, stack_left;
+
+	/* all kernel stacks are of the same size */
+	stack_left = sp & (THREAD_SIZE - 1);
+	BUG_ON(stack_left < 256 || size >= stack_left - 256);
+}
+EXPORT_SYMBOL(check_alloca);
+#endif
diff --git a/arch/x86/kernel/dumpstack_64.c b/arch/x86/kernel/dumpstack_64.c
index a8b117e..c2ffb9f 100644
--- a/arch/x86/kernel/dumpstack_64.c
+++ b/arch/x86/kernel/dumpstack_64.c
@@ -188,3 +188,36 @@ int is_valid_bugaddr(unsigned long ip)
 
 	return ud2 == 0x0b0f;
 }
+
+#ifdef CONFIG_STACKLEAK
+void __used check_alloca(unsigned long size)
+{
+	struct stack_info stack_info = {0};
+	unsigned long visit_mask = 0;
+	unsigned long sp = (unsigned long)&sp;
+	unsigned long stack_left;
+
+	BUG_ON(get_stack_info(&sp, current, &stack_info, &visit_mask));
+
+	switch (stack_info.type) {
+	case STACK_TYPE_TASK:
+		stack_left = sp & (THREAD_SIZE - 1);
+		break;
+
+	case STACK_TYPE_IRQ:
+		stack_left = sp & (IRQ_STACK_SIZE - 1);
+		break;
+
+	case STACK_TYPE_EXCEPTION ... STACK_TYPE_EXCEPTION_LAST:
+		stack_left = sp & (EXCEPTION_STKSZ - 1);
+		break;
+
+	case STACK_TYPE_SOFTIRQ:
+	default:
+		BUG();
+	}
+
+	BUG_ON(stack_left < 256 || size >= stack_left - 256);
+}
+EXPORT_SYMBOL(check_alloca);
+#endif
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 4c818f8..5355b12 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -134,6 +134,11 @@ int copy_thread_tls(unsigned long clone_flags, unsigned long sp,
 	p->thread.sp0 = (unsigned long) (childregs+1);
 	memset(p->thread.ptrace_bps, 0, sizeof(p->thread.ptrace_bps));
 
+#ifdef CONFIG_STACKLEAK
+	p->thread.lowest_stack = (unsigned long)task_stack_page(p) +
+						2 * sizeof(unsigned long);
+#endif
+
 	if (unlikely(p->flags & PF_KTHREAD)) {
 		/* kernel thread */
 		memset(childregs, 0, sizeof(struct pt_regs));
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index d6b784a..f943558 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -161,6 +161,11 @@ int copy_thread_tls(unsigned long clone_flags, unsigned long sp,
 	p->thread.sp = (unsigned long) fork_frame;
 	p->thread.io_bitmap_ptr = NULL;
 
+#ifdef CONFIG_STACKLEAK
+	p->thread.lowest_stack = (unsigned long)task_stack_page(p) +
+						2 * sizeof(unsigned long);
+#endif
+
 	savesegment(gs, p->thread.gsindex);
 	p->thread.gsbase = p->thread.gsindex ? 0 : me->thread.gsbase;
 	savesegment(fs, p->thread.fsindex);
diff --git a/fs/exec.c b/fs/exec.c
index 65145a3..c041611 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1927,3 +1927,20 @@ COMPAT_SYSCALL_DEFINE5(execveat, int, fd,
 				  argv, envp, flags);
 }
 #endif
+
+#ifdef CONFIG_STACKLEAK
+void __used track_stack(void)
+{
+	unsigned long sp = (unsigned long)&sp;
+
+	if (sp < current->thread.lowest_stack &&
+	    sp >= (unsigned long)task_stack_page(current) +
+					2 * sizeof(unsigned long)) {
+		current->thread.lowest_stack = sp;
+	}
+
+	if (unlikely((sp & ~(THREAD_SIZE - 1)) < (THREAD_SIZE / 16)))
+		BUG();
+}
+EXPORT_SYMBOL(track_stack);
+#endif
diff --git a/scripts/Makefile.gcc-plugins b/scripts/Makefile.gcc-plugins
index 8233553..6cf9487 100644
--- a/scripts/Makefile.gcc-plugins
+++ b/scripts/Makefile.gcc-plugins
@@ -29,6 +29,9 @@ ifdef CONFIG_GCC_PLUGINS
   gcc-plugin-cflags-$(CONFIG_GCC_PLUGIN_STRUCTLEAK_VERBOSE)	+= -fplugin-arg-structleak_plugin-verbose
   gcc-plugin-cflags-$(CONFIG_GCC_PLUGIN_STRUCTLEAK)	+= -DSTRUCTLEAK_PLUGIN
 
+  gcc-plugin-$(CONFIG_STACKLEAK)	+= stackleak_plugin.so
+  gcc-plugin-cflags-$(CONFIG_STACKLEAK)	+= -DSTACKLEAK_PLUGIN -fplugin-arg-stackleak_plugin-track-lowest-sp=100
+
   GCC_PLUGINS_CFLAGS := $(strip $(addprefix -fplugin=$(objtree)/scripts/gcc-plugins/, $(gcc-plugin-y)) $(gcc-plugin-cflags-y))
 
   export PLUGINCC GCC_PLUGINS_CFLAGS GCC_PLUGIN GCC_PLUGIN_SUBDIR
diff --git a/scripts/gcc-plugins/stackleak_plugin.c b/scripts/gcc-plugins/stackleak_plugin.c
new file mode 100644
index 0000000..2ee49c4
--- /dev/null
+++ b/scripts/gcc-plugins/stackleak_plugin.c
@@ -0,0 +1,342 @@
+/*
+ * Copyright 2011-2017 by the PaX Team <pageexec@freemail.hu>
+ * Licensed under the GPL v2
+ *
+ * Note: the choice of the license means that the compilation process is
+ *       NOT 'eligible' as defined by gcc's library exception to the GPL v3,
+ *       but for the kernel it doesn't matter since it doesn't link against
+ *       any of the gcc libraries
+ *
+ * gcc plugin to help implement various PaX features
+ *
+ * - track lowest stack pointer
+ *
+ * TODO:
+ * - initialize all local variables
+ *
+ * BUGS:
+ * - none known
+ */
+
+#include "gcc-common.h"
+
+__visible int plugin_is_GPL_compatible;
+
+static int track_frame_size = -1;
+static const char track_function[] = "track_stack";
+static const char check_function[] = "check_alloca";
+static GTY(()) tree track_function_decl;
+static GTY(()) tree check_function_decl;
+static bool init_locals;
+
+static struct plugin_info stackleak_plugin_info = {
+	.version	= "201602181345",
+	.help		= "track-lowest-sp=nn\ttrack sp in functions whose frame size is at least nn bytes\n"
+//			  "initialize-locals\t\tforcibly initialize all stack frames\n"
+};
+
+static void stackleak_check_alloca(gimple_stmt_iterator *gsi)
+{
+	gimple stmt;
+	gcall *check_alloca;
+	tree alloca_size;
+	cgraph_node_ptr node;
+	int frequency;
+	basic_block bb;
+
+	// insert call to void check_alloca(unsigned long size)
+	alloca_size = gimple_call_arg(gsi_stmt(*gsi), 0);
+	stmt = gimple_build_call(check_function_decl, 1, alloca_size);
+	check_alloca = as_a_gcall(stmt);
+	gsi_insert_before(gsi, check_alloca, GSI_SAME_STMT);
+
+	// update the cgraph
+	bb = gimple_bb(check_alloca);
+	node = cgraph_get_create_node(check_function_decl);
+	gcc_assert(node);
+	frequency = compute_call_stmt_bb_frequency(current_function_decl, bb);
+	cgraph_create_edge(cgraph_get_node(current_function_decl), node, check_alloca, bb->count, frequency, bb->loop_depth);
+}
+
+static void stackleak_add_instrumentation(gimple_stmt_iterator *gsi, bool after)
+{
+	gimple stmt;
+	gcall *track_stack;
+	cgraph_node_ptr node;
+	int frequency;
+	basic_block bb;
+
+	// insert call to void track_stack(void)
+	stmt = gimple_build_call(track_function_decl, 0);
+	track_stack = as_a_gcall(stmt);
+	if (after)
+		gsi_insert_after(gsi, track_stack, GSI_CONTINUE_LINKING);
+	else
+		gsi_insert_before(gsi, track_stack, GSI_SAME_STMT);
+
+	// update the cgraph
+	bb = gimple_bb(track_stack);
+	node = cgraph_get_create_node(track_function_decl);
+	gcc_assert(node);
+	frequency = compute_call_stmt_bb_frequency(current_function_decl, bb);
+	cgraph_create_edge(cgraph_get_node(current_function_decl), node, track_stack, bb->count, frequency, bb->loop_depth);
+}
+
+static bool is_alloca(gimple stmt)
+{
+	if (gimple_call_builtin_p(stmt, BUILT_IN_ALLOCA))
+		return true;
+
+#if BUILDING_GCC_VERSION >= 4007
+	if (gimple_call_builtin_p(stmt, BUILT_IN_ALLOCA_WITH_ALIGN))
+		return true;
+#endif
+
+	return false;
+}
+
+static unsigned int stackleak_tree_instrument_execute(void)
+{
+	basic_block bb, entry_bb;
+	bool prologue_instrumented = false, is_leaf = true;
+
+	entry_bb = ENTRY_BLOCK_PTR_FOR_FN(cfun)->next_bb;
+
+	// 1. loop through BBs and GIMPLE statements
+	FOR_EACH_BB_FN(bb, cfun) {
+		gimple_stmt_iterator gsi;
+
+		for (gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next(&gsi)) {
+			gimple stmt;
+
+			stmt = gsi_stmt(gsi);
+
+			if (is_gimple_call(stmt))
+				is_leaf = false;
+
+			// gimple match: align 8 built-in BUILT_IN_NORMAL:BUILT_IN_ALLOCA attributes <tree_list 0xb7576450>
+			if (!is_alloca(stmt))
+				continue;
+
+			// 2. insert stack overflow check before each __builtin_alloca call
+			stackleak_check_alloca(&gsi);
+
+			// 3. insert track call after each __builtin_alloca call
+			stackleak_add_instrumentation(&gsi, true);
+			if (bb == entry_bb)
+				prologue_instrumented = true;
+		}
+	}
+
+	// special cases for some bad linux code: taking the address of static inline functions will materialize them
+	// but we mustn't instrument some of them as the resulting stack alignment required by the function call ABI
+	// will break other assumptions regarding the expected (but not otherwise enforced) register clobbering  ABI.
+	// case in point: native_save_fl on amd64 when optimized for size clobbers rdx if it were instrumented here.
+	if (is_leaf && !TREE_PUBLIC(current_function_decl) && DECL_DECLARED_INLINE_P(current_function_decl))
+		return 0;
+	if (is_leaf && !strncmp(IDENTIFIER_POINTER(DECL_NAME(current_function_decl)), "_paravirt_", 10))
+		return 0;
+
+	// 4. insert track call at the beginning
+	if (!prologue_instrumented) {
+		gimple_stmt_iterator gsi;
+
+		gcc_assert(single_succ_p(ENTRY_BLOCK_PTR_FOR_FN(cfun)));
+		bb = single_succ(ENTRY_BLOCK_PTR_FOR_FN(cfun));
+		if (!single_pred_p(bb)) {
+//			gcc_assert(bb_loop_depth(bb) || (bb->flags & BB_IRREDUCIBLE_LOOP));
+			split_edge(single_succ_edge(ENTRY_BLOCK_PTR_FOR_FN(cfun)));
+			gcc_assert(single_succ_p(ENTRY_BLOCK_PTR_FOR_FN(cfun)));
+			bb = single_succ(ENTRY_BLOCK_PTR_FOR_FN(cfun));
+		}
+		gsi = gsi_after_labels(bb);
+		stackleak_add_instrumentation(&gsi, false);
+	}
+
+	return 0;
+}
+
+static unsigned int stackleak_final_execute(void)
+{
+	rtx_insn *insn, *next;
+
+	if (cfun->calls_alloca)
+		return 0;
+
+	// keep calls only if function frame is big enough
+	if (get_frame_size() >= track_frame_size)
+		return 0;
+
+	// 1. find track_stack calls
+	for (insn = get_insns(); insn; insn = next) {
+		// rtl match: (call_insn 8 7 9 3 (call (mem (symbol_ref ("track_stack") [flags 0x41] <function_decl 0xb7470e80 track_stack>) [0 S1 A8]) (4)) -1 (nil) (nil))
+		rtx body;
+
+		next = NEXT_INSN(insn);
+		if (!CALL_P(insn))
+			continue;
+		body = PATTERN(insn);
+		if (GET_CODE(body) != CALL)
+			continue;
+		body = XEXP(body, 0);
+		if (GET_CODE(body) != MEM)
+			continue;
+		body = XEXP(body, 0);
+		if (GET_CODE(body) != SYMBOL_REF)
+			continue;
+//		if (strcmp(XSTR(body, 0), track_function))
+		if (SYMBOL_REF_DECL(body) != track_function_decl)
+			continue;
+//		warning(0, "track_frame_size: %d %ld %d", cfun->calls_alloca, get_frame_size(), track_frame_size);
+		// 2. delete call
+		delete_insn_and_edges(insn);
+#if BUILDING_GCC_VERSION >= 4007
+		if (GET_CODE(next) == NOTE && NOTE_KIND(next) == NOTE_INSN_CALL_ARG_LOCATION) {
+			insn = next;
+			next = NEXT_INSN(insn);
+			delete_insn_and_edges(insn);
+		}
+#endif
+	}
+
+//	print_simple_rtl(stderr, get_insns());
+//	print_rtl(stderr, get_insns());
+//	warning(0, "track_frame_size: %d %ld %d", cfun->calls_alloca, get_frame_size(), track_frame_size);
+
+	return 0;
+}
+
+static bool stackleak_track_stack_gate(void)
+{
+	tree section;
+
+	if (ix86_cmodel != CM_KERNEL)
+		return false;
+
+	section = lookup_attribute("section", DECL_ATTRIBUTES(current_function_decl));
+	if (section && TREE_VALUE(section)) {
+		section = TREE_VALUE(TREE_VALUE(section));
+
+		if (!strncmp(TREE_STRING_POINTER(section), ".init.text", 10))
+			return false;
+		if (!strncmp(TREE_STRING_POINTER(section), ".devinit.text", 13))
+			return false;
+		if (!strncmp(TREE_STRING_POINTER(section), ".cpuinit.text", 13))
+			return false;
+		if (!strncmp(TREE_STRING_POINTER(section), ".meminit.text", 13))
+			return false;
+	}
+
+	return track_frame_size >= 0;
+}
+
+static void stackleak_start_unit(void *gcc_data __unused, void *user_data __unused)
+{
+	tree fntype;
+
+	// void track_stack(void)
+	fntype = build_function_type_list(void_type_node, NULL_TREE);
+	track_function_decl = build_fn_decl(track_function, fntype);
+	DECL_ASSEMBLER_NAME(track_function_decl); // for LTO
+	TREE_PUBLIC(track_function_decl) = 1;
+	TREE_USED(track_function_decl) = 1;
+	DECL_EXTERNAL(track_function_decl) = 1;
+	DECL_ARTIFICIAL(track_function_decl) = 1;
+	DECL_PRESERVE_P(track_function_decl) = 1;
+
+	// void check_alloca(unsigned long)
+	fntype = build_function_type_list(void_type_node, long_unsigned_type_node, NULL_TREE);
+	check_function_decl = build_fn_decl(check_function, fntype);
+	DECL_ASSEMBLER_NAME(check_function_decl); // for LTO
+	TREE_PUBLIC(check_function_decl) = 1;
+	TREE_USED(check_function_decl) = 1;
+	DECL_EXTERNAL(check_function_decl) = 1;
+	DECL_ARTIFICIAL(check_function_decl) = 1;
+	DECL_PRESERVE_P(check_function_decl) = 1;
+}
+
+static bool stackleak_tree_instrument_gate(void)
+{
+	return stackleak_track_stack_gate();
+}
+
+#define PASS_NAME stackleak_tree_instrument
+#define PROPERTIES_REQUIRED PROP_gimple_leh | PROP_cfg
+#define TODO_FLAGS_START TODO_verify_ssa | TODO_verify_flow | TODO_verify_stmts
+#define TODO_FLAGS_FINISH TODO_verify_ssa | TODO_verify_stmts | TODO_dump_func | TODO_update_ssa | TODO_rebuild_cgraph_edges
+#include "gcc-generate-gimple-pass.h"
+
+static bool stackleak_final_gate(void)
+{
+	return stackleak_track_stack_gate();
+}
+
+#define PASS_NAME stackleak_final
+#define TODO_FLAGS_FINISH TODO_dump_func
+#include "gcc-generate-rtl-pass.h"
+
+__visible int plugin_init(struct plugin_name_args *plugin_info, struct plugin_gcc_version *version)
+{
+	const char * const plugin_name = plugin_info->base_name;
+	const int argc = plugin_info->argc;
+	const struct plugin_argument * const argv = plugin_info->argv;
+	int i;
+
+	static const struct ggc_root_tab gt_ggc_r_gt_stackleak[] = {
+		{
+			.base = &track_function_decl,
+			.nelt = 1,
+			.stride = sizeof(track_function_decl),
+			.cb = &gt_ggc_mx_tree_node,
+			.pchw = &gt_pch_nx_tree_node
+		},
+		{
+			.base = &check_function_decl,
+			.nelt = 1,
+			.stride = sizeof(check_function_decl),
+			.cb = &gt_ggc_mx_tree_node,
+			.pchw = &gt_pch_nx_tree_node
+		},
+		LAST_GGC_ROOT_TAB
+	};
+
+//	PASS_INFO(stackleak_tree_instrument, "tree_profile", 1, PASS_POS_INSERT_BEFORE);
+	PASS_INFO(stackleak_tree_instrument, "optimized", 1, PASS_POS_INSERT_BEFORE);
+	PASS_INFO(stackleak_final, "final", 1, PASS_POS_INSERT_BEFORE);
+
+	if (!plugin_default_version_check(version, &gcc_version)) {
+		error(G_("incompatible gcc/plugin versions"));
+		return 1;
+	}
+
+	register_callback(plugin_name, PLUGIN_INFO, NULL, &stackleak_plugin_info);
+
+	for (i = 0; i < argc; ++i) {
+		if (!strcmp(argv[i].key, "track-lowest-sp")) {
+			if (!argv[i].value) {
+				error(G_("no value supplied for option '-fplugin-arg-%s-%s'"), plugin_name, argv[i].key);
+				continue;
+			}
+			track_frame_size = atoi(argv[i].value);
+			if (argv[i].value[0] < '0' || argv[i].value[0] > '9' || track_frame_size < 0)
+				error(G_("invalid option argument '-fplugin-arg-%s-%s=%s'"), plugin_name, argv[i].key, argv[i].value);
+			continue;
+		}
+		if (!strcmp(argv[i].key, "initialize-locals")) {
+			if (argv[i].value) {
+				error(G_("invalid option argument '-fplugin-arg-%s-%s=%s'"), plugin_name, argv[i].key, argv[i].value);
+				continue;
+			}
+			init_locals = true;
+			continue;
+		}
+		error(G_("unknown option '-fplugin-arg-%s-%s'"), plugin_name, argv[i].key);
+	}
+
+	register_callback(plugin_name, PLUGIN_START_UNIT, &stackleak_start_unit, NULL);
+	register_callback(plugin_name, PLUGIN_REGISTER_GGC_ROOTS, NULL, (void *)&gt_ggc_r_gt_stackleak);
+	register_callback(plugin_name, PLUGIN_PASS_MANAGER_SETUP, NULL, &stackleak_tree_instrument_pass_info);
+	register_callback(plugin_name, PLUGIN_PASS_MANAGER_SETUP, NULL, &stackleak_final_pass_info);
+
+	return 0;
+}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [kernel-hardening] Re: [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the kernel stack at the end of syscalls
  2017-06-09 14:30 [kernel-hardening] [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the kernel stack at the end of syscalls Alexander Popov
@ 2017-06-09 17:28 ` Kees Cook
  2017-06-09 23:00   ` Alexander Popov
                     ` (2 more replies)
  2017-06-20  9:06 ` [kernel-hardening] " Hector Martin "marcan"
  2017-06-20 19:14 ` [kernel-hardening] " Kees Cook
  2 siblings, 3 replies; 30+ messages in thread
From: Kees Cook @ 2017-06-09 17:28 UTC (permalink / raw)
  To: Alexander Popov; +Cc: kernel-hardening, PaX Team, Brad Spengler, Tycho Andersen

On Fri, Jun 9, 2017 at 7:30 AM, Alexander Popov <alex.popov@linux.com> wrote:
> Hello,
>
> My employer Positive Technologies kindly allowed me to spend a part of my
> working time on helping KSPP. So I join the initiative of porting STACKLEAK
> to the mainline, which was started by Tycho Andersen.

Great, thanks for working on this!

> STACKLEAK is a security feature developed by Grsecurity/PaX (kudos to them!),
> which can mitigate the damage from kernel stack leak bugs (see CVE-2010-2963,
> CVE-2016-4569 and CVE-2016-4578).

It might help to distinguish that this covers at least two classes of
issues: stack content exposure (the latter two CVEs above), and it can
block some uninitialized variable attacks (like the mentioned
CVE-2010-2963). In other words, this can be more than just an info
exposure defense.

> I carefully extracted STACKLEAK from the last public patch of Grsecurity/PaX
> and do my best to understand it. So I added some comments describing that
> understanding. You are welcome to discuss it.

Great, adding more comments to the gcc plugin will help others who are
trying to get up to speed on it (like me). :)

> Here are the results of a brief performance test on x86_64:
>
> Hardware: Intel Core i7-4770, 16 GB RAM
>
> Test #1: stress-ng --cpu 4 --io 2 --vm 2 --vm-bytes 2G --timeout 300s
>  --metrics-brief
> Result on 4.11-rc8:
>   cpu bogo ops 269955
>   iosync bogo ops 9809985
>   vm bogo ops 17093608
> Result on 4.11-rc8+stackleak:
>   cpu bogo ops 270106 (+0.6%)
>   iosync bogo ops 9474535 (-3.42%)
>   vm bogo ops 17093608 (the same)
>
> Test #2: hackbench -s 4096 -l 2000 -g 15 -f 25 -P
> Average result on 4.11-rc8: 8.71s
> Average result on 4.11-rc8+stackleak: 9.08s (+4.29%)
>
> Test #3: building the Linux kernel with Ubuntu config (time make -j9)
> Result on 4.11-rc8:
>   real  32m14.893s
>   user  237m30.886s
>   sys   11m12.273s
> Result on 4.11-rc8+stackleak:
>   real  32m26.881s (+0.62%)
>   user  238m38.926s (+0.48%)
>   sys   11m36.426s (+3.59%)

Awesome, and thanks for the benchmarks! That should really help people
understand the trade-offs for using this feature (and is likely worth
mentioning in the Kconfig). Seems like less than 4% overhead, maybe
much less? Real time on build times seems like a tiny difference, but
hackbench shows 4%.

> There is an important aspect of STACKLEAK on i386. PaX Team might know
> about it, I can't say for sure. Dear Grsecurity, is it fine to talk with
> you about such things in that mailing list? Should I do it differently?
>
> The STACKLEAK gcc plugin does not instrument the kernel code for i386. I've
> checked that for the last public patch of Grsecurity/PaX and see the same
> behaviour on my machines. The reason: the ix86_cmodel for the Linux kernel
> on i386 is not CM_KERNEL. So the STACKLEAK plugin seems to skip the
> instrumentation for that platform. As a result, on i386 erase_kstack()
> always starts to search for the bottom of the stack from the top minus 128.

What version of gcc did you use for these builds, btw? Perhaps
something changed there?

> See the results of the same performance tests on i386:
>
> Test #1: stress-ng --cpu 4 --io 2 --vm 2 --vm-bytes 2G --timeout 300s
>  --metrics-brief
> Result on 4.11-rc8:
>   cpu bogo ops 207754
>   iosync bogo ops 9442815
>   vm bogo ops 8546804
> Result on 4.11-rc8+stackleak:
>   cpu bogo ops 206061 (-0.81%)
>   iosync bogo ops 9435139 (-0.08%)
>   vm bogo ops 8546804 (the same)
>
> Test #2: hackbench -s 4096 -l 2000 -g 15 -f 25 -P
> Average result on 4.11-rc8: 8.197s
> Average result on 4.11-rc8+stackleak: 9.134s (+11.43%)
>
> Test #3: building the Linux kernel with Ubuntu config (time make -j9)
> Result on 4.11-rc8:
>   real  18m15.372s
>   user  129m58.169s
>   sys   8m27.884s
> Result on 4.11-rc8+stackleak:
>   real  18m34.244s (+1.72%)
>   user  132m33.843s (+2.00%)
>   sys   8m37.658s (+1.92%)
>
> More things to be done:
>  - understand how the STACKLEAK gcc plugin works;
>  - develop tests for STACKLEAK.

Since this is mostly an anti-exposure defense, LKDTM is probably not a
good match (i.e. organizing a test for the uninit variable case can be
very fragile). I think something similar to test_user_copy.c would be
better.

> From d22af45233b2f6d657a29dcb1815b35a5a45c539 Mon Sep 17 00:00:00 2001
> From: Alexander Popov <alex.popov@linux.com>
> Date: Fri, 9 Jun 2017 15:21:16 +0300
> Subject: [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the
>  kernel stack at the end of syscalls
>
> The stackleak feature erases the kernel stack before returning from syscalls.
> That reduces the information which a kernel stack leak bug can reveal.
>
> This feature consists of:
>   - the architecture-specific code filling the used part of the kernel stack
>   with a poison value before returning to the userspace (currently only
>   for i386 and x86_64);
>   - the gcc plugin for tracking the lowest border of the kernel stack. It
>   instruments the kernel code inserting the track_stack() function call if
>   a stack frame size is over a specified limit.
>
> The stackleak feature is ported from grsecurity/PaX. For more information see:
>   https://grsecurity.net/
>   https://pax.grsecurity.net/
>
> This code is verbatim from Brad Spengler/PaX Team's code in the last public

Maybe "nearly verbatim" since Kconfigs and such were (correctly)
renamed/expanded, and comments were added.

> patch of grsecurity/PaX based on our understanding of the code. Changes or
> omissions from the original code are ours and don't reflect the original
> grsecurity/PaX code.
>
> Signed-off-by: Alexander Popov <alex.popov@linux.com>
> Signed-off-by: Tycho Andersen <tycho@docker.com>

This S-o-b is not right, I think it should just be yours. Perhaps say
something like "Continued upstreaming work started by Tycho Anderson."
in the commit log.

> ---
>  arch/Kconfig                           |  20 ++
>  arch/x86/Kconfig                       |   1 +
>  arch/x86/entry/common.c                |  17 +-
>  arch/x86/entry/entry_32.S              |  71 +++++++
>  arch/x86/entry/entry_64.S              |  99 ++++++++++
>  arch/x86/entry/entry_64_compat.S       |   8 +
>  arch/x86/include/asm/processor.h       |   4 +
>  arch/x86/kernel/asm-offsets.c          |   9 +
>  arch/x86/kernel/dumpstack_32.c         |  12 ++
>  arch/x86/kernel/dumpstack_64.c         |  33 ++++
>  arch/x86/kernel/process_32.c           |   5 +
>  arch/x86/kernel/process_64.c           |   5 +
>  fs/exec.c                              |  17 ++
>  scripts/Makefile.gcc-plugins           |   3 +
>  scripts/gcc-plugins/stackleak_plugin.c | 342 +++++++++++++++++++++++++++++++++
>  15 files changed, 643 insertions(+), 3 deletions(-)
>  create mode 100644 scripts/gcc-plugins/stackleak_plugin.c
>
> diff --git a/arch/Kconfig b/arch/Kconfig
> index cd211a1..a209bd5 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -436,6 +436,26 @@ config GCC_PLUGIN_STRUCTLEAK_VERBOSE
>           initialized. Since not all existing initializers are detected
>           by the plugin, this can produce false positive warnings.
>
> +config ARCH_HAS_STACKLEAK
> +       def_bool n

This can just be "bool" (it will default off).

> +       help
> +         An architecture should select this if it has the code which
> +         fills the used part of the kernel stack with a poison value
> +         before returning from system calls.

Maybe specifically mention the -0xBEEF value?

> +
> +config STACKLEAK

I would follow the naming of the others, and call this GCC_PLUGIN_STACKLEAK

> +       bool "Erase the kernel stack before returning from syscalls"
> +       depends on GCC_PLUGINS
> +       depends on ARCH_HAS_STACKLEAK
> +       help
> +         This option makes the kernel erase the kernel stack before it
> +         returns from a system call. That reduces the information which
> +         a kernel stack leak bug can reveal.
> +
> +         This plugin was ported from grsecurity/PaX. More information at:
> +          * https://grsecurity.net/
> +          * https://pax.grsecurity.net/

Is Kconfig smart enough to include this in the gcc plugins sub-page
when the ARCH_HAS_STACKLEAK config is between this and the others?

> +
>  config HAVE_CC_STACKPROTECTOR
>         bool
>         help
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index cc98d5a..5988a5f 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -56,6 +56,7 @@ config X86
>         select ARCH_HAS_PMEM_API                if X86_64
>         select ARCH_HAS_SET_MEMORY
>         select ARCH_HAS_SG_CHAIN
> +       select ARCH_HAS_STACKLEAK
>         select ARCH_HAS_STRICT_KERNEL_RWX
>         select ARCH_HAS_STRICT_MODULE_RWX
>         select ARCH_HAS_UBSAN_SANITIZE_ALL
> diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
> index 370c42c..8ff0a84 100644
> --- a/arch/x86/entry/common.c
> +++ b/arch/x86/entry/common.c
> @@ -43,6 +43,12 @@ __visible inline void enter_from_user_mode(void)
>  static inline void enter_from_user_mode(void) {}
>  #endif
>
> +#ifdef CONFIG_STACKLEAK
> +asmlinkage void erase_kstack(void);
> +#else
> +static void erase_kstack(void) {}
> +#endif
> +
>  static void do_audit_syscall_entry(struct pt_regs *regs, u32 arch)
>  {
>  #ifdef CONFIG_X86_64
> @@ -79,8 +85,10 @@ static long syscall_trace_enter(struct pt_regs *regs)
>                 emulated = true;
>
>         if ((emulated || (work & _TIF_SYSCALL_TRACE)) &&
> -           tracehook_report_syscall_entry(regs))
> +           tracehook_report_syscall_entry(regs)) {
> +               erase_kstack();
>                 return -1L;
> +       }
>
>         if (emulated)
>                 return -1L;
> @@ -114,9 +122,11 @@ static long syscall_trace_enter(struct pt_regs *regs)
>                         sd.args[5] = regs->bp;
>                 }
>
> -               ret = __secure_computing(&sd);
> -               if (ret == -1)
> +               ret = secure_computing(&sd);
> +               if (ret == -1) {
> +                       erase_kstack();
>                         return ret;
> +               }
>         }
>  #endif

Since this is x86-specific, maybe Cc x86@kernel.org and/or Andy
Lutomirski on follow-up versions. I'm sure they'll have opinions about
changes to the entry code. :)

It seems like it shouldn't be too hard to add on-user-return erasure
code to other architectures too.

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [kernel-hardening] Re: [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the kernel stack at the end of syscalls
  2017-06-09 17:28 ` [kernel-hardening] " Kees Cook
@ 2017-06-09 23:00   ` Alexander Popov
  2017-06-20 19:20     ` Kees Cook
  2017-06-13 21:51   ` Laura Abbott
  2017-06-23 22:48   ` [kernel-hardening] Re: [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the kernel stack at the end of syscalls Tycho Andersen
  2 siblings, 1 reply; 30+ messages in thread
From: Alexander Popov @ 2017-06-09 23:00 UTC (permalink / raw)
  To: Kees Cook; +Cc: kernel-hardening, PaX Team, Brad Spengler, Tycho Andersen

On 09.06.2017 20:28, Kees Cook wrote:
> On Fri, Jun 9, 2017 at 7:30 AM, Alexander Popov <alex.popov@linux.com> wrote:
>> Hello,
>>
>> My employer Positive Technologies kindly allowed me to spend a part of my
>> working time on helping KSPP. So I join the initiative of porting STACKLEAK
>> to the mainline, which was started by Tycho Andersen.
> 
> Great, thanks for working on this!

Hello Kees, thanks for your prompt reply.

>> STACKLEAK is a security feature developed by Grsecurity/PaX (kudos to them!),
>> which can mitigate the damage from kernel stack leak bugs (see CVE-2010-2963,
>> CVE-2016-4569 and CVE-2016-4578).
> 
> It might help to distinguish that this covers at least two classes of
> issues: stack content exposure (the latter two CVEs above), and it can
> block some uninitialized variable attacks (like the mentioned
> CVE-2010-2963). In other words, this can be more than just an info
> exposure defense.

Thanks, I'll improve the description in the next version.

>> I carefully extracted STACKLEAK from the last public patch of Grsecurity/PaX
>> and do my best to understand it. So I added some comments describing that
>> understanding. You are welcome to discuss it.
> 
> Great, adding more comments to the gcc plugin will help others who are
> trying to get up to speed on it (like me). :)

Currently I annotated the arch-specific asm part of the STACKLEAK feature.
However, there are a few questions in my comments marked with TODO.
Playing with the gcc plugin is the next step.

>> Here are the results of a brief performance test on x86_64:
>>
>> Hardware: Intel Core i7-4770, 16 GB RAM
>>
>> Test #1: stress-ng --cpu 4 --io 2 --vm 2 --vm-bytes 2G --timeout 300s
>>  --metrics-brief
>> Result on 4.11-rc8:
>>   cpu bogo ops 269955
>>   iosync bogo ops 9809985
>>   vm bogo ops 17093608
>> Result on 4.11-rc8+stackleak:
>>   cpu bogo ops 270106 (+0.6%)
>>   iosync bogo ops 9474535 (-3.42%)
>>   vm bogo ops 17093608 (the same)
>>
>> Test #2: hackbench -s 4096 -l 2000 -g 15 -f 25 -P
>> Average result on 4.11-rc8: 8.71s
>> Average result on 4.11-rc8+stackleak: 9.08s (+4.29%)
>>
>> Test #3: building the Linux kernel with Ubuntu config (time make -j9)
>> Result on 4.11-rc8:
>>   real  32m14.893s
>>   user  237m30.886s
>>   sys   11m12.273s
>> Result on 4.11-rc8+stackleak:
>>   real  32m26.881s (+0.62%)
>>   user  238m38.926s (+0.48%)
>>   sys   11m36.426s (+3.59%)
> 
> Awesome, and thanks for the benchmarks! That should really help people
> understand the trade-offs for using this feature (and is likely worth
> mentioning in the Kconfig). Seems like less than 4% overhead, maybe
> much less? Real time on build times seems like a tiny difference, but
> hackbench shows 4%.

Yes, the performance penalty of STACKLEAK differs a lot depending on the
kind of load. Do you have any idea which test can give a bigger slowdown?
It should be some rapid syscall exhausting the kernel stack hard.

>> There is an important aspect of STACKLEAK on i386. PaX Team might know
>> about it, I can't say for sure. Dear Grsecurity, is it fine to talk with
>> you about such things in that mailing list? Should I do it differently?
>>
>> The STACKLEAK gcc plugin does not instrument the kernel code for i386. I've
>> checked that for the last public patch of Grsecurity/PaX and see the same
>> behaviour on my machines. The reason: the ix86_cmodel for the Linux kernel
>> on i386 is not CM_KERNEL. So the STACKLEAK plugin seems to skip the
>> instrumentation for that platform. As a result, on i386 erase_kstack()
>> always starts to search for the bottom of the stack from the top minus 128.
> 
> What version of gcc did you use for these builds, btw? Perhaps
> something changed there?

I tried to compile the Linux kernel for i386 with two different versions of gcc:

#gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/6/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro
6.3.0-18ubuntu2~16.04' --with-bugurl=file:///usr/share/doc/gcc-6/README.Bugs
--enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr
--program-suffix=-6 --program-prefix=x86_64-linux-gnu- --enable-shared
--enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext
--enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/
--enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify
--enable-libmpx --enable-plugin --with-system-zlib --disable-browser-plugin
--enable-java-awt=gtk --enable-gtk-cairo
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-6-amd64/jre --enable-java-home
--with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-6-amd64
--with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-6-amd64 --with-arch-directory=amd64
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --with-target-system-zlib
--enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686
--with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic
--enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu
--target=x86_64-linux-gnu
Thread model: posix
gcc version 6.3.0 20170519 (Ubuntu/Linaro 6.3.0-18ubuntu2~16.04)

gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/i686-linux-gnu/5/lto-wrapper
Target: i686-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 5.4.0-6ubuntu1~16.04.4'
--with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs
--enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr
--program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib
--without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls
--with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify
--enable-libmpx --enable-plugin --with-system-zlib --disable-browser-plugin
--enable-java-awt=gtk --enable-gtk-cairo
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-i386/jre --enable-java-home
--with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-i386
--with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-i386 --with-arch-directory=i386
--with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-targets=all
--enable-multiarch --disable-werror --with-arch-32=i686 --with-multilib-list=m32,m64,mx32
--enable-multilib --with-tune=generic --enable-checking=release --build=i686-linux-gnu
--host=i686-linux-gnu --target=i686-linux-gnu
Thread model: posix
gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4)

Both of them compiled it with ix86_cmodel=0, so the STACKLEAK instrumentation
was skipped.

>> See the results of the same performance tests on i386:
>>
>> Test #1: stress-ng --cpu 4 --io 2 --vm 2 --vm-bytes 2G --timeout 300s
>>  --metrics-brief
>> Result on 4.11-rc8:
>>   cpu bogo ops 207754
>>   iosync bogo ops 9442815
>>   vm bogo ops 8546804
>> Result on 4.11-rc8+stackleak:
>>   cpu bogo ops 206061 (-0.81%)
>>   iosync bogo ops 9435139 (-0.08%)
>>   vm bogo ops 8546804 (the same)
>>
>> Test #2: hackbench -s 4096 -l 2000 -g 15 -f 25 -P
>> Average result on 4.11-rc8: 8.197s
>> Average result on 4.11-rc8+stackleak: 9.134s (+11.43%)
>>
>> Test #3: building the Linux kernel with Ubuntu config (time make -j9)
>> Result on 4.11-rc8:
>>   real  18m15.372s
>>   user  129m58.169s
>>   sys   8m27.884s
>> Result on 4.11-rc8+stackleak:
>>   real  18m34.244s (+1.72%)
>>   user  132m33.843s (+2.00%)
>>   sys   8m37.658s (+1.92%)
>>
>> More things to be done:
>>  - understand how the STACKLEAK gcc plugin works;
>>  - develop tests for STACKLEAK.
> 
> Since this is mostly an anti-exposure defense, LKDTM is probably not a
> good match (i.e. organizing a test for the uninit variable case can be
> very fragile). I think something similar to test_user_copy.c would be
> better.

Ok, thanks for the clue.

>> From d22af45233b2f6d657a29dcb1815b35a5a45c539 Mon Sep 17 00:00:00 2001
>> From: Alexander Popov <alex.popov@linux.com>
>> Date: Fri, 9 Jun 2017 15:21:16 +0300
>> Subject: [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the
>>  kernel stack at the end of syscalls
>>
>> The stackleak feature erases the kernel stack before returning from syscalls.
>> That reduces the information which a kernel stack leak bug can reveal.
>>
>> This feature consists of:
>>   - the architecture-specific code filling the used part of the kernel stack
>>   with a poison value before returning to the userspace (currently only
>>   for i386 and x86_64);
>>   - the gcc plugin for tracking the lowest border of the kernel stack. It
>>   instruments the kernel code inserting the track_stack() function call if
>>   a stack frame size is over a specified limit.
>>
>> The stackleak feature is ported from grsecurity/PaX. For more information see:
>>   https://grsecurity.net/
>>   https://pax.grsecurity.net/
>>
>> This code is verbatim from Brad Spengler/PaX Team's code in the last public
> 
> Maybe "nearly verbatim" since Kconfigs and such were (correctly)
> renamed/expanded, and comments were added.

Agree, I'll fix it.

>> patch of grsecurity/PaX based on our understanding of the code. Changes or
>> omissions from the original code are ours and don't reflect the original
>> grsecurity/PaX code.
>>
>> Signed-off-by: Alexander Popov <alex.popov@linux.com>
>> Signed-off-by: Tycho Andersen <tycho@docker.com>
> 
> This S-o-b is not right, I think it should just be yours. Perhaps say
> something like "Continued upstreaming work started by Tycho Anderson."
> in the commit log.

Ok, excuse me.
Tycho looked at this patch and tested it. I had to use "Tested-by".

>> ---
>>  arch/Kconfig                           |  20 ++
>>  arch/x86/Kconfig                       |   1 +
>>  arch/x86/entry/common.c                |  17 +-
>>  arch/x86/entry/entry_32.S              |  71 +++++++
>>  arch/x86/entry/entry_64.S              |  99 ++++++++++
>>  arch/x86/entry/entry_64_compat.S       |   8 +
>>  arch/x86/include/asm/processor.h       |   4 +
>>  arch/x86/kernel/asm-offsets.c          |   9 +
>>  arch/x86/kernel/dumpstack_32.c         |  12 ++
>>  arch/x86/kernel/dumpstack_64.c         |  33 ++++
>>  arch/x86/kernel/process_32.c           |   5 +
>>  arch/x86/kernel/process_64.c           |   5 +
>>  fs/exec.c                              |  17 ++
>>  scripts/Makefile.gcc-plugins           |   3 +
>>  scripts/gcc-plugins/stackleak_plugin.c | 342 +++++++++++++++++++++++++++++++++
>>  15 files changed, 643 insertions(+), 3 deletions(-)
>>  create mode 100644 scripts/gcc-plugins/stackleak_plugin.c
>>
>> diff --git a/arch/Kconfig b/arch/Kconfig
>> index cd211a1..a209bd5 100644
>> --- a/arch/Kconfig
>> +++ b/arch/Kconfig
>> @@ -436,6 +436,26 @@ config GCC_PLUGIN_STRUCTLEAK_VERBOSE
>>           initialized. Since not all existing initializers are detected
>>           by the plugin, this can produce false positive warnings.
>>
>> +config ARCH_HAS_STACKLEAK
>> +       def_bool n
> 
> This can just be "bool" (it will default off).

Ok.

>> +       help
>> +         An architecture should select this if it has the code which
>> +         fills the used part of the kernel stack with a poison value
>> +         before returning from system calls.
> 
> Maybe specifically mention the -0xBEEF value?

Ok. Should I create some macro for it?

>> +
>> +config STACKLEAK
> 
> I would follow the naming of the others, and call this GCC_PLUGIN_STACKLEAK

It seems to me that GCC_PLUGIN_STACKLEAK is not a right name since the whole
feature consists of two parts: the arch-specific asm code actually cleaning
the kernel stack and the gcc plugin which helps to do it faster and more
reliable. What do you think?

>> +       bool "Erase the kernel stack before returning from syscalls"
>> +       depends on GCC_PLUGINS
>> +       depends on ARCH_HAS_STACKLEAK
>> +       help
>> +         This option makes the kernel erase the kernel stack before it
>> +         returns from a system call. That reduces the information which
>> +         a kernel stack leak bug can reveal.
>> +
>> +         This plugin was ported from grsecurity/PaX. More information at:
>> +          * https://grsecurity.net/
>> +          * https://pax.grsecurity.net/
> 
> Is Kconfig smart enough to include this in the gcc plugins sub-page
> when the ARCH_HAS_STACKLEAK config is between this and the others?

You are right, it's outside gcc plugins subpage. If we finally choose the
name GCC_PLUGIN_STACKLEAK, I'll put it inside.

>> +
>>  config HAVE_CC_STACKPROTECTOR
>>         bool
>>         help
>> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
>> index cc98d5a..5988a5f 100644
>> --- a/arch/x86/Kconfig
>> +++ b/arch/x86/Kconfig
>> @@ -56,6 +56,7 @@ config X86
>>         select ARCH_HAS_PMEM_API                if X86_64
>>         select ARCH_HAS_SET_MEMORY
>>         select ARCH_HAS_SG_CHAIN
>> +       select ARCH_HAS_STACKLEAK
>>         select ARCH_HAS_STRICT_KERNEL_RWX
>>         select ARCH_HAS_STRICT_MODULE_RWX
>>         select ARCH_HAS_UBSAN_SANITIZE_ALL
>> diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
>> index 370c42c..8ff0a84 100644
>> --- a/arch/x86/entry/common.c
>> +++ b/arch/x86/entry/common.c
>> @@ -43,6 +43,12 @@ __visible inline void enter_from_user_mode(void)
>>  static inline void enter_from_user_mode(void) {}
>>  #endif
>>
>> +#ifdef CONFIG_STACKLEAK
>> +asmlinkage void erase_kstack(void);
>> +#else
>> +static void erase_kstack(void) {}
>> +#endif
>> +
>>  static void do_audit_syscall_entry(struct pt_regs *regs, u32 arch)
>>  {
>>  #ifdef CONFIG_X86_64
>> @@ -79,8 +85,10 @@ static long syscall_trace_enter(struct pt_regs *regs)
>>                 emulated = true;
>>
>>         if ((emulated || (work & _TIF_SYSCALL_TRACE)) &&
>> -           tracehook_report_syscall_entry(regs))
>> +           tracehook_report_syscall_entry(regs)) {
>> +               erase_kstack();
>>                 return -1L;
>> +       }
>>
>>         if (emulated)
>>                 return -1L;
>> @@ -114,9 +122,11 @@ static long syscall_trace_enter(struct pt_regs *regs)
>>                         sd.args[5] = regs->bp;
>>                 }
>>
>> -               ret = __secure_computing(&sd);
>> -               if (ret == -1)
>> +               ret = secure_computing(&sd);
>> +               if (ret == -1) {
>> +                       erase_kstack();
>>                         return ret;
>> +               }
>>         }
>>  #endif
> 
> Since this is x86-specific, maybe Cc x86@kernel.org and/or Andy
> Lutomirski on follow-up versions. I'm sure they'll have opinions about
> changes to the entry code. :)

Thanks! Of course, I'll do it for the next version.

> It seems like it shouldn't be too hard to add on-user-return erasure
> code to other architectures too.

Yes, maybe. At the same time the gcc plugin might somehow depend on x86.
I'll investigate that.

Best regards,
Alexander

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [kernel-hardening] Re: [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the kernel stack at the end of syscalls
  2017-06-09 17:28 ` [kernel-hardening] " Kees Cook
  2017-06-09 23:00   ` Alexander Popov
@ 2017-06-13 21:51   ` Laura Abbott
  2017-06-20 11:20     ` Mark Rutland
  2017-06-23 22:48   ` [kernel-hardening] Re: [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the kernel stack at the end of syscalls Tycho Andersen
  2 siblings, 1 reply; 30+ messages in thread
From: Laura Abbott @ 2017-06-13 21:51 UTC (permalink / raw)
  To: Kees Cook, Alexander Popov
  Cc: kernel-hardening, PaX Team, Brad Spengler, Tycho Andersen

On 06/09/2017 10:28 AM, Kees Cook wrote:
> It seems like it shouldn't be too hard to add on-user-return erasure
> code to other architectures too.

I played around getting this to compile for arm64 with a dummy
stack clearing function. arm64 is doing something special with the
efistub so it fails to link with

drivers/firmware/efi/libstub/arm-stub.c:45:(.init.text+0x54): relocation
	truncated to fit: R_AARCH64_CALL26 against undefined symbol `__efistub_track_stack'

The relocation to the .init.text section and appending __efistub happens after
compilation so the checks in the plugin itself don't work. I haven't come up
with a solution to not have the plugin run on the stub yet.

Thanks,
Laura

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [kernel-hardening] [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the kernel stack at the end of syscalls
  2017-06-09 14:30 [kernel-hardening] [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the kernel stack at the end of syscalls Alexander Popov
  2017-06-09 17:28 ` [kernel-hardening] " Kees Cook
@ 2017-06-20  9:06 ` Hector Martin "marcan"
  2017-06-20 19:07   ` Kees Cook
  2017-06-20 19:14 ` [kernel-hardening] " Kees Cook
  2 siblings, 1 reply; 30+ messages in thread
From: Hector Martin "marcan" @ 2017-06-20  9:06 UTC (permalink / raw)
  To: Alexander Popov, kernel-hardening, keescook, pageexec, spender, tycho

On 09/06/17 23:30, Alexander Popov wrote:
> +/*
> + * Copyright 2011-2017 by the PaX Team <pageexec@freemail.hu>
> + * Licensed under the GPL v2
> + *
> + * Note: the choice of the license means that the compilation process is
> + *       NOT 'eligible' as defined by gcc's library exception to the GPL v3,
> + *       but for the kernel it doesn't matter since it doesn't link against
> + *       any of the gcc libraries

I know this isn't the first time I've mentioned this, but I think it 
bears repeating that, as far as I understand the licensing (IANAL), GCC 
plug-ins licensed as GPLv2 are in violation of the GCC license.

See [1], which says that this plug-in and GCC form a "single combined 
program", and [2], which says that in this case the plug-in must be 
GPL-compatible. Though not specified in this informal FAQ entry, for 
GPLv3-licensed GCC, this obviously means GPLv3-compatible, and the GPLv2 
is *not* GPLv3-compatible ([3], especially the second paragraph).

In addition, orthogonally, the motivation behind this licensing 
(preventing userspace code from using these plug-ins through a gross 
abuse of the anti-proprietary-plugin licensing that libgcc uses) also 
prevents them from being used with some kernel architectures that *do* 
link against libgcc.

This can all be fixed by requesting that the plug-ins be relicensed as 
"GPLv2 or later", although evidently this goes against the interests of 
the original author (PaX team) of restricting usage of these plug-ins, 
so I doubt they will do that.

Alternatively, the plug-in can be rewritten to operate on an *extracted* 
intermediate representation of the GCC internal state, with a 
GPLv3-compatible shim used as a GCC plug-in to extract said state, and 
the actual core of the transformation done in a separate binary that 
does not link with GCC and may use any license. This approach is what 
PaX team *should* have done from the beginning in order to use their 
anti-userspace licensing hack without violating any licenses, IMO. This 
would still disallow the plug-in from being used on kernel architectures 
that do link libgcc, and on userspace code that does the same.

Failing those, the only remaining option I see is a clean rewrite of the 
plug-ins without using any of the original code, and licensing that as 
GPLv2+.

[1] https://www.gnu.org/licenses/gpl-faq.en.html#GPLPlugins
[2] https://www.gnu.org/licenses/gpl-faq.en.html#GPLAndPlugins
[3] https://www.gnu.org/licenses/gpl-faq.en.html#v2v3Compatibility

-- 
Hector Martin "marcan"
Public key: https://mrcn.st/pub

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [kernel-hardening] Re: [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the kernel stack at the end of syscalls
  2017-06-13 21:51   ` Laura Abbott
@ 2017-06-20 11:20     ` Mark Rutland
  2017-06-20 14:13       ` Ard Biesheuvel
  2017-06-20 19:11       ` Kees Cook
  0 siblings, 2 replies; 30+ messages in thread
From: Mark Rutland @ 2017-06-20 11:20 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Kees Cook, Alexander Popov, kernel-hardening, PaX Team,
	Brad Spengler, Tycho Andersen, ard.biesheuvel

On Tue, Jun 13, 2017 at 02:51:59PM -0700, Laura Abbott wrote:
> On 06/09/2017 10:28 AM, Kees Cook wrote:
> > It seems like it shouldn't be too hard to add on-user-return erasure
> > code to other architectures too.
> 
> I played around getting this to compile for arm64 with a dummy
> stack clearing function. arm64 is doing something special with the
> efistub so it fails to link with
> 
> drivers/firmware/efi/libstub/arm-stub.c:45:(.init.text+0x54): relocation
> 	truncated to fit: R_AARCH64_CALL26 against undefined symbol `__efistub_track_stack'
> 
> The relocation to the .init.text section and appending __efistub happens after
> compilation so the checks in the plugin itself don't work. I haven't come up
> with a solution to not have the plugin run on the stub yet.

Can we do something like what we do for KCOV, and (only) place the
plugin-invoking flags to into CFLAGS_STACKLEAK, which we can filter out
in scripts/Makefile.lib?

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [kernel-hardening] Re: [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the kernel stack at the end of syscalls
  2017-06-20 11:20     ` Mark Rutland
@ 2017-06-20 14:13       ` Ard Biesheuvel
  2017-06-20 19:11       ` Kees Cook
  1 sibling, 0 replies; 30+ messages in thread
From: Ard Biesheuvel @ 2017-06-20 14:13 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Laura Abbott, Kees Cook, Alexander Popov, kernel-hardening,
	PaX Team, Brad Spengler, Tycho Andersen

On 20 June 2017 at 13:20, Mark Rutland <mark.rutland@arm.com> wrote:
> On Tue, Jun 13, 2017 at 02:51:59PM -0700, Laura Abbott wrote:
>> On 06/09/2017 10:28 AM, Kees Cook wrote:
>> > It seems like it shouldn't be too hard to add on-user-return erasure
>> > code to other architectures too.
>>
>> I played around getting this to compile for arm64 with a dummy
>> stack clearing function. arm64 is doing something special with the
>> efistub so it fails to link with
>>
>> drivers/firmware/efi/libstub/arm-stub.c:45:(.init.text+0x54): relocation
>>       truncated to fit: R_AARCH64_CALL26 against undefined symbol `__efistub_track_stack'
>>
>> The relocation to the .init.text section and appending __efistub happens after
>> compilation so the checks in the plugin itself don't work. I haven't come up
>> with a solution to not have the plugin run on the stub yet.
>
> Can we do something like what we do for KCOV, and (only) place the
> plugin-invoking flags to into CFLAGS_STACKLEAK, which we can filter out
> in scripts/Makefile.lib?
>

I agree. This instrumentation serves no purpose in the EFI stub, given
that there are no syscalls to return from, and so it would be better
if we could prevent the plugin from seeing these source files to begin
with. In fact, I could imagine that there are certain hot paths in
kthreads that would benefit from having this disabled as well.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [kernel-hardening] [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the kernel stack at the end of syscalls
  2017-06-20  9:06 ` [kernel-hardening] " Hector Martin "marcan"
@ 2017-06-20 19:07   ` Kees Cook
  2017-06-20 20:22     ` Hector Martin "marcan"
  0 siblings, 1 reply; 30+ messages in thread
From: Kees Cook @ 2017-06-20 19:07 UTC (permalink / raw)
  To: Hector Martin marcan
  Cc: Alexander Popov, kernel-hardening, PaX Team, Brad Spengler,
	Tycho Andersen

On Tue, Jun 20, 2017 at 2:06 AM, Hector Martin "marcan"
<marcan@marcan.st> wrote:
> On 09/06/17 23:30, Alexander Popov wrote:
>>
>> +/*
>> + * Copyright 2011-2017 by the PaX Team <pageexec@freemail.hu>
>> + * Licensed under the GPL v2
>> + *
>> + * Note: the choice of the license means that the compilation process is
>> + *       NOT 'eligible' as defined by gcc's library exception to the GPL
>> v3,
>> + *       but for the kernel it doesn't matter since it doesn't link
>> against
>> + *       any of the gcc libraries
>
>
> I know this isn't the first time I've mentioned this, but I think it bears
> repeating that, as far as I understand the licensing (IANAL), GCC plug-ins
> licensed as GPLv2 are in violation of the GCC license.

I continue to think that the wording is unclear and that rational
people can come to different conclusions. To guide me I try to look at
the intent of the license ("stay Free Software") and the decisions of
the author ("this should be GPLv2"). Since the authors are quite clear
about their rationale, and I don't see anything that appears to be
trying to dodge the plugin being Free Software, I stand by my earlier
conclusion[1].

-Kees

[1] http://www.openwall.com/lists/kernel-hardening/2016/05/31/1

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [kernel-hardening] Re: [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the kernel stack at the end of syscalls
  2017-06-20 11:20     ` Mark Rutland
  2017-06-20 14:13       ` Ard Biesheuvel
@ 2017-06-20 19:11       ` Kees Cook
  2017-06-21  9:24         ` Mark Rutland
  1 sibling, 1 reply; 30+ messages in thread
From: Kees Cook @ 2017-06-20 19:11 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Laura Abbott, Alexander Popov, kernel-hardening, PaX Team,
	Brad Spengler, Tycho Andersen, Ard Biesheuvel

On Tue, Jun 20, 2017 at 4:20 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> On Tue, Jun 13, 2017 at 02:51:59PM -0700, Laura Abbott wrote:
>> On 06/09/2017 10:28 AM, Kees Cook wrote:
>> > It seems like it shouldn't be too hard to add on-user-return erasure
>> > code to other architectures too.
>>
>> I played around getting this to compile for arm64 with a dummy
>> stack clearing function. arm64 is doing something special with the
>> efistub so it fails to link with
>>
>> drivers/firmware/efi/libstub/arm-stub.c:45:(.init.text+0x54): relocation
>>       truncated to fit: R_AARCH64_CALL26 against undefined symbol `__efistub_track_stack'
>>
>> The relocation to the .init.text section and appending __efistub happens after
>> compilation so the checks in the plugin itself don't work. I haven't come up
>> with a solution to not have the plugin run on the stub yet.
>
> Can we do something like what we do for KCOV, and (only) place the
> plugin-invoking flags to into CFLAGS_STACKLEAK, which we can filter out
> in scripts/Makefile.lib?

That seems fine to me, though plugins have tended to provide a
DISABLE_foo_bar_PLUGIN export (like used in ppc for turning off the
latent entropy plugin):

scripts/Makefile.gcc-plugins:    DISABLE_LATENT_ENTROPY_PLUGIN +=
-fplugin-arg-latent_entropy_plugin-disable

arch/powerpc/kernel/Makefile:CFLAGS_cputable.o +=
$(DISABLE_LATENT_ENTROPY_PLUGIN)
arch/powerpc/kernel/Makefile:CFLAGS_prom_init.o +=
$(DISABLE_LATENT_ENTROPY_PLUGIN)
...


-Kees


-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [kernel-hardening] Re: [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the kernel stack at the end of syscalls
  2017-06-09 14:30 [kernel-hardening] [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the kernel stack at the end of syscalls Alexander Popov
  2017-06-09 17:28 ` [kernel-hardening] " Kees Cook
  2017-06-20  9:06 ` [kernel-hardening] " Hector Martin "marcan"
@ 2017-06-20 19:14 ` Kees Cook
  2 siblings, 0 replies; 30+ messages in thread
From: Kees Cook @ 2017-06-20 19:14 UTC (permalink / raw)
  To: Alexander Popov; +Cc: kernel-hardening, PaX Team, Brad Spengler, Tycho Andersen

On Fri, Jun 9, 2017 at 7:30 AM, Alexander Popov <alex.popov@linux.com> wrote:
> diff --git a/scripts/gcc-plugins/stackleak_plugin.c b/scripts/gcc-plugins/stackleak_plugin.c
> new file mode 100644
> index 0000000..2ee49c4
> --- /dev/null
> +++ b/scripts/gcc-plugins/stackleak_plugin.c
> [...]
> + * TODO:
> + * - initialize all local variables
> [...]
> +static bool init_locals;
> +
> +static struct plugin_info stackleak_plugin_info = {
> +       .version        = "201602181345",
> +       .help           = "track-lowest-sp=nn\ttrack sp in functions whose frame size is at least nn bytes\n"
> +//                       "initialize-locals\t\tforcibly initialize all stack frames\n"
> +};
> [...]
> +       for (i = 0; i < argc; ++i) {
> +               if (!strcmp(argv[i].key, "track-lowest-sp")) {
> +                       if (!argv[i].value) {
> +                               error(G_("no value supplied for option '-fplugin-arg-%s-%s'"), plugin_name, argv[i].key);
> +                               continue;
> +                       }
> +                       track_frame_size = atoi(argv[i].value);
> +                       if (argv[i].value[0] < '0' || argv[i].value[0] > '9' || track_frame_size < 0)
> +                               error(G_("invalid option argument '-fplugin-arg-%s-%s=%s'"), plugin_name, argv[i].key, argv[i].value);
> +                       continue;
> +               }
> +               if (!strcmp(argv[i].key, "initialize-locals")) {
> +                       if (argv[i].value) {
> +                               error(G_("invalid option argument '-fplugin-arg-%s-%s=%s'"), plugin_name, argv[i].key, argv[i].value);
> +                               continue;
> +                       }
> +                       init_locals = true;
> +                       continue;
> +               }
> +               error(G_("unknown option '-fplugin-arg-%s-%s'"), plugin_name, argv[i].key);
> +       }

Just noticed this: there is an undocumented "initialize-locals" plugin
argument that set an unused init_locals variables. (And it's
unused-ness matches the TODO at the top.)

Probably the unused arg/var should be dropped (or better yet:
implemented to do the initialization).

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [kernel-hardening] Re: [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the kernel stack at the end of syscalls
  2017-06-09 23:00   ` Alexander Popov
@ 2017-06-20 19:20     ` Kees Cook
  0 siblings, 0 replies; 30+ messages in thread
From: Kees Cook @ 2017-06-20 19:20 UTC (permalink / raw)
  To: Alexander Popov; +Cc: kernel-hardening, PaX Team, Brad Spengler, Tycho Andersen

On Fri, Jun 9, 2017 at 4:00 PM, Alexander Popov <alex.popov@linux.com> wrote:
> On 09.06.2017 20:28, Kees Cook wrote:
>> Awesome, and thanks for the benchmarks! That should really help people
>> understand the trade-offs for using this feature (and is likely worth
>> mentioning in the Kconfig). Seems like less than 4% overhead, maybe
>> much less? Real time on build times seems like a tiny difference, but
>> hackbench shows 4%.
>
> Yes, the performance penalty of STACKLEAK differs a lot depending on the
> kind of load. Do you have any idea which test can give a bigger slowdown?
> It should be some rapid syscall exhausting the kernel stack hard.

I can't think of anything off the top of my head. You could play with
CONFIG_FRAME_WARN[1] and related tools to find a deep call path and
try that?

[1] http://elinux.org/Kernel_Small_Stacks

>> Maybe specifically mention the -0xBEEF value?
>
> Ok. Should I create some macro for it?

Maybe? It's not really clear how useful that might be. If it's easy,
then yeah, use a common macro for the value, if it creates header
soup, leave it open-coded.

>> I would follow the naming of the others, and call this GCC_PLUGIN_STACKLEAK
>
> It seems to me that GCC_PLUGIN_STACKLEAK is not a right name since the whole
> feature consists of two parts: the arch-specific asm code actually cleaning
> the kernel stack and the gcc plugin which helps to do it faster and more
> reliable. What do you think?

It looks like the feature requires the plugin, so I think the common
naming (GCC_PLUGIN_STACKLEAK) would be preferred. But perhaps I'm
overlooking something where the plugin is not used?

Thanks!

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [kernel-hardening] [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the kernel stack at the end of syscalls
  2017-06-20 19:07   ` Kees Cook
@ 2017-06-20 20:22     ` Hector Martin "marcan"
  0 siblings, 0 replies; 30+ messages in thread
From: Hector Martin "marcan" @ 2017-06-20 20:22 UTC (permalink / raw)
  To: Kees Cook
  Cc: Alexander Popov, kernel-hardening, PaX Team, Brad Spengler,
	Tycho Andersen

On 2017-06-21 04:07, Kees Cook wrote:
> On Tue, Jun 20, 2017 at 2:06 AM, Hector Martin "marcan"
> <marcan@marcan.st> wrote:
>> I know this isn't the first time I've mentioned this, but I think it bears
>> repeating that, as far as I understand the licensing (IANAL), GCC plug-ins
>> licensed as GPLv2 are in violation of the GCC license.
> 
> I continue to think that the wording is unclear and that rational
> people can come to different conclusions. To guide me I try to look at
> the intent of the license ("stay Free Software") and the decisions of
> the author ("this should be GPLv2"). Since the authors are quite clear
> about their rationale, and I don't see anything that appears to be
> trying to dodge the plugin being Free Software, I stand by my earlier
> conclusion[1].

I think this is a somewhat naive view. Even if we ignore the letter of
the license (which I do believe we shouldn't, to keep us out of legal
hot water) and focus on the spirit/intent, what is going on here has a
seriously strange smell.

For one, the GPLv2 license has been chosen *explicitly* to re-target a
licensing scheme intended to prevent usage of GCC with non-free plugins
to, instead, limit usage of the plug-in to only certain kinds of
software (namely the kernel). This was done in order to effectively use
the free plug-in as an advertisement for a (AIUI) GPLv3-licensed "full"
version that is (AIUI) distributed under a wrapper contract that
attempts to discourage redistribution (i.e. exercising the user's rights
under the license) under threat of contract discontinuation. Whether
this is legal or not (and I believe it is, at least this part of the
story), it's certainly not what the GCC authors had in mind when they
developed the plug-in interface and very much violates the intent of the
libgcc exception mechanism (which is just to discourage proprietary
plug-ins).

Additionally, the simple fact that the plug-in is not GPLv3-compatible
means it cannot be upstreamed into GCC itself. This prevents GCC itself
from benefiting from this usage of its plug-in API, and is also clearly
not what the GCC authors intend.

Ultimately, the GCC authors laid out a simple rule with their license
choice: "keep plug-ins GPLv3-compatible", and this plug-in violates that
rule - worse, does so in order to explicitly limit users' freedom. I
think it's easy to argue that this violates the spirit of free software
and the intent of the FSF.

For what it's worth, I did ask a GCC developer and he agrees with my
analysis (though obviously this is a single data point and does not
represent the view of the FSF's lawyers).

-- 
Hector Martin "marcan" (marcan@marcan.st)
Public Key: https://mrcn.st/pub

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [kernel-hardening] Re: [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the kernel stack at the end of syscalls
  2017-06-20 19:11       ` Kees Cook
@ 2017-06-21  9:24         ` Mark Rutland
  2017-06-21 15:54           ` Laura Abbott
  0 siblings, 1 reply; 30+ messages in thread
From: Mark Rutland @ 2017-06-21  9:24 UTC (permalink / raw)
  To: Kees Cook
  Cc: Laura Abbott, Alexander Popov, kernel-hardening, PaX Team,
	Brad Spengler, Tycho Andersen, Ard Biesheuvel

On Tue, Jun 20, 2017 at 12:11:56PM -0700, Kees Cook wrote:
> On Tue, Jun 20, 2017 at 4:20 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> > On Tue, Jun 13, 2017 at 02:51:59PM -0700, Laura Abbott wrote:
> >> On 06/09/2017 10:28 AM, Kees Cook wrote:
> >> > It seems like it shouldn't be too hard to add on-user-return erasure
> >> > code to other architectures too.
> >>
> >> I played around getting this to compile for arm64 with a dummy
> >> stack clearing function. arm64 is doing something special with the
> >> efistub so it fails to link with
> >>
> >> drivers/firmware/efi/libstub/arm-stub.c:45:(.init.text+0x54): relocation
> >>       truncated to fit: R_AARCH64_CALL26 against undefined symbol `__efistub_track_stack'
> >>
> >> The relocation to the .init.text section and appending __efistub happens after
> >> compilation so the checks in the plugin itself don't work. I haven't come up
> >> with a solution to not have the plugin run on the stub yet.
> >
> > Can we do something like what we do for KCOV, and (only) place the
> > plugin-invoking flags to into CFLAGS_STACKLEAK, which we can filter out
> > in scripts/Makefile.lib?
> 
> That seems fine to me, though plugins have tended to provide a
> DISABLE_foo_bar_PLUGIN export (like used in ppc for turning off the
> latent entropy plugin):
> 
> scripts/Makefile.gcc-plugins:    DISABLE_LATENT_ENTROPY_PLUGIN +=
> -fplugin-arg-latent_entropy_plugin-disable
> 
> arch/powerpc/kernel/Makefile:CFLAGS_cputable.o +=
> $(DISABLE_LATENT_ENTROPY_PLUGIN)
> arch/powerpc/kernel/Makefile:CFLAGS_prom_init.o +=
> $(DISABLE_LATENT_ENTROPY_PLUGIN)
> ...

Interesting. I hadn't encountered that style before.

I'd seen the per-makefile config disables:

drivers/firmware/efi/libstub/Makefile:GCOV_PROFILE                      := n
drivers/firmware/efi/libstub/Makefile:KASAN_SANITIZE                    := n
drivers/firmware/efi/libstub/Makefile:UBSAN_SANITIZE                    := n
drivers/firmware/efi/libstub/Makefile:KCOV_INSTRUMENT                   := n

... and per-object config disables:

arch/arm64/mm/Makefile:KASAN_SANITIZE_physaddr.o        += n
arch/arm64/mm/Makefile:KASAN_SANITIZE_kasan_init.o      := n

... which AFAICT are both handled by the standard scripts/Makefile.lib idiom:

ifeq ($(CONFIG_KASAN),y)
_c_flags += $(if $(patsubst n%,, \
                $(KASAN_SANITIZE_$(basetarget).o)$(KASAN_SANITIZE)y), \
                $(CFLAGS_KASAN))
endif

... it seems a shame for plugins to follow a different pattern.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [kernel-hardening] Re: [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the kernel stack at the end of syscalls
  2017-06-21  9:24         ` Mark Rutland
@ 2017-06-21 15:54           ` Laura Abbott
  2017-07-10 22:04             ` [kernel-hardening] [RFC][PATCH 0/2] draft of stack clearing for arm64 Laura Abbott
  0 siblings, 1 reply; 30+ messages in thread
From: Laura Abbott @ 2017-06-21 15:54 UTC (permalink / raw)
  To: Mark Rutland, Kees Cook
  Cc: Alexander Popov, kernel-hardening, PaX Team, Brad Spengler,
	Tycho Andersen, Ard Biesheuvel

On 06/21/2017 02:24 AM, Mark Rutland wrote:
> On Tue, Jun 20, 2017 at 12:11:56PM -0700, Kees Cook wrote:
>> On Tue, Jun 20, 2017 at 4:20 AM, Mark Rutland <mark.rutland@arm.com> wrote:
>>> On Tue, Jun 13, 2017 at 02:51:59PM -0700, Laura Abbott wrote:
>>>> On 06/09/2017 10:28 AM, Kees Cook wrote:
>>>>> It seems like it shouldn't be too hard to add on-user-return erasure
>>>>> code to other architectures too.
>>>>
>>>> I played around getting this to compile for arm64 with a dummy
>>>> stack clearing function. arm64 is doing something special with the
>>>> efistub so it fails to link with
>>>>
>>>> drivers/firmware/efi/libstub/arm-stub.c:45:(.init.text+0x54): relocation
>>>>       truncated to fit: R_AARCH64_CALL26 against undefined symbol `__efistub_track_stack'
>>>>
>>>> The relocation to the .init.text section and appending __efistub happens after
>>>> compilation so the checks in the plugin itself don't work. I haven't come up
>>>> with a solution to not have the plugin run on the stub yet.
>>>
>>> Can we do something like what we do for KCOV, and (only) place the
>>> plugin-invoking flags to into CFLAGS_STACKLEAK, which we can filter out
>>> in scripts/Makefile.lib?
>>
>> That seems fine to me, though plugins have tended to provide a
>> DISABLE_foo_bar_PLUGIN export (like used in ppc for turning off the
>> latent entropy plugin):
>>
>> scripts/Makefile.gcc-plugins:    DISABLE_LATENT_ENTROPY_PLUGIN +=
>> -fplugin-arg-latent_entropy_plugin-disable
>>
>> arch/powerpc/kernel/Makefile:CFLAGS_cputable.o +=
>> $(DISABLE_LATENT_ENTROPY_PLUGIN)
>> arch/powerpc/kernel/Makefile:CFLAGS_prom_init.o +=
>> $(DISABLE_LATENT_ENTROPY_PLUGIN)
>> ...
> 
> Interesting. I hadn't encountered that style before.
> 
> I'd seen the per-makefile config disables:
> 
> drivers/firmware/efi/libstub/Makefile:GCOV_PROFILE                      := n
> drivers/firmware/efi/libstub/Makefile:KASAN_SANITIZE                    := n
> drivers/firmware/efi/libstub/Makefile:UBSAN_SANITIZE                    := n
> drivers/firmware/efi/libstub/Makefile:KCOV_INSTRUMENT                   := n
> 
> ... and per-object config disables:
> 
> arch/arm64/mm/Makefile:KASAN_SANITIZE_physaddr.o        += n
> arch/arm64/mm/Makefile:KASAN_SANITIZE_kasan_init.o      := n
> 
> ... which AFAICT are both handled by the standard scripts/Makefile.lib idiom:
> 
> ifeq ($(CONFIG_KASAN),y)
> _c_flags += $(if $(patsubst n%,, \
>                 $(KASAN_SANITIZE_$(basetarget).o)$(KASAN_SANITIZE)y), \
>                 $(CFLAGS_KASAN))
> endif
> 
> ... it seems a shame for plugins to follow a different pattern.
> 
> Thanks,
> Mark.
> 

That's what I tried to do as a first attempt but I got bogged down
in Makefiles and figuring out how to not pass the plugin flags to
specific files. Agreed it would be good to have the plugins follow
a similar pattern.

Thanks,
Laura

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [kernel-hardening] Re: [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the kernel stack at the end of syscalls
  2017-06-09 17:28 ` [kernel-hardening] " Kees Cook
  2017-06-09 23:00   ` Alexander Popov
  2017-06-13 21:51   ` Laura Abbott
@ 2017-06-23 22:48   ` Tycho Andersen
  2017-06-29 21:33     ` Kees Cook
  2 siblings, 1 reply; 30+ messages in thread
From: Tycho Andersen @ 2017-06-23 22:48 UTC (permalink / raw)
  To: Kees Cook; +Cc: Alexander Popov, kernel-hardening, PaX Team, Brad Spengler

Hi Kees,

On Fri, Jun 09, 2017 at 10:28:39AM -0700, Kees Cook wrote:
> Since this is mostly an anti-exposure defense, LKDTM is probably not a
> good match (i.e. organizing a test for the uninit variable case can be
> very fragile). I think something similar to test_user_copy.c would be
> better.

I think parts of it make sense, e.g. testing that the BUG() in
check_alloca() is hit (see the patch below). It would be nice to do
some end-to-end testing of a syscall on this, though. For that to work
in a kernel module, we'd need to be able to execute a syscall, which
I've not been able to get to work (but also seems... strange).

One option is to write a kernel module that exposes some device that
we could do an ioctl(fd, IO_CHECK_STACK_POISON, pid) or something to
check it, but it's not clear how to fit this into the kernel's current
testing infrastructure.

Thoughts?

Thanks,

Tycho


>From 1a5013cdc8f1520a0b220fe92047817a68e0be21 Mon Sep 17 00:00:00 2001
From: Tycho Andersen <tycho@docker.com>
Date: Thu, 8 Jun 2017 12:43:07 -0600
Subject: [PATCH] lkdtm: add a test for STACKLEAK plugin

This test does two things: it checks that the current syscall's stack (i.e.
the call that's loading the module) is poisoned correctly and then checks
that an alloca that will be too large causes a BUG().

Ideally we'd be able to check end-to-end that a syscall results in an
entirely poisoned stack, but I'm not sure how to do a syscall from lkdtm.

Signed-off-by: Tycho Andersen <tycho@docker.com>
---
 drivers/misc/Makefile          |  1 +
 drivers/misc/lkdtm.h           |  3 ++
 drivers/misc/lkdtm_core.c      |  1 +
 drivers/misc/lkdtm_stackleak.c | 79 ++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 84 insertions(+)

diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index 81ef3e67acc9..805e4f06011a 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -61,6 +61,7 @@ lkdtm-$(CONFIG_LKDTM)		+= lkdtm_heap.o
 lkdtm-$(CONFIG_LKDTM)		+= lkdtm_perms.o
 lkdtm-$(CONFIG_LKDTM)		+= lkdtm_rodata_objcopy.o
 lkdtm-$(CONFIG_LKDTM)		+= lkdtm_usercopy.o
+lkdtm-$(CONFIG_LKDTM)		+= lkdtm_stackleak.o
 
 KCOV_INSTRUMENT_lkdtm_rodata.o	:= n
 
diff --git a/drivers/misc/lkdtm.h b/drivers/misc/lkdtm.h
index 3b4976396ec4..f497c3df1d44 100644
--- a/drivers/misc/lkdtm.h
+++ b/drivers/misc/lkdtm.h
@@ -64,4 +64,7 @@ void lkdtm_USERCOPY_STACK_FRAME_FROM(void);
 void lkdtm_USERCOPY_STACK_BEYOND(void);
 void lkdtm_USERCOPY_KERNEL(void);
 
+/* lkdtm_stackleak.c */
+void lkdtm_CHECK_STACKLEAK(void);
+
 #endif
diff --git a/drivers/misc/lkdtm_core.c b/drivers/misc/lkdtm_core.c
index 42d2b8e31e6b..0808bf1b37a8 100644
--- a/drivers/misc/lkdtm_core.c
+++ b/drivers/misc/lkdtm_core.c
@@ -235,6 +235,7 @@ struct crashtype crashtypes[] = {
 	CRASHTYPE(USERCOPY_STACK_FRAME_FROM),
 	CRASHTYPE(USERCOPY_STACK_BEYOND),
 	CRASHTYPE(USERCOPY_KERNEL),
+	CRASHTYPE(CHECK_STACKLEAK),
 };
 
 
diff --git a/drivers/misc/lkdtm_stackleak.c b/drivers/misc/lkdtm_stackleak.c
new file mode 100644
index 000000000000..6c343be488db
--- /dev/null
+++ b/drivers/misc/lkdtm_stackleak.c
@@ -0,0 +1,79 @@
+#include "lkdtm.h"
+
+#include <linux/sched.h>
+
+static bool check_poison(unsigned long *ptr, unsigned long n)
+{
+	unsigned long i;
+
+	for (i = 1; i < n; i++) {
+		pr_info("%lu %p: %lx\n", i, ptr-i, *(ptr - i));
+		if (*(ptr - i) != -0xbeefL)
+			return false;
+	}
+
+	return true;
+}
+
+static bool check_my_stack(void)
+{
+	char *lowest;
+	unsigned long i, left, check;
+
+	lowest = (char *) (&i + 1);
+	if (current->thread.lowest_stack < (unsigned long) lowest)
+		lowest = (char *) current->thread.lowest_stack;
+
+	left = ((unsigned long) lowest) % THREAD_SIZE;
+
+	/* See note in arch/x86/entry/entry_64.S about the or. */
+	left = left - 2 * sizeof(unsigned long);
+
+	for (i = 0; i < left; i++) {
+		unsigned long *cur = (void *) lowest - i;
+
+		if (*cur == -0xbeefL &&
+				(left - i < 16 || check_poison(cur, 16)))
+			break;
+	}
+
+	if ((left - i) % sizeof(unsigned long))
+		pr_warn("found unaligned stack poison?\n");
+
+	check = (left - i) / sizeof(unsigned long);
+	if (check_poison((unsigned long *) (lowest - i), check))
+		pr_info("current stack poisoned correctly\n");
+	else
+		pr_err("current stack not poisoned correctly\n");
+
+	return true;
+}
+
+static noinline bool do_alloca(unsigned long size)
+{
+	char buf[size];
+
+	/* so this doesn't get inlined or optimized out */
+	snprintf(buf, size, "hello world\n");
+	return true;
+}
+
+static void big_alloca(void)
+{
+	char base;
+	unsigned long left;
+
+	left = ((unsigned long) &base) % THREAD_SIZE;
+
+	pr_info("attempting large alloca of %lu\n", left);
+	do_alloca(left);
+	pr_warn("alloca succeded?\n");
+}
+
+void lkdtm_CHECK_STACKLEAK(void)
+{
+	if (!check_my_stack())
+		return;
+
+	big_alloca();
+}
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [kernel-hardening] Re: [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the kernel stack at the end of syscalls
  2017-06-23 22:48   ` [kernel-hardening] Re: [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the kernel stack at the end of syscalls Tycho Andersen
@ 2017-06-29 21:33     ` Kees Cook
  2017-06-29 22:13       ` Tycho Andersen
  0 siblings, 1 reply; 30+ messages in thread
From: Kees Cook @ 2017-06-29 21:33 UTC (permalink / raw)
  To: Tycho Andersen; +Cc: Alexander Popov, kernel-hardening, PaX Team, Brad Spengler

On Fri, Jun 23, 2017 at 3:48 PM, Tycho Andersen <tycho@docker.com> wrote:
> On Fri, Jun 09, 2017 at 10:28:39AM -0700, Kees Cook wrote:
>> Since this is mostly an anti-exposure defense, LKDTM is probably not a
>> good match (i.e. organizing a test for the uninit variable case can be
>> very fragile). I think something similar to test_user_copy.c would be
>> better.
>
> I think parts of it make sense, e.g. testing that the BUG() in
> check_alloca() is hit (see the patch below). It would be nice to do
> some end-to-end testing of a syscall on this, though. For that to work
> in a kernel module, we'd need to be able to execute a syscall, which
> I've not been able to get to work (but also seems... strange).

Yeah, that portion could be done (as in your patch here).

> One option is to write a kernel module that exposes some device that
> we could do an ioctl(fd, IO_CHECK_STACK_POISON, pid) or something to
> check it, but it's not clear how to fit this into the kernel's current
> testing infrastructure.

Hmm, so one pid would do deep syscall, return and then spin in
userspace so its stack could be examined by another thread? I'm
lacking imagination about how to wire that kind of thing up, though,
as you say.

> From 1a5013cdc8f1520a0b220fe92047817a68e0be21 Mon Sep 17 00:00:00 2001
> From: Tycho Andersen <tycho@docker.com>
> Date: Thu, 8 Jun 2017 12:43:07 -0600
> Subject: [PATCH] lkdtm: add a test for STACKLEAK plugin
>
> This test does two things: it checks that the current syscall's stack (i.e.
> the call that's loading the module) is poisoned correctly and then checks
> that an alloca that will be too large causes a BUG().
>
> Ideally we'd be able to check end-to-end that a syscall results in an
> entirely poisoned stack, but I'm not sure how to do a syscall from lkdtm.

I like this. I think it needs some tweaking, notes below...

> Signed-off-by: Tycho Andersen <tycho@docker.com>
> ---
>  drivers/misc/Makefile          |  1 +
>  drivers/misc/lkdtm.h           |  3 ++
>  drivers/misc/lkdtm_core.c      |  1 +
>  drivers/misc/lkdtm_stackleak.c | 79 ++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 84 insertions(+)
>
> diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
> index 81ef3e67acc9..805e4f06011a 100644
> --- a/drivers/misc/Makefile
> +++ b/drivers/misc/Makefile
> @@ -61,6 +61,7 @@ lkdtm-$(CONFIG_LKDTM)         += lkdtm_heap.o
>  lkdtm-$(CONFIG_LKDTM)          += lkdtm_perms.o
>  lkdtm-$(CONFIG_LKDTM)          += lkdtm_rodata_objcopy.o
>  lkdtm-$(CONFIG_LKDTM)          += lkdtm_usercopy.o
> +lkdtm-$(CONFIG_LKDTM)          += lkdtm_stackleak.o
>
>  KCOV_INSTRUMENT_lkdtm_rodata.o := n
>
> diff --git a/drivers/misc/lkdtm.h b/drivers/misc/lkdtm.h
> index 3b4976396ec4..f497c3df1d44 100644
> --- a/drivers/misc/lkdtm.h
> +++ b/drivers/misc/lkdtm.h
> @@ -64,4 +64,7 @@ void lkdtm_USERCOPY_STACK_FRAME_FROM(void);
>  void lkdtm_USERCOPY_STACK_BEYOND(void);
>  void lkdtm_USERCOPY_KERNEL(void);
>
> +/* lkdtm_stackleak.c */
> +void lkdtm_CHECK_STACKLEAK(void);
> +
>  #endif
> diff --git a/drivers/misc/lkdtm_core.c b/drivers/misc/lkdtm_core.c
> index 42d2b8e31e6b..0808bf1b37a8 100644
> --- a/drivers/misc/lkdtm_core.c
> +++ b/drivers/misc/lkdtm_core.c
> @@ -235,6 +235,7 @@ struct crashtype crashtypes[] = {
>         CRASHTYPE(USERCOPY_STACK_FRAME_FROM),
>         CRASHTYPE(USERCOPY_STACK_BEYOND),
>         CRASHTYPE(USERCOPY_KERNEL),
> +       CRASHTYPE(CHECK_STACKLEAK),
>  };
>
>
> diff --git a/drivers/misc/lkdtm_stackleak.c b/drivers/misc/lkdtm_stackleak.c
> new file mode 100644
> index 000000000000..6c343be488db
> --- /dev/null
> +++ b/drivers/misc/lkdtm_stackleak.c
> @@ -0,0 +1,79 @@

This file should have a comment describing its purpose, etc.

> +#include "lkdtm.h"
> +
> +#include <linux/sched.h>
> +
> +static bool check_poison(unsigned long *ptr, unsigned long n)
> +{
> +       unsigned long i;
> +
> +       for (i = 1; i < n; i++) {
> +               pr_info("%lu %p: %lx\n", i, ptr-i, *(ptr - i));
> +               if (*(ptr - i) != -0xbeefL)

We really need a macro for -0xBEEFL for sure now :P

> +                       return false;

Maybe only print out the values if they're a mismatch?

> +       }
> +
> +       return true;
> +}
> +
> +static bool check_my_stack(void)
> +{
> +       char *lowest;
> +       unsigned long i, left, check;
> +
> +       lowest = (char *) (&i + 1);
> +       if (current->thread.lowest_stack < (unsigned long) lowest)
> +               lowest = (char *) current->thread.lowest_stack;
> +
> +       left = ((unsigned long) lowest) % THREAD_SIZE;
> +
> +       /* See note in arch/x86/entry/entry_64.S about the or. */

I wasn't able to find this note. Which did you mean?

> +       left = left - 2 * sizeof(unsigned long);

Can we make this not arch-specific? (I realize the stackleak code is
x86 only at the moment.)

> +
> +       for (i = 0; i < left; i++) {
> +               unsigned long *cur = (void *) lowest - i;
> +
> +               if (*cur == -0xbeefL &&
> +                               (left - i < 16 || check_poison(cur, 16)))
> +                       break;
> +       }
> +
> +       if ((left - i) % sizeof(unsigned long))
> +               pr_warn("found unaligned stack poison?\n");
> +
> +       check = (left - i) / sizeof(unsigned long);
> +       if (check_poison((unsigned long *) (lowest - i), check))
> +               pr_info("current stack poisoned correctly\n");
> +       else
> +               pr_err("current stack not poisoned correctly\n");
> +
> +       return true;
> +}
> +
> +static noinline bool do_alloca(unsigned long size)
> +{
> +       char buf[size];
> +
> +       /* so this doesn't get inlined or optimized out */
> +       snprintf(buf, size, "hello world\n");
> +       return true;
> +}
> +
> +static void big_alloca(void)
> +{
> +       char base;
> +       unsigned long left;
> +
> +       left = ((unsigned long) &base) % THREAD_SIZE;
> +

Right here I'd add a small alloca just to check both working and
non-working cases.

> +       pr_info("attempting large alloca of %lu\n", left);
> +       do_alloca(left);
> +       pr_warn("alloca succeded?\n");
> +}
> +
> +void lkdtm_CHECK_STACKLEAK(void)
> +{
> +       if (!check_my_stack())
> +               return;

This function only ever returns true?

> +
> +       big_alloca();
> +}
> --
> 2.11.0
>

Cool, this would be a nice addition to the stackleak plugin series!

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [kernel-hardening] Re: [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the kernel stack at the end of syscalls
  2017-06-29 21:33     ` Kees Cook
@ 2017-06-29 22:13       ` Tycho Andersen
  0 siblings, 0 replies; 30+ messages in thread
From: Tycho Andersen @ 2017-06-29 22:13 UTC (permalink / raw)
  To: Kees Cook; +Cc: Alexander Popov, kernel-hardening, PaX Team, Brad Spengler

On Thu, Jun 29, 2017 at 02:33:20PM -0700, Kees Cook wrote:
> > One option is to write a kernel module that exposes some device that
> > we could do an ioctl(fd, IO_CHECK_STACK_POISON, pid) or something to
> > check it, but it's not clear how to fit this into the kernel's current
> > testing infrastructure.
> 
> Hmm, so one pid would do deep syscall, return and then spin in
> userspace so its stack could be examined by another thread? I'm
> lacking imagination about how to wire that kind of thing up, though,
> as you say.

Yeah, that's the idea.

> > From 1a5013cdc8f1520a0b220fe92047817a68e0be21 Mon Sep 17 00:00:00 2001
> > From: Tycho Andersen <tycho@docker.com>
> > Date: Thu, 8 Jun 2017 12:43:07 -0600
> > Subject: [PATCH] lkdtm: add a test for STACKLEAK plugin
> >
> > This test does two things: it checks that the current syscall's stack (i.e.
> > the call that's loading the module) is poisoned correctly and then checks
> > that an alloca that will be too large causes a BUG().
> >
> > Ideally we'd be able to check end-to-end that a syscall results in an
> > entirely poisoned stack, but I'm not sure how to do a syscall from lkdtm.
> 
> I like this. I think it needs some tweaking, notes below...

Thanks for taking a look. I'll take the review and roll it into
another version. FWIW, the poison test seems kind of racy for me right
now (sometimes the stack poison is wrong), and I'm still trying to
figure out why.

Cheers,

Tycho

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [kernel-hardening] [RFC][PATCH 0/2] draft of stack clearing for arm64
  2017-06-21 15:54           ` Laura Abbott
@ 2017-07-10 22:04             ` Laura Abbott
  2017-07-10 22:04               ` [kernel-hardening] [RFC][PATCH 1/2] stackleak: Update " Laura Abbott
                                 ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Laura Abbott @ 2017-07-10 22:04 UTC (permalink / raw)
  To: Kees Cook, Alex Popov
  Cc: Laura Abbott, kernel-hardening, Mark Rutland, Ard Biesheuvel

I made an attempt at implementing stack clearing for arm64 using roughly
the same algorithm as x86. It passes some level of basic tests but it definitely
needs more careful review and thought ("submit early and often").

As an added follow up, self-protection.rst should also be updated with some
details about how stackleak actually works for people who want to follow on
for other arches.

Laura Abbott (2):
  stackleak: Update for arm64
  arm64: Clear the stack

 arch/arm64/Kconfig                     |  1 +
 arch/arm64/include/asm/processor.h     |  3 ++
 arch/arm64/kernel/asm-offsets.c        |  3 ++
 arch/arm64/kernel/entry.S              | 92 ++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/process.c            | 18 +++++++
 drivers/firmware/efi/libstub/Makefile  |  3 +-
 scripts/Makefile.gcc-plugins           |  5 +-
 scripts/gcc-plugins/stackleak_plugin.c | 25 +++++++--
 8 files changed, 143 insertions(+), 7 deletions(-)

-- 
2.7.5

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [kernel-hardening] [RFC][PATCH 1/2] stackleak: Update for arm64
  2017-07-10 22:04             ` [kernel-hardening] [RFC][PATCH 0/2] draft of stack clearing for arm64 Laura Abbott
@ 2017-07-10 22:04               ` Laura Abbott
  2017-07-10 22:04               ` [kernel-hardening] [RFC][PATCH 2/2] arm64: Clear the stack Laura Abbott
  2017-07-11 22:56               ` [kernel-hardening] Re: [RFC][PATCH 0/2] draft of stack clearing for arm64 Alexander Popov
  2 siblings, 0 replies; 30+ messages in thread
From: Laura Abbott @ 2017-07-10 22:04 UTC (permalink / raw)
  To: Kees Cook, Alex Popov
  Cc: Laura Abbott, kernel-hardening, Mark Rutland, Ard Biesheuvel

- The arm64 rtl has another layer of indirection in the RTL
- Drop x86 check model check
- Add disable option

Signed-off-by: Laura Abbott <labbott@redhat.com>
---
This can potentially be folded into the existing patch set if there is
agreement.

I still strongly dislike the disable flag approach but it might be cleaner than
the approach for e.g. KASAN. Each plugin has a unique name and arguments so
there ends up being an explosion of items that need to be filtered out. I'm
also not an expert in Makefiles so it might actually be easier than I'm making
it out to be.
---
 scripts/gcc-plugins/stackleak_plugin.c | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/scripts/gcc-plugins/stackleak_plugin.c b/scripts/gcc-plugins/stackleak_plugin.c
index 2ee49c4..0cf4c5d 100644
--- a/scripts/gcc-plugins/stackleak_plugin.c
+++ b/scripts/gcc-plugins/stackleak_plugin.c
@@ -176,8 +176,14 @@ static unsigned int stackleak_final_execute(void)
 		if (!CALL_P(insn))
 			continue;
 		body = PATTERN(insn);
+		/* arm64 is different */
+		if (GET_CODE(body) == PARALLEL) {
+			body = XEXP(body, 0);
+			body = XEXP(body, 0);
+		}
 		if (GET_CODE(body) != CALL)
 			continue;
+
 		body = XEXP(body, 0);
 		if (GET_CODE(body) != MEM)
 			continue;
@@ -187,7 +193,7 @@ static unsigned int stackleak_final_execute(void)
 //		if (strcmp(XSTR(body, 0), track_function))
 		if (SYMBOL_REF_DECL(body) != track_function_decl)
 			continue;
-//		warning(0, "track_frame_size: %d %ld %d", cfun->calls_alloca, get_frame_size(), track_frame_size);
+		//warning(0, "track_frame_size: %d %ld %d", cfun->calls_alloca, get_frame_size(), track_frame_size);
 		// 2. delete call
 		delete_insn_and_edges(insn);
 #if BUILDING_GCC_VERSION >= 4007
@@ -210,8 +216,10 @@ static bool stackleak_track_stack_gate(void)
 {
 	tree section;
 
+#if 0
 	if (ix86_cmodel != CM_KERNEL)
 		return false;
+#endif
 
 	section = lookup_attribute("section", DECL_ATTRIBUTES(current_function_decl));
 	if (section && TREE_VALUE(section)) {
@@ -277,6 +285,7 @@ static bool stackleak_final_gate(void)
 
 __visible int plugin_init(struct plugin_name_args *plugin_info, struct plugin_gcc_version *version)
 {
+	bool enabled = true;
 	const char * const plugin_name = plugin_info->base_name;
 	const int argc = plugin_info->argc;
 	const struct plugin_argument * const argv = plugin_info->argv;
@@ -330,13 +339,19 @@ __visible int plugin_init(struct plugin_name_args *plugin_info, struct plugin_gc
 			init_locals = true;
 			continue;
 		}
+		if (!(strcmp(argv[i].key, "disable"))) {
+			enabled = false;
+			continue;
+		}
 		error(G_("unknown option '-fplugin-arg-%s-%s'"), plugin_name, argv[i].key);
 	}
 
-	register_callback(plugin_name, PLUGIN_START_UNIT, &stackleak_start_unit, NULL);
-	register_callback(plugin_name, PLUGIN_REGISTER_GGC_ROOTS, NULL, (void *)&gt_ggc_r_gt_stackleak);
-	register_callback(plugin_name, PLUGIN_PASS_MANAGER_SETUP, NULL, &stackleak_tree_instrument_pass_info);
-	register_callback(plugin_name, PLUGIN_PASS_MANAGER_SETUP, NULL, &stackleak_final_pass_info);
+	if (enabled) {
+		register_callback(plugin_name, PLUGIN_START_UNIT, &stackleak_start_unit, NULL);
+		register_callback(plugin_name, PLUGIN_REGISTER_GGC_ROOTS, NULL, (void *)&gt_ggc_r_gt_stackleak);
+		register_callback(plugin_name, PLUGIN_PASS_MANAGER_SETUP, NULL, &stackleak_tree_instrument_pass_info);
+		register_callback(plugin_name, PLUGIN_PASS_MANAGER_SETUP, NULL, &stackleak_final_pass_info);
+	}
 
 	return 0;
 }
-- 
2.7.5

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [kernel-hardening] [RFC][PATCH 2/2] arm64: Clear the stack
  2017-07-10 22:04             ` [kernel-hardening] [RFC][PATCH 0/2] draft of stack clearing for arm64 Laura Abbott
  2017-07-10 22:04               ` [kernel-hardening] [RFC][PATCH 1/2] stackleak: Update " Laura Abbott
@ 2017-07-10 22:04               ` Laura Abbott
  2017-07-11 19:51                 ` [kernel-hardening] " Mark Rutland
  2017-07-11 22:56               ` [kernel-hardening] Re: [RFC][PATCH 0/2] draft of stack clearing for arm64 Alexander Popov
  2 siblings, 1 reply; 30+ messages in thread
From: Laura Abbott @ 2017-07-10 22:04 UTC (permalink / raw)
  To: Kees Cook, Alex Popov
  Cc: Laura Abbott, kernel-hardening, Mark Rutland, Ard Biesheuvel

Implementation of stackleak based heavily on the x86 version

Signed-off-by: Laura Abbott <labbott@redhat.com>
---
The biggest points that need review here:
- Is the extra offsetting (subtracting/adding from the stack) correct?
- Where else do we need to clear the stack?
- The assembly can almost certainly be optimized. I tried to keep the
  behavior of the x86 'repe scasq' and the like where possible. I'm also a
  terrible register allocator.
---
 arch/arm64/Kconfig                    |  1 +
 arch/arm64/include/asm/processor.h    |  3 ++
 arch/arm64/kernel/asm-offsets.c       |  3 ++
 arch/arm64/kernel/entry.S             | 92 +++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/process.c           | 18 +++++++
 drivers/firmware/efi/libstub/Makefile |  3 +-
 scripts/Makefile.gcc-plugins          |  5 +-
 7 files changed, 123 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 8addb85..0b65bfc 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -17,6 +17,7 @@ config ARM64
 	select ARCH_HAS_KCOV
 	select ARCH_HAS_SET_MEMORY
 	select ARCH_HAS_SG_CHAIN
+	select ARCH_HAS_STACKLEAK
 	select ARCH_HAS_STRICT_KERNEL_RWX
 	select ARCH_HAS_STRICT_MODULE_RWX
 	select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 64c9e78..76f2738 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -88,6 +88,9 @@ struct thread_struct {
 	unsigned long		fault_address;	/* fault info */
 	unsigned long		fault_code;	/* ESR_EL1 value */
 	struct debug_info	debug;		/* debugging */
+#ifdef CONFIG_STACKLEAK
+	unsigned long           lowest_stack;
+#endif
 };
 
 #ifdef CONFIG_COMPAT
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index b3bb7ef..e0a5ae2 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -43,6 +43,9 @@ int main(void)
   DEFINE(TSK_TI_TTBR0,		offsetof(struct task_struct, thread_info.ttbr0));
 #endif
   DEFINE(TSK_STACK,		offsetof(struct task_struct, stack));
+#ifdef CONFIG_STACKLEAK
+  DEFINE(TSK_TI_LOWEST_STACK,	offsetof(struct task_struct, thread.lowest_stack));
+#endif
   BLANK();
   DEFINE(THREAD_CPU_CONTEXT,	offsetof(struct task_struct, thread.cpu_context));
   BLANK();
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index b738880..e573633 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -744,6 +744,7 @@ ENDPROC(cpu_switch_to)
  */
 ret_fast_syscall:
 	disable_irq				// disable interrupts
+	bl	erase_kstack
 	str	x0, [sp, #S_X0]			// returned x0
 	ldr	x1, [tsk, #TSK_TI_FLAGS]	// re-check for syscall tracing
 	and	x2, x1, #_TIF_SYSCALL_WORK
@@ -772,6 +773,7 @@ work_pending:
  */
 ret_to_user:
 	disable_irq				// disable interrupts
+	bl	erase_kstack
 	ldr	x1, [tsk, #TSK_TI_FLAGS]
 	and	x2, x1, #_TIF_WORK_MASK
 	cbnz	x2, work_pending
@@ -865,3 +867,93 @@ ENTRY(sys_rt_sigreturn_wrapper)
 	mov	x0, sp
 	b	sys_rt_sigreturn
 ENDPROC(sys_rt_sigreturn_wrapper)
+
+#ifdef CONFIG_STACKLEAK
+ENTRY(erase_kstack)
+	/* save x0 for the fast path */
+	mov	x10, x0
+
+	get_thread_info	x0
+	ldr	x1, [x0, #TSK_TI_LOWEST_STACK]
+
+	/* get the number of bytes to check for lowest stack */
+	mov	x3, x1
+	and	x3, x3, #THREAD_SIZE - 1
+	lsr	x3, x3, #3
+
+	/* now generate the start of the stack */
+	mov	x4, sp
+	movn	x2, #THREAD_SIZE - 1
+	and	x1, x4, x2
+
+	mov	x2, #-0xBEEF	/* stack poison */
+
+	cmp     x3, #0
+	b.eq    4f /* Found nothing, go poison */
+
+1:
+	ldr     x4, [x1, x3, lsl #3]
+	cmp     x4, x2  /* did we find the poison? */
+	b.eq    2f /* yes we did, go check it */
+
+	sub     x3, x3, #1
+	cmp     x3, #0
+	b.eq    4f /* Found nothing, go poison */
+	b       1b /* loop again */
+
+2:
+	cmp     x3, #16
+	b.ls    4f /* Go poison if there are less than 16 things left? */
+
+	mov     x3, #16
+3:
+	ldr     x4, [x1, x3, lsl #3]
+	cmp     x4, x2  /* did we find the poison? */
+	b.ne    1b /* nope we have to check deeper */
+
+	sub     x3, x3, #1
+	cmp     x3, #0
+	b.eq    4f /* Found nothing, go poison */
+	b       3b /* loop again */
+
+	/* The poison function */
+4:
+	/* Generate the address we found */
+	add     x5, x1, x3, lsl #3
+	orr     x5, x5, #16
+
+	mov	x4, sp
+	/* subtrace the current pointer */
+	sub     x8, x4, x5
+
+	/* subtract one more so we don't touch the current sp */
+	sub	x8, x8, #1
+
+	/* sanity check */
+	cmp     x8, #THREAD_SIZE
+	b.lo    5f
+999:
+	b       999b
+
+5:
+	lsr     x8, x8, #3
+	mov     x3, x8
+6:
+	cmp     x3, #0
+	b.eq    7f
+
+	str     x2, [x1, x3, lsl #3]
+	sub     x3, x3, #1
+	b	6b
+
+	/* Reset the lowest stack to the top of the stack */
+7:
+	ldr	x1, [x0, TSK_STACK]
+	add	x1, x1, #THREAD_SIZE
+	sub	x1, x1, #256
+	str	x1, [x0, #TSK_TI_LOWEST_STACK]
+
+	mov	x0, x10
+	ret
+ENDPROC(erase_kstack)
+#endif
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 659ae80..1b6cca2 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -293,6 +293,12 @@ int copy_thread(unsigned long clone_flags, unsigned long stack_start,
 	p->thread.cpu_context.pc = (unsigned long)ret_from_fork;
 	p->thread.cpu_context.sp = (unsigned long)childregs;
 
+#ifdef CONFIG_STACKLEAK
+	p->thread.lowest_stack = (unsigned long)task_stack_page(p) +
+					2 * sizeof(unsigned long);
+#endif
+
+
 	ptrace_hw_copy_thread(p);
 
 	return 0;
@@ -417,3 +423,15 @@ unsigned long arch_randomize_brk(struct mm_struct *mm)
 	else
 		return randomize_page(mm->brk, SZ_1G);
 }
+
+#ifdef CONFIG_STACKLEAK
+void __used check_alloca(unsigned long size)
+{
+	unsigned long sp = (unsigned long)&sp, stack_left;
+
+	/* all kernel stacks are of the same size */
+	stack_left = sp & (THREAD_SIZE - 1);
+	BUG_ON(stack_left < 256 || size >= stack_left - 256);
+}
+EXPORT_SYMBOL(check_alloca);
+#endif
diff --git a/drivers/firmware/efi/libstub/Makefile b/drivers/firmware/efi/libstub/Makefile
index f742596..cb378fa 100644
--- a/drivers/firmware/efi/libstub/Makefile
+++ b/drivers/firmware/efi/libstub/Makefile
@@ -18,7 +18,8 @@ cflags-$(CONFIG_EFI_ARMSTUB)	+= -I$(srctree)/scripts/dtc/libfdt
 
 KBUILD_CFLAGS			:= $(cflags-y) -DDISABLE_BRANCH_PROFILING \
 				   $(call cc-option,-ffreestanding) \
-				   $(call cc-option,-fno-stack-protector)
+				   $(call cc-option,-fno-stack-protector) \
+				   $(DISABLE_STACKLEAK_PLUGIN)
 
 GCOV_PROFILE			:= n
 KASAN_SANITIZE			:= n
diff --git a/scripts/Makefile.gcc-plugins b/scripts/Makefile.gcc-plugins
index 5c86f64..0eb8dbc 100644
--- a/scripts/Makefile.gcc-plugins
+++ b/scripts/Makefile.gcc-plugins
@@ -35,11 +35,14 @@ ifdef CONFIG_GCC_PLUGINS
 
   gcc-plugin-$(CONFIG_STACKLEAK)	+= stackleak_plugin.so
   gcc-plugin-cflags-$(CONFIG_STACKLEAK)	+= -DSTACKLEAK_PLUGIN -fplugin-arg-stackleak_plugin-track-lowest-sp=100
+  ifdef CONFIG_STACKLEAK
+    DISABLE_STACKLEAK_PLUGIN		+= -fplugin-arg-stackleak_plugin-disable
+  endif
 
   GCC_PLUGINS_CFLAGS := $(strip $(addprefix -fplugin=$(objtree)/scripts/gcc-plugins/, $(gcc-plugin-y)) $(gcc-plugin-cflags-y))
 
   export PLUGINCC GCC_PLUGINS_CFLAGS GCC_PLUGIN GCC_PLUGIN_SUBDIR
-  export SANCOV_PLUGIN DISABLE_LATENT_ENTROPY_PLUGIN
+  export SANCOV_PLUGIN DISABLE_LATENT_ENTROPY_PLUGIN DISABLE_STACKLEAK_PLUGIN
 
   ifneq ($(PLUGINCC),)
     # SANCOV_PLUGIN can be only in CFLAGS_KCOV because avoid duplication.
-- 
2.7.5

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [kernel-hardening] Re: [RFC][PATCH 2/2] arm64: Clear the stack
  2017-07-10 22:04               ` [kernel-hardening] [RFC][PATCH 2/2] arm64: Clear the stack Laura Abbott
@ 2017-07-11 19:51                 ` Mark Rutland
  2017-07-11 20:04                   ` Mark Rutland
                                     ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Mark Rutland @ 2017-07-11 19:51 UTC (permalink / raw)
  To: Laura Abbott; +Cc: Kees Cook, Alex Popov, kernel-hardening, Ard Biesheuvel

Hi Laura,

On Mon, Jul 10, 2017 at 03:04:43PM -0700, Laura Abbott wrote:
> Implementation of stackleak based heavily on the x86 version

Nice!

> Signed-off-by: Laura Abbott <labbott@redhat.com>
> ---
> The biggest points that need review here:
> - Is the extra offsetting (subtracting/adding from the stack) correct?

I think it's a little off; more on that below.

> - Where else do we need to clear the stack?

I guess we might need to clear (all of the remainder of) the stack after
invoking EFI runtime services -- those can run in task context, might
leave sensitive values on the stack, and they're uninstrumented. The
same would apply for x86.

I think we can ignore garbage left on the stack by idle/hotplug, since
that happens in the idle thread, so we shouldn't be doing uaccess
transfers on those stacks.

> - The assembly can almost certainly be optimized. I tried to keep the
>   behavior of the x86 'repe scasq' and the like where possible. I'm also a
>   terrible register allocator.

I tried to steer clear of code golf here, but I didn't entirely manage.

I also don't know x86 asm, so it's very possible I've just managed to
confuse myself.

> ---
>  arch/arm64/Kconfig                    |  1 +
>  arch/arm64/include/asm/processor.h    |  3 ++
>  arch/arm64/kernel/asm-offsets.c       |  3 ++
>  arch/arm64/kernel/entry.S             | 92 +++++++++++++++++++++++++++++++++++
>  arch/arm64/kernel/process.c           | 18 +++++++
>  drivers/firmware/efi/libstub/Makefile |  3 +-
>  scripts/Makefile.gcc-plugins          |  5 +-
>  7 files changed, 123 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 8addb85..0b65bfc 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -17,6 +17,7 @@ config ARM64
>  	select ARCH_HAS_KCOV
>  	select ARCH_HAS_SET_MEMORY
>  	select ARCH_HAS_SG_CHAIN
> +	select ARCH_HAS_STACKLEAK
>  	select ARCH_HAS_STRICT_KERNEL_RWX
>  	select ARCH_HAS_STRICT_MODULE_RWX
>  	select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
> diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
> index 64c9e78..76f2738 100644
> --- a/arch/arm64/include/asm/processor.h
> +++ b/arch/arm64/include/asm/processor.h
> @@ -88,6 +88,9 @@ struct thread_struct {
>  	unsigned long		fault_address;	/* fault info */
>  	unsigned long		fault_code;	/* ESR_EL1 value */
>  	struct debug_info	debug;		/* debugging */
> +#ifdef CONFIG_STACKLEAK
> +	unsigned long           lowest_stack;
> +#endif
>  };
>  
>  #ifdef CONFIG_COMPAT
> diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
> index b3bb7ef..e0a5ae2 100644
> --- a/arch/arm64/kernel/asm-offsets.c
> +++ b/arch/arm64/kernel/asm-offsets.c
> @@ -43,6 +43,9 @@ int main(void)
>    DEFINE(TSK_TI_TTBR0,		offsetof(struct task_struct, thread_info.ttbr0));
>  #endif
>    DEFINE(TSK_STACK,		offsetof(struct task_struct, stack));
> +#ifdef CONFIG_STACKLEAK
> +  DEFINE(TSK_TI_LOWEST_STACK,	offsetof(struct task_struct, thread.lowest_stack));
> +#endif
>    BLANK();
>    DEFINE(THREAD_CPU_CONTEXT,	offsetof(struct task_struct, thread.cpu_context));
>    BLANK();
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index b738880..e573633 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -744,6 +744,7 @@ ENDPROC(cpu_switch_to)
>   */
>  ret_fast_syscall:
>  	disable_irq				// disable interrupts
> +	bl	erase_kstack
>  	str	x0, [sp, #S_X0]			// returned x0
>  	ldr	x1, [tsk, #TSK_TI_FLAGS]	// re-check for syscall tracing
>  	and	x2, x1, #_TIF_SYSCALL_WORK
> @@ -772,6 +773,7 @@ work_pending:
>   */
>  ret_to_user:
>  	disable_irq				// disable interrupts
> +	bl	erase_kstack
>  	ldr	x1, [tsk, #TSK_TI_FLAGS]
>  	and	x2, x1, #_TIF_WORK_MASK
>  	cbnz	x2, work_pending
> @@ -865,3 +867,93 @@ ENTRY(sys_rt_sigreturn_wrapper)
>  	mov	x0, sp
>  	b	sys_rt_sigreturn
>  ENDPROC(sys_rt_sigreturn_wrapper)
> +
> +#ifdef CONFIG_STACKLEAK
> +ENTRY(erase_kstack)
> +	/* save x0 for the fast path */
> +	mov	x10, x0
> +
> +	get_thread_info	x0
> +	ldr	x1, [x0, #TSK_TI_LOWEST_STACK]
> +
> +	/* get the number of bytes to check for lowest stack */
> +	mov	x3, x1
> +	and	x3, x3, #THREAD_SIZE - 1
> +	lsr	x3, x3, #3
> +
> +	/* now generate the start of the stack */
> +	mov	x4, sp
> +	movn	x2, #THREAD_SIZE - 1
> +	and	x1, x4, x2
> +
> +	mov	x2, #-0xBEEF	/* stack poison */
> +
> +	cmp     x3, #0
> +	b.eq    4f /* Found nothing, go poison */
> +
> +1:
> +	ldr     x4, [x1, x3, lsl #3]
> +	cmp     x4, x2  /* did we find the poison? */
> +	b.eq    2f /* yes we did, go check it */
> +
> +	sub     x3, x3, #1
> +	cmp     x3, #0
> +	b.eq    4f /* Found nothing, go poison */
> +	b       1b /* loop again */
> +
> +2:

IIUC, at this point, x1 was the start (lowest addr) of the stack, and x3
was the quadword index of the first poison on that stack.

The x86 asm implicitly held that [x1, x3, lsl #3] address in RDI, but we
don't have a copy...

> +	cmp     x3, #16
> +	b.ls    4f /* Go poison if there are less than 16 things left? */
> +
> +	mov     x3, #16

... and here we blat the index without saving it, meaning we always jump
to close to the start of the stack, which I don't think was intentional.

So then we fall though to the below...

> +3:
> +	ldr     x4, [x1, x3, lsl #3]
> +	cmp     x4, x2  /* did we find the poison? */
> +	b.ne    1b /* nope we have to check deeper */
> +
> +	sub     x3, x3, #1
> +	cmp     x3, #0
> +	b.eq    4f /* Found nothing, go poison */
> +	b       3b /* loop again */

... where we either:

* find 16 contiguous poison entries, leaving x3 == 0, or:

* we immediately find a non-poison value, and jump back to 1b. If there
  are 16 non-poison values, we're left with x3 == 0, otherwise we bail
  out and jump to 4f with x3 in the interval [0,15].

.... or I've completely confused myself, as suggested above.

We might not have x86's fancy string instructions, but we could do
something equally impenetrable:

	mov	x5, #0
1:
	cbz	x3, 4f
	ldr	x4, [x1, x3, lsl #3]
	cmp	x4, x2
	csinc	x5, xzr, x5, ne
	tbz	x5, #4, 4f		// found 16 poisons?
	sub	x3, x3, #1
	b	1b

That doesn't stop 16 slots early, so it can return any value of x3 all
the way down to zero.

> +
> +	/* The poison function */
> +4:
> +	/* Generate the address we found */
> +	add     x5, x1, x3, lsl #3
> +	orr     x5, x5, #16

Have you figured out what the forced 16 byte offset is for?

I've not managed to figure out why it's there, and it looks like
Alexander couldn't either, judging by his comments in the x86 asm.

> +
> +	mov	x4, sp
> +	/* subtrace the current pointer */
> +	sub     x8, x4, x5
> +
> +	/* subtract one more so we don't touch the current sp */
> +	sub	x8, x8, #1
> +
> +	/* sanity check */
> +	cmp     x8, #THREAD_SIZE
> +	b.lo    5f
> +999:
> +	b       999b
> +
> +5:
> +	lsr     x8, x8, #3
> +	mov     x3, x8
> +6:
> +	cmp     x3, #0
> +	b.eq    7f
> +
> +	str     x2, [x1, x3, lsl #3]
> +	sub     x3, x3, #1
> +	b	6b
> +
> +	/* Reset the lowest stack to the top of the stack */
> +7:
> +	ldr	x1, [x0, TSK_STACK]
> +	add	x1, x1, #THREAD_SIZE
> +	sub	x1, x1, #256
> +	str	x1, [x0, #TSK_TI_LOWEST_STACK]

I take it this is the offsetting you were querying?

I don't think it's quite right. Our stack looks like:

+---+ <- task_stack_page(p) + THREAD_SIZE
|   |
+---+ <- task_stack_page(p) + THREAD_START_SP
|   |
|   |
+---+ <- task_pt_regs(p)
|   |
|   |
|   |
~~~~~

~~~~~
|   |
|   |
|   |
+---+ <- task_stack_page(p)

At the point we return to userspace, sp == task_pt_regs(p).

Judging by a generated asm-offsets.h, sizeof(struct_pt_regs) is 304
bytes currently. THREAD_SIZE - THREAD_START_SP == 16.

We probably want to give that 16 a mnemonic (e.g FRAME_PADDING), and
have something like:

	ldr     x1, [x0, TSK_STACK]
	add	x1, x1, #THREAD_SIZE
	sub	x1, x1, #(S_FRAME_SIZE + FRAME_PADDING)
	str	x1, [x0, #TSK_TI_LOWEST_STACK]

> +
> +	mov	x0, x10
> +	ret
> +ENDPROC(erase_kstack)
> +#endif
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index 659ae80..1b6cca2 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -293,6 +293,12 @@ int copy_thread(unsigned long clone_flags, unsigned long stack_start,
>  	p->thread.cpu_context.pc = (unsigned long)ret_from_fork;
>  	p->thread.cpu_context.sp = (unsigned long)childregs;
>  
> +#ifdef CONFIG_STACKLEAK
> +	p->thread.lowest_stack = (unsigned long)task_stack_page(p) +
> +					2 * sizeof(unsigned long);
> +#endif

I see this follows the x86 logic, but I don't see why it's necessary to
offset this end of the stack.

Do you have an idea as to what this is for?

> +
> +
>  	ptrace_hw_copy_thread(p);
>  
>  	return 0;
> @@ -417,3 +423,15 @@ unsigned long arch_randomize_brk(struct mm_struct *mm)
>  	else
>  		return randomize_page(mm->brk, SZ_1G);
>  }
> +
> +#ifdef CONFIG_STACKLEAK
> +void __used check_alloca(unsigned long size)
> +{
> +	unsigned long sp = (unsigned long)&sp, stack_left;

Nit: please use current_stack_pointer.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [kernel-hardening] Re: [RFC][PATCH 2/2] arm64: Clear the stack
  2017-07-11 19:51                 ` [kernel-hardening] " Mark Rutland
@ 2017-07-11 20:04                   ` Mark Rutland
  2017-07-12  6:01                   ` Alexander Popov
  2017-07-14 20:51                   ` Laura Abbott
  2 siblings, 0 replies; 30+ messages in thread
From: Mark Rutland @ 2017-07-11 20:04 UTC (permalink / raw)
  To: Laura Abbott; +Cc: Kees Cook, Alex Popov, kernel-hardening, Ard Biesheuvel

On Tue, Jul 11, 2017 at 08:51:55PM +0100, Mark Rutland wrote:
> On Mon, Jul 10, 2017 at 03:04:43PM -0700, Laura Abbott wrote:
> > +	/* Reset the lowest stack to the top of the stack */
> > +7:
> > +	ldr	x1, [x0, TSK_STACK]
> > +	add	x1, x1, #THREAD_SIZE
> > +	sub	x1, x1, #256
> > +	str	x1, [x0, #TSK_TI_LOWEST_STACK]
> 
> I take it this is the offsetting you were querying?
> 
> I don't think it's quite right. Our stack looks like:
> 
> +---+ <- task_stack_page(p) + THREAD_SIZE
> |   |
> +---+ <- task_stack_page(p) + THREAD_START_SP
> |   |
> |   |
> +---+ <- task_pt_regs(p)
> |   |
> |   |
> |   |
> ~~~~~
> 
> ~~~~~
> |   |
> |   |
> |   |
> +---+ <- task_stack_page(p)
> 
> At the point we return to userspace, sp == task_pt_regs(p).
> 
> Judging by a generated asm-offsets.h, sizeof(struct_pt_regs) is 304
> bytes currently. THREAD_SIZE - THREAD_START_SP == 16.
> 
> We probably want to give that 16 a mnemonic (e.g FRAME_PADDING), and
> have something like:
> 
> 	ldr     x1, [x0, TSK_STACK]
> 	add	x1, x1, #THREAD_SIZE
> 	sub	x1, x1, #(S_FRAME_SIZE + FRAME_PADDING)
> 	str	x1, [x0, #TSK_TI_LOWEST_STACK]

Thinking about it, given that sp == task_pt_regs(p), we could just do:

	mov	x1, sp
	str     x1, [x0, #TSK_TI_LOWEST_STACK]

... unless I've managed to lose the plot here.

Mark.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [kernel-hardening] Re: [RFC][PATCH 0/2] draft of stack clearing for arm64
  2017-07-10 22:04             ` [kernel-hardening] [RFC][PATCH 0/2] draft of stack clearing for arm64 Laura Abbott
  2017-07-10 22:04               ` [kernel-hardening] [RFC][PATCH 1/2] stackleak: Update " Laura Abbott
  2017-07-10 22:04               ` [kernel-hardening] [RFC][PATCH 2/2] arm64: Clear the stack Laura Abbott
@ 2017-07-11 22:56               ` Alexander Popov
  2 siblings, 0 replies; 30+ messages in thread
From: Alexander Popov @ 2017-07-11 22:56 UTC (permalink / raw)
  To: Laura Abbott, Kees Cook
  Cc: kernel-hardening, Mark Rutland, Ard Biesheuvel, pageexec,
	Brad Spengler, Tycho Andersen

Hello Laura,

On 11.07.2017 01:04, Laura Abbott wrote:
> I made an attempt at implementing stack clearing for arm64 using roughly
> the same algorithm as x86. It passes some level of basic tests but it definitely
> needs more careful review and thought ("submit early and often").

Thank you for joining that fun!

I will send the third version of the STACKLEAK patch shortly, please see the
changes. I'm going to CC x86@kernel.org. Hope they will have a look at the stack
erasing, and we will gain better understanding.

> As an added follow up, self-protection.rst should also be updated with some
> details about how stackleak actually works for people who want to follow on
> for other arches.

Thanks, I've added this to the TODO.

Best regards,
Alexander

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [kernel-hardening] Re: [RFC][PATCH 2/2] arm64: Clear the stack
  2017-07-11 19:51                 ` [kernel-hardening] " Mark Rutland
  2017-07-11 20:04                   ` Mark Rutland
@ 2017-07-12  6:01                   ` Alexander Popov
  2017-07-14 20:51                   ` Laura Abbott
  2 siblings, 0 replies; 30+ messages in thread
From: Alexander Popov @ 2017-07-12  6:01 UTC (permalink / raw)
  To: Mark Rutland, Laura Abbott; +Cc: Kees Cook, kernel-hardening, Ard Biesheuvel

Hello Mark,

On 11.07.2017 22:51, Mark Rutland wrote:
> On Mon, Jul 10, 2017 at 03:04:43PM -0700, Laura Abbott wrote:
>> - Where else do we need to clear the stack?
> 
> I guess we might need to clear (all of the remainder of) the stack after
> invoking EFI runtime services -- those can run in task context, might
> leave sensitive values on the stack, and they're uninstrumented. The
> same would apply for x86.

Thanks, I've added this to the TODO list.

> I think we can ignore garbage left on the stack by idle/hotplug, since
> that happens in the idle thread, so we shouldn't be doing uaccess
> transfers on those stacks.

Excuse me, I didn't understand what you mean. erase_kstack() is called at the
end of syscall before returning to the userspace.

Best regards,
Alexander

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [kernel-hardening] Re: [RFC][PATCH 2/2] arm64: Clear the stack
  2017-07-11 19:51                 ` [kernel-hardening] " Mark Rutland
  2017-07-11 20:04                   ` Mark Rutland
  2017-07-12  6:01                   ` Alexander Popov
@ 2017-07-14 20:51                   ` Laura Abbott
  2017-07-21 16:56                     ` Alexander Popov
  2 siblings, 1 reply; 30+ messages in thread
From: Laura Abbott @ 2017-07-14 20:51 UTC (permalink / raw)
  To: Mark Rutland; +Cc: Kees Cook, Alex Popov, kernel-hardening, Ard Biesheuvel

On 07/11/2017 12:51 PM, Mark Rutland wrote:
> Hi Laura,
> 
> On Mon, Jul 10, 2017 at 03:04:43PM -0700, Laura Abbott wrote:
>> Implementation of stackleak based heavily on the x86 version
> 
> Nice!
> 
>> Signed-off-by: Laura Abbott <labbott@redhat.com>
>> ---
>> The biggest points that need review here:
>> - Is the extra offsetting (subtracting/adding from the stack) correct?
> 
> I think it's a little off; more on that below.
> 
>> - Where else do we need to clear the stack?
> 
> I guess we might need to clear (all of the remainder of) the stack after
> invoking EFI runtime services -- those can run in task context, might
> leave sensitive values on the stack, and they're uninstrumented. The
> same would apply for x86.
> 
> I think we can ignore garbage left on the stack by idle/hotplug, since
> that happens in the idle thread, so we shouldn't be doing uaccess
> transfers on those stacks.
> 
>> - The assembly can almost certainly be optimized. I tried to keep the
>>   behavior of the x86 'repe scasq' and the like where possible. I'm also a
>>   terrible register allocator.
> 
> I tried to steer clear of code golf here, but I didn't entirely manage.
> 
> I also don't know x86 asm, so it's very possible I've just managed to
> confuse myself.
> 

I know enough x86 asm to confuse myself as you can see below

>> ---
>>  arch/arm64/Kconfig                    |  1 +
>>  arch/arm64/include/asm/processor.h    |  3 ++
>>  arch/arm64/kernel/asm-offsets.c       |  3 ++
>>  arch/arm64/kernel/entry.S             | 92 +++++++++++++++++++++++++++++++++++
>>  arch/arm64/kernel/process.c           | 18 +++++++
>>  drivers/firmware/efi/libstub/Makefile |  3 +-
>>  scripts/Makefile.gcc-plugins          |  5 +-
>>  7 files changed, 123 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index 8addb85..0b65bfc 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -17,6 +17,7 @@ config ARM64
>>  	select ARCH_HAS_KCOV
>>  	select ARCH_HAS_SET_MEMORY
>>  	select ARCH_HAS_SG_CHAIN
>> +	select ARCH_HAS_STACKLEAK
>>  	select ARCH_HAS_STRICT_KERNEL_RWX
>>  	select ARCH_HAS_STRICT_MODULE_RWX
>>  	select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
>> diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
>> index 64c9e78..76f2738 100644
>> --- a/arch/arm64/include/asm/processor.h
>> +++ b/arch/arm64/include/asm/processor.h
>> @@ -88,6 +88,9 @@ struct thread_struct {
>>  	unsigned long		fault_address;	/* fault info */
>>  	unsigned long		fault_code;	/* ESR_EL1 value */
>>  	struct debug_info	debug;		/* debugging */
>> +#ifdef CONFIG_STACKLEAK
>> +	unsigned long           lowest_stack;
>> +#endif
>>  };
>>  
>>  #ifdef CONFIG_COMPAT
>> diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
>> index b3bb7ef..e0a5ae2 100644
>> --- a/arch/arm64/kernel/asm-offsets.c
>> +++ b/arch/arm64/kernel/asm-offsets.c
>> @@ -43,6 +43,9 @@ int main(void)
>>    DEFINE(TSK_TI_TTBR0,		offsetof(struct task_struct, thread_info.ttbr0));
>>  #endif
>>    DEFINE(TSK_STACK,		offsetof(struct task_struct, stack));
>> +#ifdef CONFIG_STACKLEAK
>> +  DEFINE(TSK_TI_LOWEST_STACK,	offsetof(struct task_struct, thread.lowest_stack));
>> +#endif
>>    BLANK();
>>    DEFINE(THREAD_CPU_CONTEXT,	offsetof(struct task_struct, thread.cpu_context));
>>    BLANK();
>> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
>> index b738880..e573633 100644
>> --- a/arch/arm64/kernel/entry.S
>> +++ b/arch/arm64/kernel/entry.S
>> @@ -744,6 +744,7 @@ ENDPROC(cpu_switch_to)
>>   */
>>  ret_fast_syscall:
>>  	disable_irq				// disable interrupts
>> +	bl	erase_kstack
>>  	str	x0, [sp, #S_X0]			// returned x0
>>  	ldr	x1, [tsk, #TSK_TI_FLAGS]	// re-check for syscall tracing
>>  	and	x2, x1, #_TIF_SYSCALL_WORK
>> @@ -772,6 +773,7 @@ work_pending:
>>   */
>>  ret_to_user:
>>  	disable_irq				// disable interrupts
>> +	bl	erase_kstack
>>  	ldr	x1, [tsk, #TSK_TI_FLAGS]
>>  	and	x2, x1, #_TIF_WORK_MASK
>>  	cbnz	x2, work_pending
>> @@ -865,3 +867,93 @@ ENTRY(sys_rt_sigreturn_wrapper)
>>  	mov	x0, sp
>>  	b	sys_rt_sigreturn
>>  ENDPROC(sys_rt_sigreturn_wrapper)
>> +
>> +#ifdef CONFIG_STACKLEAK
>> +ENTRY(erase_kstack)
>> +	/* save x0 for the fast path */
>> +	mov	x10, x0
>> +
>> +	get_thread_info	x0
>> +	ldr	x1, [x0, #TSK_TI_LOWEST_STACK]
>> +
>> +	/* get the number of bytes to check for lowest stack */
>> +	mov	x3, x1
>> +	and	x3, x3, #THREAD_SIZE - 1
>> +	lsr	x3, x3, #3
>> +
>> +	/* now generate the start of the stack */
>> +	mov	x4, sp
>> +	movn	x2, #THREAD_SIZE - 1
>> +	and	x1, x4, x2
>> +
>> +	mov	x2, #-0xBEEF	/* stack poison */
>> +
>> +	cmp     x3, #0
>> +	b.eq    4f /* Found nothing, go poison */
>> +
>> +1:
>> +	ldr     x4, [x1, x3, lsl #3]
>> +	cmp     x4, x2  /* did we find the poison? */
>> +	b.eq    2f /* yes we did, go check it */
>> +
>> +	sub     x3, x3, #1
>> +	cmp     x3, #0
>> +	b.eq    4f /* Found nothing, go poison */
>> +	b       1b /* loop again */
>> +
>> +2:
> 
> IIUC, at this point, x1 was the start (lowest addr) of the stack, and x3
> was the quadword index of the first poison on that stack.
> 
> The x86 asm implicitly held that [x1, x3, lsl #3] address in RDI, but we
> don't have a copy...
> 
>> +	cmp     x3, #16
>> +	b.ls    4f /* Go poison if there are less than 16 things left? */
>> +
>> +	mov     x3, #16
> 
> ... and here we blat the index without saving it, meaning we always jump
> to close to the start of the stack, which I don't think was intentional.
> 

Urgh. You are right. I had an index register when I first
started out and then a different bug caused me to erroneously
remove it.

> So then we fall though to the below...
> 
>> +3:
>> +	ldr     x4, [x1, x3, lsl #3]
>> +	cmp     x4, x2  /* did we find the poison? */
>> +	b.ne    1b /* nope we have to check deeper */
>> +
>> +	sub     x3, x3, #1
>> +	cmp     x3, #0
>> +	b.eq    4f /* Found nothing, go poison */
>> +	b       3b /* loop again */
> 
> ... where we either:
> 
> * find 16 contiguous poison entries, leaving x3 == 0, or:
> 
> * we immediately find a non-poison value, and jump back to 1b. If there
>   are 16 non-poison values, we're left with x3 == 0, otherwise we bail
>   out and jump to 4f with x3 in the interval [0,15].
> 
> .... or I've completely confused myself, as suggested above.
> 
> We might not have x86's fancy string instructions, but we could do
> something equally impenetrable:
> 
> 	mov	x5, #0
> 1:
> 	cbz	x3, 4f
> 	ldr	x4, [x1, x3, lsl #3]
> 	cmp	x4, x2
> 	csinc	x5, xzr, x5, ne
> 	tbz	x5, #4, 4f		// found 16 poisons?
> 	sub	x3, x3, #1
> 	b	1b
> 
> That doesn't stop 16 slots early, so it can return any value of x3 all
> the way down to zero.
> 

Seems to work for my testing.

>> +
>> +	/* The poison function */
>> +4:
>> +	/* Generate the address we found */
>> +	add     x5, x1, x3, lsl #3
>> +	orr     x5, x5, #16
> 
> Have you figured out what the forced 16 byte offset is for?
> 
> I've not managed to figure out why it's there, and it looks like
> Alexander couldn't either, judging by his comments in the x86 asm.
>

>From get_wchan in arch/x86/kernel/process.c, it might be be to
account for the start of the frame correctly? I don't have a
definitive answer though and plan on just removing this for the
next version.
 
>> >> +	mov	x4, sp
>> +	/* subtrace the current pointer */
>> +	sub     x8, x4, x5
>> +
>> +	/* subtract one more so we don't touch the current sp */
>> +	sub	x8, x8, #1
>> +
>> +	/* sanity check */
>> +	cmp     x8, #THREAD_SIZE
>> +	b.lo    5f
>> +999:
>> +	b       999b
>> +
>> +5:
>> +	lsr     x8, x8, #3
>> +	mov     x3, x8
>> +6:
>> +	cmp     x3, #0
>> +	b.eq    7f
>> +
>> +	str     x2, [x1, x3, lsl #3]
>> +	sub     x3, x3, #1
>> +	b	6b
>> +
>> +	/* Reset the lowest stack to the top of the stack */
>> +7:
>> +	ldr	x1, [x0, TSK_STACK]
>> +	add	x1, x1, #THREAD_SIZE
>> +	sub	x1, x1, #256
>> +	str	x1, [x0, #TSK_TI_LOWEST_STACK]
> 
> I take it this is the offsetting you were querying?
> 
> I don't think it's quite right. Our stack looks like:
> 
> +---+ <- task_stack_page(p) + THREAD_SIZE
> |   |
> +---+ <- task_stack_page(p) + THREAD_START_SP
> |   |
> |   |
> +---+ <- task_pt_regs(p)
> |   |
> |   |
> |   |
> ~~~~~
> 
> ~~~~~
> |   |
> |   |
> |   |
> +---+ <- task_stack_page(p)
> 
> At the point we return to userspace, sp == task_pt_regs(p).
> 
> Judging by a generated asm-offsets.h, sizeof(struct_pt_regs) is 304
> bytes currently. THREAD_SIZE - THREAD_START_SP == 16.
> 
> We probably want to give that 16 a mnemonic (e.g FRAME_PADDING), and
> have something like:
> 
> 	ldr     x1, [x0, TSK_STACK]
> 	add	x1, x1, #THREAD_SIZE
> 	sub	x1, x1, #(S_FRAME_SIZE + FRAME_PADDING)
> 	str	x1, [x0, #TSK_TI_LOWEST_STACK]
> 

Yes, that looks cleaner. I suspect that even though we weren't
subtracting enough, it still worked because the lowest stack
would get overwritten in track_stack later.

>> +
>> +	mov	x0, x10
>> +	ret
>> +ENDPROC(erase_kstack)
>> +#endif
>> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
>> index 659ae80..1b6cca2 100644
>> --- a/arch/arm64/kernel/process.c
>> +++ b/arch/arm64/kernel/process.c
>> @@ -293,6 +293,12 @@ int copy_thread(unsigned long clone_flags, unsigned long stack_start,
>>  	p->thread.cpu_context.pc = (unsigned long)ret_from_fork;
>>  	p->thread.cpu_context.sp = (unsigned long)childregs;
>>  
>> +#ifdef CONFIG_STACKLEAK
>> +	p->thread.lowest_stack = (unsigned long)task_stack_page(p) +
>> +					2 * sizeof(unsigned long);
>> +#endif
> 
> I see this follows the x86 logic, but I don't see why it's necessary to
> offset this end of the stack.
> 
> Do you have an idea as to what this is for?
> 

Again, no and I think I'll again remove it.

>> +
>> +
>>  	ptrace_hw_copy_thread(p);
>>  
>>  	return 0;
>> @@ -417,3 +423,15 @@ unsigned long arch_randomize_brk(struct mm_struct *mm)
>>  	else
>>  		return randomize_page(mm->brk, SZ_1G);
>>  }
>> +
>> +#ifdef CONFIG_STACKLEAK
>> +void __used check_alloca(unsigned long size)
>> +{
>> +	unsigned long sp = (unsigned long)&sp, stack_left;
> 
> Nit: please use current_stack_pointer.
> 

Ack. This should also be cleaned up in the the track_stack
function as well.

> Thanks,
> Mark.
> 

Thanks,
Laura

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [kernel-hardening] Re: [RFC][PATCH 2/2] arm64: Clear the stack
  2017-07-14 20:51                   ` Laura Abbott
@ 2017-07-21 16:56                     ` Alexander Popov
  2017-07-22  0:23                       ` Laura Abbott
  0 siblings, 1 reply; 30+ messages in thread
From: Alexander Popov @ 2017-07-21 16:56 UTC (permalink / raw)
  To: Laura Abbott, Mark Rutland
  Cc: Kees Cook, kernel-hardening, Ard Biesheuvel, Tycho Andersen, PaX Team

Hello Laura and Mark,

[+ Tycho Andersen and Pax Team]

On 14.07.2017 23:51, Laura Abbott wrote:
> On 07/11/2017 12:51 PM, Mark Rutland wrote:
>> On Mon, Jul 10, 2017 at 03:04:43PM -0700, Laura Abbott wrote:
>>> +	/* The poison function */
>>> +4:
>>> +	/* Generate the address we found */
>>> +	add     x5, x1, x3, lsl #3
>>> +	orr     x5, x5, #16
>>
>> Have you figured out what the forced 16 byte offset is for?
>>
>> I've not managed to figure out why it's there, and it looks like
>> Alexander couldn't either, judging by his comments in the x86 asm.
> 
> From get_wchan in arch/x86/kernel/process.c, it might be be to
> account for the start of the frame correctly? I don't have a
> definitive answer though and plan on just removing this for the
> next version.

I've investigated it carefully: we should not remove this 16-byte offset.

I looked at the bottom of the kernel stack with the debugger and found a
non-zero value 0x57AC6E9D. It is STACK_END_MAGIC, which is defined in
include/uapi/linux/magic.h. This value is checked if we enable
CONFIG_SCHED_STACK_END_CHECK.

Then I removed this 16-byte offset in stackleak patch (including the OR
instruction in erase_kstack()) to see how the first erase_kstack() call happily
overwrites that magic value:

[    1.574655] Freeing unused kernel memory: 1244K
[    1.575026] Kernel memory protection disabled.
[    1.578575] Kernel panic - not syncing: corrupted stack end detected inside
scheduler
[    1.578575]
[    1.579420] CPU: 0 PID: 1 Comm: init Not tainted 4.12.0+ #9
[    1.579420] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
Ubuntu-1.8.2-1ubuntu1 04/01/2014
[    1.579420] Call Trace:
[    1.579420]  dump_stack+0x63/0x81
[    1.579420]  panic+0xd0/0x212
[    1.579420]  __schedule+0x6d8/0x6e0
[    1.579420]  schedule+0x31/0x80
[    1.579420]  io_schedule+0x11/0x40
[    1.579420]  __lock_page_or_retry+0x17d/0x2b0
[    1.579420]  ? page_cache_tree_insert+0x90/0x90
[    1.579420]  filemap_fault+0x3aa/0x5c0
[    1.579420]  ? filemap_map_pages+0x1a6/0x280
[    1.579420]  ext4_filemap_fault+0x2d/0x40
[    1.579420]  __do_fault+0x1b/0x60
[    1.579420]  __handle_mm_fault+0x641/0x8b0
[    1.579420]  handle_mm_fault+0x87/0x130
[    1.579420]  __do_page_fault+0x25f/0x4a0
[    1.579420]  trace_do_page_fault+0x37/0xd0
[    1.579420]  do_async_page_fault+0x23/0x80
[    1.579420]  async_page_fault+0x28/0x30
[    1.579420] RIP: 0033:0x7f81be514ab0
[    1.579420] RSP: 002b:00007ffc8de73768 EFLAGS: 00010202
[    1.579420] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[    1.579420] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007ffc8de73770
[    1.579420] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[    1.579420] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[    1.579420] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[    1.579420] Dumping ftrace buffer:
[    1.579420]    (ftrace buffer empty)
[    1.579420] Kernel Offset: disabled
[    1.579420] Rebooting in 86400 seconds..


So let's keep it not to break CONFIG_SCHED_STACK_END_CHECK.

Again, I want to say kudos to PaX Team for his awesome code.
Best regards,
Alexander

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [kernel-hardening] Re: [RFC][PATCH 2/2] arm64: Clear the stack
  2017-07-21 16:56                     ` Alexander Popov
@ 2017-07-22  0:23                       ` Laura Abbott
  2017-07-24  8:19                         ` Alexander Popov
  0 siblings, 1 reply; 30+ messages in thread
From: Laura Abbott @ 2017-07-22  0:23 UTC (permalink / raw)
  To: alex.popov, Mark Rutland
  Cc: Kees Cook, kernel-hardening, Ard Biesheuvel, Tycho Andersen, PaX Team

On 07/21/2017 09:56 AM, Alexander Popov wrote:
> Hello Laura and Mark,
> 
> [+ Tycho Andersen and Pax Team]
> 
> On 14.07.2017 23:51, Laura Abbott wrote:
>> On 07/11/2017 12:51 PM, Mark Rutland wrote:
>>> On Mon, Jul 10, 2017 at 03:04:43PM -0700, Laura Abbott wrote:
>>>> +	/* The poison function */
>>>> +4:
>>>> +	/* Generate the address we found */
>>>> +	add     x5, x1, x3, lsl #3
>>>> +	orr     x5, x5, #16
>>>
>>> Have you figured out what the forced 16 byte offset is for?
>>>
>>> I've not managed to figure out why it's there, and it looks like
>>> Alexander couldn't either, judging by his comments in the x86 asm.
>>
>> From get_wchan in arch/x86/kernel/process.c, it might be be to
>> account for the start of the frame correctly? I don't have a
>> definitive answer though and plan on just removing this for the
>> next version.
> 
> I've investigated it carefully: we should not remove this 16-byte offset.
> 
> I looked at the bottom of the kernel stack with the debugger and found a
> non-zero value 0x57AC6E9D. It is STACK_END_MAGIC, which is defined in
> include/uapi/linux/magic.h. This value is checked if we enable
> CONFIG_SCHED_STACK_END_CHECK.
> 
> Then I removed this 16-byte offset in stackleak patch (including the OR
> instruction in erase_kstack()) to see how the first erase_kstack() call happily
> overwrites that magic value:
> 
> [    1.574655] Freeing unused kernel memory: 1244K
> [    1.575026] Kernel memory protection disabled.
> [    1.578575] Kernel panic - not syncing: corrupted stack end detected inside
> scheduler
> [    1.578575]
> [    1.579420] CPU: 0 PID: 1 Comm: init Not tainted 4.12.0+ #9
> [    1.579420] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> Ubuntu-1.8.2-1ubuntu1 04/01/2014
> [    1.579420] Call Trace:
> [    1.579420]  dump_stack+0x63/0x81
> [    1.579420]  panic+0xd0/0x212
> [    1.579420]  __schedule+0x6d8/0x6e0
> [    1.579420]  schedule+0x31/0x80
> [    1.579420]  io_schedule+0x11/0x40
> [    1.579420]  __lock_page_or_retry+0x17d/0x2b0
> [    1.579420]  ? page_cache_tree_insert+0x90/0x90
> [    1.579420]  filemap_fault+0x3aa/0x5c0
> [    1.579420]  ? filemap_map_pages+0x1a6/0x280
> [    1.579420]  ext4_filemap_fault+0x2d/0x40
> [    1.579420]  __do_fault+0x1b/0x60
> [    1.579420]  __handle_mm_fault+0x641/0x8b0
> [    1.579420]  handle_mm_fault+0x87/0x130
> [    1.579420]  __do_page_fault+0x25f/0x4a0
> [    1.579420]  trace_do_page_fault+0x37/0xd0
> [    1.579420]  do_async_page_fault+0x23/0x80
> [    1.579420]  async_page_fault+0x28/0x30
> [    1.579420] RIP: 0033:0x7f81be514ab0
> [    1.579420] RSP: 002b:00007ffc8de73768 EFLAGS: 00010202
> [    1.579420] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [    1.579420] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007ffc8de73770
> [    1.579420] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
> [    1.579420] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> [    1.579420] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> [    1.579420] Dumping ftrace buffer:
> [    1.579420]    (ftrace buffer empty)
> [    1.579420] Kernel Offset: disabled
> [    1.579420] Rebooting in 86400 seconds..
> 
> 
> So let's keep it not to break CONFIG_SCHED_STACK_END_CHECK.
> 


That makes sense, good find! I wonder if CONFIG_SCHED_STACK_END_CHECK
should go on the list of hardening options and/or if we can enhance
it somehow?

I'm not sure why it requires two words though since the
poison only seems to be 32-bits?

> Again, I want to say kudos to PaX Team for his awesome code.

Agreed!

> Best regards,
> Alexander
> 

Thanks,
Laura

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [kernel-hardening] Re: [RFC][PATCH 2/2] arm64: Clear the stack
  2017-07-22  0:23                       ` Laura Abbott
@ 2017-07-24  8:19                         ` Alexander Popov
  2017-07-25  3:34                           ` Kees Cook
  0 siblings, 1 reply; 30+ messages in thread
From: Alexander Popov @ 2017-07-24  8:19 UTC (permalink / raw)
  To: Laura Abbott, Mark Rutland
  Cc: Kees Cook, kernel-hardening, Ard Biesheuvel, Tycho Andersen, PaX Team

On 22.07.2017 03:23, Laura Abbott wrote:
> On 07/21/2017 09:56 AM, Alexander Popov wrote:
>> So let's keep it not to break CONFIG_SCHED_STACK_END_CHECK.
> 
> That makes sense, good find! I wonder if CONFIG_SCHED_STACK_END_CHECK
> should go on the list of hardening options and/or if we can enhance
> it somehow?

Do you mean this list?
http://www.kernsec.org/wiki/index.php/Kernel_Self_Protection_Project/Recommended_Settings

> I'm not sure why it requires two words though since the
> poison only seems to be 32-bits?

On x86_64 end_of_stack() returns the pointer to unsigned long, so we need at
least 8 bytes to avoid breaking CONFIG_SCHED_STACK_END_CHECK. Don't know about 8
more bytes.

Best regards,
Alexander

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [kernel-hardening] Re: [RFC][PATCH 2/2] arm64: Clear the stack
  2017-07-24  8:19                         ` Alexander Popov
@ 2017-07-25  3:34                           ` Kees Cook
  2017-08-18  8:07                             ` Alexander Popov
  0 siblings, 1 reply; 30+ messages in thread
From: Kees Cook @ 2017-07-25  3:34 UTC (permalink / raw)
  To: Alexander Popov
  Cc: Laura Abbott, Mark Rutland, kernel-hardening, Ard Biesheuvel,
	Tycho Andersen, PaX Team

On Mon, Jul 24, 2017 at 1:19 AM, Alexander Popov <alex.popov@linux.com> wrote:
> On 22.07.2017 03:23, Laura Abbott wrote:
>> On 07/21/2017 09:56 AM, Alexander Popov wrote:
>>> So let's keep it not to break CONFIG_SCHED_STACK_END_CHECK.
>>
>> That makes sense, good find! I wonder if CONFIG_SCHED_STACK_END_CHECK
>> should go on the list of hardening options and/or if we can enhance
>> it somehow?
>
> Do you mean this list?
> http://www.kernsec.org/wiki/index.php/Kernel_Self_Protection_Project/Recommended_Settings
>
>> I'm not sure why it requires two words though since the
>> poison only seems to be 32-bits?
>
> On x86_64 end_of_stack() returns the pointer to unsigned long, so we need at
> least 8 bytes to avoid breaking CONFIG_SCHED_STACK_END_CHECK. Don't know about 8
> more bytes.

Seems like this should be a random value like the per-frame stack
canary, but it looks like a relatively cheap check. It's probably not
in the cache, though, since most stack operations should be pretty far
from the end of the stack...

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [kernel-hardening] Re: [RFC][PATCH 2/2] arm64: Clear the stack
  2017-07-25  3:34                           ` Kees Cook
@ 2017-08-18  8:07                             ` Alexander Popov
  0 siblings, 0 replies; 30+ messages in thread
From: Alexander Popov @ 2017-08-18  8:07 UTC (permalink / raw)
  To: Kees Cook
  Cc: Laura Abbott, Mark Rutland, kernel-hardening, Ard Biesheuvel,
	Tycho Andersen, PaX Team

Hello Kees and Laura,

On 25.07.2017 06:34, Kees Cook wrote:
> On Mon, Jul 24, 2017 at 1:19 AM, Alexander Popov <alex.popov@linux.com> wrote:
>> On 22.07.2017 03:23, Laura Abbott wrote:
>>> On 07/21/2017 09:56 AM, Alexander Popov wrote:
>>>> So let's keep it not to break CONFIG_SCHED_STACK_END_CHECK.
>>>
>>> That makes sense, good find! I wonder if CONFIG_SCHED_STACK_END_CHECK
>>> should go on the list of hardening options and/or if we can enhance
>>> it somehow?
>>
>> Do you mean this list?
>> http://www.kernsec.org/wiki/index.php/Kernel_Self_Protection_Project/Recommended_Settings
>>
>>> I'm not sure why it requires two words though since the
>>> poison only seems to be 32-bits?
>>
>> On x86_64 end_of_stack() returns the pointer to unsigned long, so we need at
>> least 8 bytes to avoid breaking CONFIG_SCHED_STACK_END_CHECK. Don't know about 8
>> more bytes.
> 
> Seems like this should be a random value like the per-frame stack
> canary, but it looks like a relatively cheap check. It's probably not
> in the cache, though, since most stack operations should be pretty far
> from the end of the stack...

It seems that the idea you describe is not implemented in Grsecurity/PaX patch.

On x86_64 the bottom of the stack for the mainline kernel looks like that:
0xffffc900004c8000: 0x57ac6e9d 0x00000000 0x00000000 0x00000000
0xffffc900004c8010: 0xffff4111 0xffffffff 0xffff4111 0xffffffff
...

But for the Grsecurity kernel it looks like that:
0xffffc90000324000: 0x00000000 0x00000000 0x57ac6e9d 0x00000000
0xffffc90000324010: 0xffff4111 0xffffffff 0xffff4111 0xffffffff
...

Because Grsecurity/PaX patch has that change:
 static inline unsigned long *end_of_stack(const struct task_struct *task)
 {
-	return task->stack;
+	return (unsigned long *)task->stack + 1;
 }

So that is their reason of having 16 bytes reserved at the bottom of the kernel
stack.

However, right now I don't understand the purpose of patching end_of_stack().
What do you think? Should mainline have it?

Best regards,
Alexander

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2017-08-18  8:07 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-09 14:30 [kernel-hardening] [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the kernel stack at the end of syscalls Alexander Popov
2017-06-09 17:28 ` [kernel-hardening] " Kees Cook
2017-06-09 23:00   ` Alexander Popov
2017-06-20 19:20     ` Kees Cook
2017-06-13 21:51   ` Laura Abbott
2017-06-20 11:20     ` Mark Rutland
2017-06-20 14:13       ` Ard Biesheuvel
2017-06-20 19:11       ` Kees Cook
2017-06-21  9:24         ` Mark Rutland
2017-06-21 15:54           ` Laura Abbott
2017-07-10 22:04             ` [kernel-hardening] [RFC][PATCH 0/2] draft of stack clearing for arm64 Laura Abbott
2017-07-10 22:04               ` [kernel-hardening] [RFC][PATCH 1/2] stackleak: Update " Laura Abbott
2017-07-10 22:04               ` [kernel-hardening] [RFC][PATCH 2/2] arm64: Clear the stack Laura Abbott
2017-07-11 19:51                 ` [kernel-hardening] " Mark Rutland
2017-07-11 20:04                   ` Mark Rutland
2017-07-12  6:01                   ` Alexander Popov
2017-07-14 20:51                   ` Laura Abbott
2017-07-21 16:56                     ` Alexander Popov
2017-07-22  0:23                       ` Laura Abbott
2017-07-24  8:19                         ` Alexander Popov
2017-07-25  3:34                           ` Kees Cook
2017-08-18  8:07                             ` Alexander Popov
2017-07-11 22:56               ` [kernel-hardening] Re: [RFC][PATCH 0/2] draft of stack clearing for arm64 Alexander Popov
2017-06-23 22:48   ` [kernel-hardening] Re: [PATCH RFC v2 1/1] gcc-plugins: Add stackleak feature erasing the kernel stack at the end of syscalls Tycho Andersen
2017-06-29 21:33     ` Kees Cook
2017-06-29 22:13       ` Tycho Andersen
2017-06-20  9:06 ` [kernel-hardening] " Hector Martin "marcan"
2017-06-20 19:07   ` Kees Cook
2017-06-20 20:22     ` Hector Martin "marcan"
2017-06-20 19:14 ` [kernel-hardening] " Kees Cook

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.