All of lore.kernel.org
 help / color / mirror / Atom feed
* [kernel-hardening] [PATCH RFC v5 0/5] Introduce the STACKLEAK feature and a test for it
@ 2017-10-22  0:22 Alexander Popov
  2017-10-22  0:22 ` [kernel-hardening] [PATCH RFC v5 1/5] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls Alexander Popov
                   ` (5 more replies)
  0 siblings, 6 replies; 24+ messages in thread
From: Alexander Popov @ 2017-10-22  0:22 UTC (permalink / raw)
  To: kernel-hardening, keescook, pageexec, spender, Ingo Molnar,
	Andy Lutomirski, tycho, Laura Abbott, Mark Rutland,
	Ard Biesheuvel, Borislav Petkov, Thomas Gleixner,
	H . Peter Anvin, Peter Zijlstra, x86, alex.popov

This is the 5th version of the patch series introducing STACKLEAK to the
mainline kernel. STACKLEAK is a security feature developed by Grsecurity/PaX
(kudos to them), which:
 - reduces the information that can be revealed through kernel stack leak bugs;
 - blocks some uninitialized stack variable attacks (e.g. CVE-2010-2963);
 - introduces some runtime checks for kernel stack overflow detection.

STACKLEAK instrumentation statistics
====================================

These numbers were obtained for the 4th version of the patch series.

Size of vmlinux (x86_64_defconfig):
 file size:
  - STACKLEAK disabled: 35014784 bytes
  - STACKLEAK enabled: 35044952 bytes (+0.086%)
 .text section size (calculated by size utility):
  - STACKLEAK disabled: 10752983
  - STACKLEAK enabled: 11062221 (+2.876%)

The readelf utility shows 45602 functions in vmlinux. The STACKLEAK gcc plugin
inserted 36 check_alloca() calls and 1265 track_stack() calls (42274 calls are
inserted during GIMPLE pass and 41009 calls are deleted during RTL pass).
So 2.853% of functions are instrumented.

Further work
=============

 - Think of erasing stack on the way out of exception handlers (idea by
   Andy Lutomirski).
 - Rewrite erase_kstack() in C (if Ingo Molnar insists).
 - Think of erasing the kernel stack after invoking EFI runtime services
   (idea by Mark Rutland).
 - Try to port STACKLEAK to arm64 (Laura Abbott is working on it).

Changes in v5 (mostly according to the feedback from Ingo Molnar)
=================================================================

1. Introduced the CONFIG_STACKLEAK_METRICS providing STACKLEAK information
   about tasks via the /proc file system. That information can be useful for
   estimating the STACKLEAK performance impact for different workloads.
   In particular, /proc/<pid>/lowest_stack shows the current lowest_stack
   value and its final value from the previous syscall.

2. Introduced a single check_alloca() implementation working for both
   x86_64 and x86_32.

3. Fixed coding style issues and did some refactoring in the STACKLEAK
   gcc plugin.

4. Put the gcc plugin and the kernel stack erasing into separate (working)
   patches.

Changes in v4
=============

1. Introduced the CONFIG_STACKLEAK_TRACK_MIN_SIZE parameter instead of
   hard-coded track-lowest-sp.

2. Carefully looked into the assertions in track_stack() and check_alloca().
    - Fixed the incorrect BUG() condition in track_stack(), thanks to Tycho
       Andersen. Also disabled that check if CONFIG_VMAP_STACK is enabled.
    - Fixed the surplus and erroneous code for calculating stack_left in
       check_alloca() on x86_64. That code repeats the work which is already
       done in get_stack_info() and it misses the fact that different
       exception stacks on x86_64 have different size.

3. Introduced two lkdtm tests for the STACKLEAK feature (developed together
   with Tycho Andersen).

4. Added info about STACKLEAK to Documentation/security/self-protection.rst.

5. Improved the comments.

6. Decided not to change "unsigned long sp = (unsigned long)&sp" to
   current_stack_pointer. The original variant is more platform independent
   since current_stack_pointer has different type on x86 and arm.

Changes in v3
=============

1. Added the detailed comments describing the STACKLEAK gcc plugin.
   Read the plugin from the bottom up, like you do for Linux kernel drivers.
   Additional information:
    - gcc internals documentation available from gcc sources;
    - wonderful slides by Diego Novillo
       https://www.airs.com/dnovillo/200711-GCC-Internals/
    - nice introduction to gcc plugins at LWN
       https://lwn.net/Articles/457543/

2. Improved the commit message and Kconfig description according to the
   feedback from Kees Cook. Also added the original notice describing
   the performance impact of the STACKLEAK feature.

3. Removed the arch-specific ix86_cmodel check in stackleak_track_stack_gate().
   It caused skipping the kernel code instrumentation for i386.

4. Fixed a minor mistake in stackleak_tree_instrument_execute().
   First versions of the plugin used ENTRY_BLOCK_PTR_FOR_FN(cfun)->next_bb
   to get the basic block with the function prologue. That was not correct
   since the control flow graph edge from ENTRY_BLOCK_PTR doesn't always
   go to that basic block.

   So later it was changed to single_succ(ENTRY_BLOCK_PTR_FOR_FN(cfun)),
   but not completely. next_bb was still used for entry_bb assignment,
   which could cause the wrong value of the prologue_instrumented variable.

   I've reported this issue to Grsecurity and proposed the fix for it, but
   unfortunately didn't get any reply.

5. Introduced the STACKLEAK_POISON macro and renamed the config option
   according to the feedback from Kees Cook.


Alexander Popov (5):
  x86/entry: Add STACKLEAK erasing the kernel stack at the end of
    syscalls
  gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack
  lkdtm: Add a test for STACKLEAK
  fs/proc: Show STACKLEAK metrics in the /proc file system
  doc: self-protection: Add information about STACKLEAK feature

 Documentation/security/self-protection.rst |  23 +-
 arch/Kconfig                               |  50 +++
 arch/x86/Kconfig                           |   1 +
 arch/x86/entry/common.c                    |  17 +-
 arch/x86/entry/entry_32.S                  |  73 +++++
 arch/x86/entry/entry_64.S                  |  99 ++++++
 arch/x86/entry/entry_64_compat.S           |   8 +
 arch/x86/include/asm/processor.h           |   7 +
 arch/x86/kernel/asm-offsets.c              |  12 +
 arch/x86/kernel/dumpstack.c                |  15 +
 arch/x86/kernel/process_32.c               |   8 +
 arch/x86/kernel/process_64.c               |   8 +
 drivers/misc/Makefile                      |   3 +
 drivers/misc/lkdtm.h                       |   4 +
 drivers/misc/lkdtm_core.c                  |   2 +
 drivers/misc/lkdtm_stackleak.c             | 139 +++++++++
 fs/exec.c                                  |  30 ++
 fs/proc/base.c                             |  14 +
 include/linux/compiler.h                   |   4 +
 scripts/Makefile.gcc-plugins               |   3 +
 scripts/gcc-plugins/stackleak_plugin.c     | 470 +++++++++++++++++++++++++++++
 21 files changed, 978 insertions(+), 12 deletions(-)
 create mode 100644 drivers/misc/lkdtm_stackleak.c
 create mode 100644 scripts/gcc-plugins/stackleak_plugin.c

-- 
2.7.4

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [kernel-hardening] [PATCH RFC v5 1/5] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
  2017-10-22  0:22 [kernel-hardening] [PATCH RFC v5 0/5] Introduce the STACKLEAK feature and a test for it Alexander Popov
@ 2017-10-22  0:22 ` Alexander Popov
  2017-10-23 13:17   ` [kernel-hardening] " Tycho Andersen
  2017-10-22  0:22 ` [kernel-hardening] [PATCH RFC v5 2/5] gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack Alexander Popov
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 24+ messages in thread
From: Alexander Popov @ 2017-10-22  0:22 UTC (permalink / raw)
  To: kernel-hardening, keescook, pageexec, spender, Ingo Molnar,
	Andy Lutomirski, tycho, Laura Abbott, Mark Rutland,
	Ard Biesheuvel, Borislav Petkov, Thomas Gleixner,
	H . Peter Anvin, Peter Zijlstra, x86, alex.popov

The STACKLEAK feature erases the kernel stack before returning from
syscalls. That reduces the information which kernel stack leak bugs can
reveal and blocks some uninitialized stack variable attacks. Moreover,
STACKLEAK provides runtime checks for kernel stack overflow detection.

This commit introduces the architecture-specific code filling the used
part of the kernel stack with a poison value before returning to the
userspace. Full STACKLEAK feature also contains the gcc plugin which
comes in a separate commit.

The STACKLEAK feature is ported from grsecurity/PaX. More information at:
  https://grsecurity.net/
  https://pax.grsecurity.net/

This code is modified from Brad Spengler/PaX Team's code in the last
public patch of grsecurity/PaX based on our understanding of the code.
Changes or omissions from the original code are ours and don't reflect
the original grsecurity/PaX code.

Signed-off-by: Alexander Popov <alex.popov@linux.com>
---
 arch/Kconfig                     | 27 ++++++++++++
 arch/x86/Kconfig                 |  1 +
 arch/x86/entry/common.c          | 17 +++++--
 arch/x86/entry/entry_32.S        | 69 +++++++++++++++++++++++++++++
 arch/x86/entry/entry_64.S        | 95 ++++++++++++++++++++++++++++++++++++++++
 arch/x86/entry/entry_64_compat.S |  8 ++++
 arch/x86/include/asm/processor.h |  4 ++
 arch/x86/kernel/asm-offsets.c    |  9 ++++
 arch/x86/kernel/process_32.c     |  5 +++
 arch/x86/kernel/process_64.c     |  5 +++
 include/linux/compiler.h         |  4 ++
 11 files changed, 241 insertions(+), 3 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index d789a89..e9ec94c 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -386,6 +386,13 @@ config SECCOMP_FILTER
 
 	  See Documentation/prctl/seccomp_filter.txt for details.
 
+config HAVE_ARCH_STACKLEAK
+	bool
+	help
+	  An architecture should select this if it has the code which
+	  fills the used part of the kernel stack with the STACKLEAK_POISON
+	  value before returning from system calls.
+
 config HAVE_GCC_PLUGINS
 	bool
 	help
@@ -516,6 +523,26 @@ config GCC_PLUGIN_RANDSTRUCT_PERFORMANCE
 	  in structures.  This reduces the performance hit of RANDSTRUCT
 	  at the cost of weakened randomization.
 
+config GCC_PLUGIN_STACKLEAK
+	bool "Erase the kernel stack before returning from syscalls"
+	depends on GCC_PLUGINS
+	depends on HAVE_ARCH_STACKLEAK
+	help
+	  This option makes the kernel erase the kernel stack before it
+	  returns from a system call. That reduces the information which
+	  kernel stack leak bugs can reveal and blocks some uninitialized
+	  stack variable attacks. This option also provides runtime checks
+	  for kernel stack overflow detection.
+
+	  The tradeoff is the performance impact: on a single CPU system kernel
+	  compilation sees a 1% slowdown, other systems and workloads may vary
+	  and you are advised to test this feature on your expected workload
+	  before deploying it.
+
+	  This plugin was ported from grsecurity/PaX. More information at:
+	   * https://grsecurity.net/
+	   * https://pax.grsecurity.net/
+
 config HAVE_CC_STACKPROTECTOR
 	bool
 	help
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 971feac..b7da58f 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -114,6 +114,7 @@ config X86
 	select HAVE_ARCH_MMAP_RND_COMPAT_BITS	if MMU && COMPAT
 	select HAVE_ARCH_COMPAT_MMAP_BASES	if MMU && COMPAT
 	select HAVE_ARCH_SECCOMP_FILTER
+	select HAVE_ARCH_STACKLEAK
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD if X86_64
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 03505ff..075487e 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -45,6 +45,12 @@ __visible inline void enter_from_user_mode(void)
 static inline void enter_from_user_mode(void) {}
 #endif
 
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+asmlinkage void erase_kstack(void);
+#else
+static void erase_kstack(void) {}
+#endif
+
 static void do_audit_syscall_entry(struct pt_regs *regs, u32 arch)
 {
 #ifdef CONFIG_X86_64
@@ -81,8 +87,10 @@ static long syscall_trace_enter(struct pt_regs *regs)
 		emulated = true;
 
 	if ((emulated || (work & _TIF_SYSCALL_TRACE)) &&
-	    tracehook_report_syscall_entry(regs))
+	    tracehook_report_syscall_entry(regs)) {
+		erase_kstack();
 		return -1L;
+	}
 
 	if (emulated)
 		return -1L;
@@ -116,9 +124,11 @@ static long syscall_trace_enter(struct pt_regs *regs)
 			sd.args[5] = regs->bp;
 		}
 
-		ret = __secure_computing(&sd);
-		if (ret == -1)
+		ret = secure_computing(&sd);
+		if (ret == -1) {
+			erase_kstack();
 			return ret;
+		}
 	}
 #endif
 
@@ -127,6 +137,7 @@ static long syscall_trace_enter(struct pt_regs *regs)
 
 	do_audit_syscall_entry(regs, arch);
 
+	erase_kstack();
 	return ret ?: regs->orig_ax;
 }
 
diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index 50e0d2b..a7b0c52 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -75,6 +75,71 @@
 #endif
 .endm
 
+.macro erase_kstack
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+	call erase_kstack
+#endif
+.endm
+
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+/* For the detailed comments, see erase_kstack in entry_64.S */
+ENTRY(erase_kstack)
+	pushl	%edi
+	pushl	%ecx
+	pushl	%eax
+	pushl	%ebp
+
+	movl	PER_CPU_VAR(current_task), %ebp
+	mov	TASK_lowest_stack(%ebp), %edi
+	mov	$STACKLEAK_POISON, %eax
+	std
+
+1:
+	mov	%edi, %ecx
+	and	$THREAD_SIZE_asm - 1, %ecx
+	shr	$2, %ecx
+	repne	scasl
+	jecxz	2f
+
+	cmp	$2*16, %ecx
+	jc	2f
+
+	mov	$2*16, %ecx
+	repe	scasl
+	jecxz	2f
+	jne	1b
+
+2:
+	cld
+	or	$2*4, %edi
+	mov	%esp, %ecx
+	sub	%edi, %ecx
+
+	cmp	$THREAD_SIZE_asm, %ecx
+	jb	3f
+	ud2
+
+3:
+	shr	$2, %ecx
+	rep	stosl
+
+	/*
+	 * TODO: sp0 on x86_32 is not reliable, right?
+	 * Doubt because of the definition of cpu_current_top_of_stack
+	 * in arch/x86/kernel/cpu/common.c.
+	 */
+	mov	TASK_thread_sp0(%ebp), %edi
+	sub	$128, %edi
+	mov	%edi, TASK_lowest_stack(%ebp)
+
+	popl	%ebp
+	popl	%eax
+	popl	%ecx
+	popl	%edi
+	ret
+ENDPROC(erase_kstack)
+#endif
+
 /*
  * User gs save/restore
  *
@@ -445,6 +510,8 @@ ENTRY(entry_SYSENTER_32)
 	ALTERNATIVE "testl %eax, %eax; jz .Lsyscall_32_done", \
 		    "jmp .Lsyscall_32_done", X86_FEATURE_XENPV
 
+	erase_kstack
+
 /* Opportunistic SYSEXIT */
 	TRACE_IRQS_ON			/* User mode traces as IRQs on. */
 	movl	PT_EIP(%esp), %edx	/* pt_regs->ip */
@@ -531,6 +598,8 @@ ENTRY(entry_INT80_32)
 	call	do_int80_syscall_32
 .Lsyscall_32_done:
 
+	erase_kstack
+
 restore_all:
 	TRACE_IRQS_IRET
 .Lrestore_all_notrace:
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 4916725..189d843 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -59,6 +59,90 @@ END(native_usergs_sysret64)
 #endif
 .endm
 
+.macro erase_kstack
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+	call erase_kstack
+#endif
+.endm
+
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+ENTRY(erase_kstack)
+	pushq	%rdi
+	pushq	%rcx
+	pushq	%rax
+	pushq	%r11
+
+	movq	PER_CPU_VAR(current_task), %r11
+	mov	TASK_lowest_stack(%r11), %rdi
+	mov	$STACKLEAK_POISON, %rax
+	std
+
+	/*
+	 * Let's search for the poison value in the stack.
+	 * Start from the lowest_stack and go to the bottom (see std above).
+	 */
+1:
+	mov	%edi, %ecx
+	and	$THREAD_SIZE_asm - 1, %ecx
+	shr	$3, %ecx
+	repne	scasq
+	jecxz	2f	/* Didn't find it. Go to poisoning. */
+
+	/*
+	 * Found the poison value in the stack. Go to poisoning if there are
+	 * less than 16 qwords left.
+	 */
+	cmp	$2*8, %ecx
+	jc	2f
+
+	/*
+	 * Check that 16 further qwords contain poison (avoid false positives).
+	 * If so, the part of the stack below the address in %rdi is likely
+	 * to be poisoned. Otherwise we need to search deeper.
+	 */
+	mov	$2*8, %ecx
+	repe	scasq
+	jecxz	2f	/* Poison the upper part of the stack. */
+	jne	1b	/* Search deeper. */
+
+2:
+	/*
+	 * Prepare the counter for poisoning the kernel stack between
+	 * %rdi and %rsp. Two qwords at the bottom of the stack are reserved
+	 * and should not be poisoned (see CONFIG_SCHED_STACK_END_CHECK).
+	 */
+	cld
+	or	$2*8, %rdi
+	mov	%esp, %ecx
+	sub	%edi, %ecx
+
+	/* Check that the counter value is sane. */
+	cmp	$THREAD_SIZE_asm, %rcx
+	jb	3f
+	ud2
+
+3:
+	/*
+	 * So let's write the poison value to the kernel stack. Start from the
+	 * address in %rdi and move up (see cld above) to the address in %rsp
+	 * (not included, used memory).
+	 */
+	shr	$3, %ecx
+	rep	stosq
+
+	/* Set the lowest_stack value to the top_of_stack - 256. */
+	mov	TASK_thread_sp0(%r11), %rdi
+	sub	$256, %rdi
+	mov	%rdi, TASK_lowest_stack(%r11)
+
+	popq	%r11
+	popq	%rax
+	popq	%rcx
+	popq	%rdi
+	ret
+ENDPROC(erase_kstack)
+#endif
+
 /*
  * When dynamic function tracer is enabled it will add a breakpoint
  * to all locations that it is about to modify, sync CPUs, update
@@ -216,6 +300,8 @@ entry_SYSCALL_64_fastpath:
 	testl	$_TIF_ALLWORK_MASK, TASK_TI_flags(%r11)
 	jnz	1f
 
+	erase_kstack
+
 	LOCKDEP_SYS_EXIT
 	TRACE_IRQS_ON		/* user mode is traced as IRQs on */
 	movq	RIP(%rsp), %rcx
@@ -245,6 +331,8 @@ entry_SYSCALL64_slow_path:
 	call	do_syscall_64		/* returns with IRQs disabled */
 
 return_from_SYSCALL_64:
+	erase_kstack
+
 	RESTORE_EXTRA_REGS
 	TRACE_IRQS_IRETQ		/* we're about to change IF */
 
@@ -421,6 +509,7 @@ ENTRY(ret_from_fork)
 	UNWIND_HINT_REGS
 	movq	%rsp, %rdi
 	call	syscall_return_slowpath	/* returns with IRQs disabled */
+	erase_kstack
 	TRACE_IRQS_ON			/* user mode is traced as IRQS on */
 	SWAPGS
 	jmp	restore_regs_and_iret
@@ -610,6 +699,12 @@ ret_from_intr:
 GLOBAL(retint_user)
 	mov	%rsp,%rdi
 	call	prepare_exit_to_usermode
+
+	/*
+	 * TODO: Do we need to call erase_kstack here?
+	 * The PaX patch has it here commented out.
+	 */
+
 	TRACE_IRQS_IRETQ
 	SWAPGS
 	jmp	restore_regs_and_iret
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index e26c25c..f79cbf4 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -18,6 +18,12 @@
 
 	.section .entry.text, "ax"
 
+	.macro erase_kstack
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+	call erase_kstack
+#endif
+	.endm
+
 /*
  * 32-bit SYSENTER entry.
  *
@@ -228,6 +234,7 @@ GLOBAL(entry_SYSCALL_compat_after_hwframe)
 
 	/* Opportunistic SYSRET */
 sysret32_from_system_call:
+	erase_kstack
 	TRACE_IRQS_ON			/* User mode traces as IRQs on. */
 	movq	RBX(%rsp), %rbx		/* pt_regs->rbx */
 	movq	RBP(%rsp), %rbp		/* pt_regs->rbp */
@@ -335,6 +342,7 @@ ENTRY(entry_INT80_compat)
 .Lsyscall_32_done:
 
 	/* Go back to user mode. */
+	erase_kstack
 	TRACE_IRQS_ON
 	SWAPGS
 	jmp	restore_regs_and_iret
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index b390ff7..c6eaf2d 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -477,6 +477,10 @@ struct thread_struct {
 
 	mm_segment_t		addr_limit;
 
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+	unsigned long		lowest_stack;
+#endif
+
 	unsigned int		sig_on_uaccess_err:1;
 	unsigned int		uaccess_err:1;	/* uaccess failed */
 
diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index de827d6..4ed7451 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -37,6 +37,10 @@ void common(void) {
 	BLANK();
 	OFFSET(TASK_TI_flags, task_struct, thread_info.flags);
 	OFFSET(TASK_addr_limit, task_struct, thread.addr_limit);
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+	OFFSET(TASK_lowest_stack, task_struct, thread.lowest_stack);
+	OFFSET(TASK_thread_sp0, task_struct, thread.sp0);
+#endif
 
 	BLANK();
 	OFFSET(crypto_tfm_ctx_offset, crypto_tfm, __crt_ctx);
@@ -73,6 +77,11 @@ void common(void) {
 	OFFSET(PV_MMU_read_cr2, pv_mmu_ops, read_cr2);
 #endif
 
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+	BLANK();
+	DEFINE(THREAD_SIZE_asm, THREAD_SIZE);
+#endif
+
 #ifdef CONFIG_XEN
 	BLANK();
 	OFFSET(XEN_vcpu_info_mask, vcpu_info, evtchn_upcall_mask);
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 1196625..c7345d2 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -136,6 +136,11 @@ int copy_thread_tls(unsigned long clone_flags, unsigned long sp,
 	p->thread.sp0 = (unsigned long) (childregs+1);
 	memset(p->thread.ptrace_bps, 0, sizeof(p->thread.ptrace_bps));
 
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+	p->thread.lowest_stack = (unsigned long)task_stack_page(p) +
+						2 * sizeof(unsigned long);
+#endif
+
 	if (unlikely(p->flags & PF_KTHREAD)) {
 		/* kernel thread */
 		memset(childregs, 0, sizeof(struct pt_regs));
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 302e7b2..65ba73f 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -283,6 +283,11 @@ int copy_thread_tls(unsigned long clone_flags, unsigned long sp,
 	p->thread.sp = (unsigned long) fork_frame;
 	p->thread.io_bitmap_ptr = NULL;
 
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+	p->thread.lowest_stack = (unsigned long)task_stack_page(p) +
+						2 * sizeof(unsigned long);
+#endif
+
 	savesegment(gs, p->thread.gsindex);
 	p->thread.gsbase = p->thread.gsindex ? 0 : me->thread.gsbase;
 	savesegment(fs, p->thread.fsindex);
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index e95a263..916f02d 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -624,4 +624,8 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
 	(_________p1); \
 })
 
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+# define STACKLEAK_POISON -0xBEEF
+#endif
+
 #endif /* __LINUX_COMPILER_H */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [kernel-hardening] [PATCH RFC v5 2/5] gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack
  2017-10-22  0:22 [kernel-hardening] [PATCH RFC v5 0/5] Introduce the STACKLEAK feature and a test for it Alexander Popov
  2017-10-22  0:22 ` [kernel-hardening] [PATCH RFC v5 1/5] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls Alexander Popov
@ 2017-10-22  0:22 ` Alexander Popov
  2017-10-30 16:51   ` [kernel-hardening] " Alexander Popov
  2017-10-22  0:22 ` [kernel-hardening] [PATCH RFC v5 3/5] lkdtm: Add a test for STACKLEAK Alexander Popov
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 24+ messages in thread
From: Alexander Popov @ 2017-10-22  0:22 UTC (permalink / raw)
  To: kernel-hardening, keescook, pageexec, spender, Ingo Molnar,
	Andy Lutomirski, tycho, Laura Abbott, Mark Rutland,
	Ard Biesheuvel, Borislav Petkov, Thomas Gleixner,
	H . Peter Anvin, Peter Zijlstra, x86, alex.popov

The STACKLEAK feature erases the kernel stack before returning from
syscalls. That reduces the information which kernel stack leak bugs can
reveal and blocks some uninitialized stack variable attacks. Moreover,
STACKLEAK provides runtime checks for kernel stack overflow detection.

This commit introduces the STACKLEAK gcc plugin. It is needed for:
 - tracking the lowest border of the kernel stack, which is important
    for the code erasing the used part of the kernel stack at the end
    of syscalls (comes in a separate commit);
 - checking that alloca calls don't cause stack overflow.

So this plugin instruments the kernel code inserting:
 - the check_alloca() call before alloca and the track_stack() call
    after it;
 - the track_stack() call for the functions with a stack frame size
    greater than or equal to CONFIG_STACKLEAK_TRACK_MIN_SIZE.

The STACKLEAK feature is ported from grsecurity/PaX. More information at:
  https://grsecurity.net/
  https://pax.grsecurity.net/

This code is modified from Brad Spengler/PaX Team's code in the last
public patch of grsecurity/PaX based on our understanding of the code.
Changes or omissions from the original code are ours and don't reflect
the original grsecurity/PaX code.

Signed-off-by: Alexander Popov <alex.popov@linux.com>
---
 arch/Kconfig                           |  12 +
 arch/x86/kernel/dumpstack.c            |  15 ++
 fs/exec.c                              |  30 +++
 scripts/Makefile.gcc-plugins           |   3 +
 scripts/gcc-plugins/stackleak_plugin.c | 470 +++++++++++++++++++++++++++++++++
 5 files changed, 530 insertions(+)
 create mode 100644 scripts/gcc-plugins/stackleak_plugin.c

diff --git a/arch/Kconfig b/arch/Kconfig
index e9ec94c..f2de598 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -543,6 +543,18 @@ config GCC_PLUGIN_STACKLEAK
 	   * https://grsecurity.net/
 	   * https://pax.grsecurity.net/
 
+config STACKLEAK_TRACK_MIN_SIZE
+	int "Minimum stack frame size of functions tracked by STACKLEAK"
+	default 100
+	range 0 4096
+	depends on GCC_PLUGIN_STACKLEAK
+	help
+	  The STACKLEAK gcc plugin instruments the kernel code for tracking
+	  the lowest border of the kernel stack (and for some other purposes).
+	  It inserts the track_stack() call for the functions with a stack
+	  frame size greater than or equal to this parameter. If unsure,
+	  leave the default value 100.
+
 config HAVE_CC_STACKPROTECTOR
 	bool
 	help
diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index f13b4c0..5a9b6cc 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -315,3 +315,18 @@ static int __init code_bytes_setup(char *s)
 	return 1;
 }
 __setup("code_bytes=", code_bytes_setup);
+
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+void __used check_alloca(unsigned long size)
+{
+	unsigned long sp = (unsigned long)&sp;
+	struct stack_info stack_info = {0};
+	unsigned long visit_mask = 0;
+	unsigned long stack_left;
+
+	BUG_ON(get_stack_info(&sp, current, &stack_info, &visit_mask));
+	stack_left = sp - (unsigned long)stack_info.begin;
+	BUG_ON(stack_left < 256 || size >= stack_left - 256);
+}
+EXPORT_SYMBOL(check_alloca);
+#endif
diff --git a/fs/exec.c b/fs/exec.c
index 3e14ba2..481ef4b 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1958,3 +1958,33 @@ COMPAT_SYSCALL_DEFINE5(execveat, int, fd,
 				  argv, envp, flags);
 }
 #endif
+
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+void __used track_stack(void)
+{
+	/*
+	 * N.B. The arch-specific part of the STACKLEAK feature fills the
+	 * kernel stack with the poison value, which has the register width.
+	 * That code assumes that the value of thread.lowest_stack is aligned
+	 * on the register width boundary.
+	 *
+	 * That is true for x86 and x86_64 because of the kernel stack
+	 * alignment on these platforms (for details, see cc_stack_align in
+	 * arch/x86/Makefile). Take care of that when you port STACKLEAK to
+	 * new platforms.
+	 */
+	unsigned long sp = (unsigned long)&sp;
+
+	if (sp < current->thread.lowest_stack &&
+	    sp >= (unsigned long)task_stack_page(current) +
+					2 * sizeof(unsigned long)) {
+		current->thread.lowest_stack = sp;
+	}
+
+#ifndef CONFIG_VMAP_STACK
+	if (unlikely((sp & (THREAD_SIZE - 1)) < (THREAD_SIZE / 16)))
+		BUG();
+#endif /* !CONFIG_VMAP_STACK */
+}
+EXPORT_SYMBOL(track_stack);
+#endif /* CONFIG_GCC_PLUGIN_STACKLEAK */
diff --git a/scripts/Makefile.gcc-plugins b/scripts/Makefile.gcc-plugins
index d1f7b0d..3793c41 100644
--- a/scripts/Makefile.gcc-plugins
+++ b/scripts/Makefile.gcc-plugins
@@ -34,6 +34,9 @@ ifdef CONFIG_GCC_PLUGINS
   gcc-plugin-cflags-$(CONFIG_GCC_PLUGIN_RANDSTRUCT)	+= -DRANDSTRUCT_PLUGIN
   gcc-plugin-cflags-$(CONFIG_GCC_PLUGIN_RANDSTRUCT_PERFORMANCE)	+= -fplugin-arg-randomize_layout_plugin-performance-mode
 
+  gcc-plugin-$(CONFIG_GCC_PLUGIN_STACKLEAK)	+= stackleak_plugin.so
+  gcc-plugin-cflags-$(CONFIG_GCC_PLUGIN_STACKLEAK)	+= -DSTACKLEAK_PLUGIN -fplugin-arg-stackleak_plugin-track-min-size=$(CONFIG_STACKLEAK_TRACK_MIN_SIZE)
+
   GCC_PLUGINS_CFLAGS := $(strip $(addprefix -fplugin=$(objtree)/scripts/gcc-plugins/, $(gcc-plugin-y)) $(gcc-plugin-cflags-y))
 
   export PLUGINCC GCC_PLUGINS_CFLAGS GCC_PLUGIN GCC_PLUGIN_SUBDIR
diff --git a/scripts/gcc-plugins/stackleak_plugin.c b/scripts/gcc-plugins/stackleak_plugin.c
new file mode 100644
index 0000000..1461d88
--- /dev/null
+++ b/scripts/gcc-plugins/stackleak_plugin.c
@@ -0,0 +1,470 @@
+/*
+ * Copyright 2011-2017 by the PaX Team <pageexec@freemail.hu>
+ * Modified by Alexander Popov <alex.popov@linux.com>
+ * Licensed under the GPL v2
+ *
+ * Note: the choice of the license means that the compilation process is
+ * NOT 'eligible' as defined by gcc's library exception to the GPL v3,
+ * but for the kernel it doesn't matter since it doesn't link against
+ * any of the gcc libraries
+ *
+ * This gcc plugin is needed for tracking the lowest border of the kernel stack
+ * and checking that alloca calls don't cause stack overflow. It instruments
+ * the kernel code inserting:
+ *  - the check_alloca() call before alloca and the track_stack() call after it;
+ *  - the track_stack() call for the functions with a stack frame size greater
+ *     than or equal to the "track-min-size" plugin parameter.
+ *
+ * This plugin is ported from grsecurity/PaX. For more information see:
+ *   https://grsecurity.net/
+ *   https://pax.grsecurity.net/
+ *
+ * Debugging:
+ *  - use fprintf() to stderr, debug_generic_expr(), debug_gimple_stmt()
+ *     and print_rtl();
+ *  - add "-fdump-tree-all -fdump-rtl-all" to the plugin CFLAGS in
+ *     Makefile.gcc-plugins to see the verbose dumps of the gcc passes;
+ *  - use gcc -E to understand the preprocessing shenanigans;
+ *  - use gcc with enabled CFG/GIMPLE/SSA verification (--enable-checking).
+ */
+
+#include "gcc-common.h"
+
+__visible int plugin_is_GPL_compatible;
+
+static int track_frame_size = -1;
+static const char track_function[] = "track_stack";
+static const char check_function[] = "check_alloca";
+
+/*
+ * Mark these global variables (roots) for gcc garbage collector since
+ * they point to the garbage-collected memory.
+ */
+static GTY(()) tree track_function_decl;
+static GTY(()) tree check_function_decl;
+
+static struct plugin_info stackleak_plugin_info = {
+	.version = "201707101337",
+	.help = "track-min-size=nn\ttrack stack for functions with a stack frame size >= nn bytes\n"
+};
+
+static void stackleak_check_alloca(gimple_stmt_iterator *gsi)
+{
+	gimple stmt;
+	gcall *check_alloca;
+	tree alloca_size;
+	cgraph_node_ptr node;
+	int frequency;
+	basic_block bb;
+
+	/* Insert call to void check_alloca(unsigned long size) */
+	alloca_size = gimple_call_arg(gsi_stmt(*gsi), 0);
+	stmt = gimple_build_call(check_function_decl, 1, alloca_size);
+	check_alloca = as_a_gcall(stmt);
+	gsi_insert_before(gsi, check_alloca, GSI_SAME_STMT);
+
+	/* Update the cgraph */
+	bb = gimple_bb(check_alloca);
+	node = cgraph_get_create_node(check_function_decl);
+	gcc_assert(node);
+	frequency = compute_call_stmt_bb_frequency(current_function_decl, bb);
+	cgraph_create_edge(cgraph_get_node(current_function_decl), node,
+			check_alloca, bb->count, frequency, bb->loop_depth);
+}
+
+static void stackleak_add_instrumentation(gimple_stmt_iterator *gsi, bool after)
+{
+	gimple stmt;
+	gcall *track_stack;
+	cgraph_node_ptr node;
+	int frequency;
+	basic_block bb;
+
+	/* Insert call to void track_stack(void) */
+	stmt = gimple_build_call(track_function_decl, 0);
+	track_stack = as_a_gcall(stmt);
+	if (after)
+		gsi_insert_after(gsi, track_stack, GSI_CONTINUE_LINKING);
+	else
+		gsi_insert_before(gsi, track_stack, GSI_SAME_STMT);
+
+	/* Update the cgraph */
+	bb = gimple_bb(track_stack);
+	node = cgraph_get_create_node(track_function_decl);
+	gcc_assert(node);
+	frequency = compute_call_stmt_bb_frequency(current_function_decl, bb);
+	cgraph_create_edge(cgraph_get_node(current_function_decl), node,
+			track_stack, bb->count, frequency, bb->loop_depth);
+}
+
+static bool is_alloca(gimple stmt)
+{
+	if (gimple_call_builtin_p(stmt, BUILT_IN_ALLOCA))
+		return true;
+
+#if BUILDING_GCC_VERSION >= 4007
+	if (gimple_call_builtin_p(stmt, BUILT_IN_ALLOCA_WITH_ALIGN))
+		return true;
+#endif
+
+	return false;
+}
+
+/*
+ * Work with the GIMPLE representation of the code.
+ * Insert the check_alloca() call before alloca and track_stack() call after
+ * it. Also insert track_stack() call into the beginning of the function
+ * if it is not instrumented.
+ */
+static unsigned int stackleak_tree_instrument_execute(void)
+{
+	basic_block bb, entry_bb;
+	bool prologue_instrumented = false, is_leaf = true;
+	gimple_stmt_iterator gsi;
+
+	/*
+	 * ENTRY_BLOCK_PTR is a basic block which represents possible entry
+	 * point of a function. This block does not contain any code and
+	 * has a CFG edge to its successor.
+	 */
+	gcc_assert(single_succ_p(ENTRY_BLOCK_PTR_FOR_FN(cfun)));
+	entry_bb = single_succ(ENTRY_BLOCK_PTR_FOR_FN(cfun));
+
+	/*
+	 * 1. Loop through the GIMPLE statements in each of cfun basic blocks.
+	 * cfun is a global variable which represents the function that is
+	 * currently processed.
+	 */
+	FOR_EACH_BB_FN(bb, cfun) {
+		for (gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next(&gsi)) {
+			gimple stmt;
+
+			stmt = gsi_stmt(gsi);
+
+			/* Leaf function is a function which makes no calls */
+			if (is_gimple_call(stmt))
+				is_leaf = false;
+
+			if (!is_alloca(stmt))
+				continue;
+
+			/* 2. Insert stack overflow check before alloca call */
+			stackleak_check_alloca(&gsi);
+
+			/* 3. Insert track_stack() call after alloca call */
+			stackleak_add_instrumentation(&gsi, true);
+			if (bb == entry_bb)
+				prologue_instrumented = true;
+		}
+	}
+
+	if (prologue_instrumented)
+		return 0;
+
+	/*
+	 * Special cases to skip the instrumentation.
+	 *
+	 * Taking the address of static inline functions materializes them,
+	 * but we mustn't instrument some of them as the resulting stack
+	 * alignment required by the function call ABI will break other
+	 * assumptions regarding the expected (but not otherwise enforced)
+	 * register clobbering ABI.
+	 *
+	 * Case in point: native_save_fl on amd64 when optimized for size
+	 * clobbers rdx if it were instrumented here.
+	 *
+	 * TODO: any more special cases?
+	 */
+	if (is_leaf &&
+	    !TREE_PUBLIC(current_function_decl) &&
+	    DECL_DECLARED_INLINE_P(current_function_decl)) {
+		return 0;
+	}
+
+	if (is_leaf &&
+	    !strncmp(IDENTIFIER_POINTER(DECL_NAME(current_function_decl)),
+		     "_paravirt_", 10)) {
+		return 0;
+	}
+
+	/* 4. Insert track_stack() call at the function beginning */
+	bb = entry_bb;
+	if (!single_pred_p(bb)) {
+		/* gcc_assert(bb_loop_depth(bb) ||
+				(bb->flags & BB_IRREDUCIBLE_LOOP)); */
+		split_edge(single_succ_edge(ENTRY_BLOCK_PTR_FOR_FN(cfun)));
+		gcc_assert(single_succ_p(ENTRY_BLOCK_PTR_FOR_FN(cfun)));
+		bb = single_succ(ENTRY_BLOCK_PTR_FOR_FN(cfun));
+	}
+	gsi = gsi_after_labels(bb);
+	stackleak_add_instrumentation(&gsi, false);
+
+	return 0;
+}
+
+/*
+ * Work with the RTL representation of the code.
+ * Remove the unneeded track_stack() calls from the functions which don't
+ * call alloca and have the stack frame size less than track_frame_size.
+ */
+static unsigned int stackleak_final_execute(void)
+{
+	rtx_insn *insn, *next;
+
+	if (cfun->calls_alloca)
+		return 0;
+
+	if (get_frame_size() >= track_frame_size)
+		return 0;
+
+	/*
+	 * 1. Find track_stack() calls. Loop through the chain of insns,
+	 * which is an RTL representation of the code for a function.
+	 *
+	 * The example of a matching insn:
+	 *    (call_insn 8 4 10 2 (call (mem (symbol_ref ("track_stack")
+	 *    [flags 0x41] <function_decl 0x7f7cd3302a80 track_stack>)
+	 *    [0 track_stack S1 A8]) (0)) 675 {*call} (expr_list
+	 *    (symbol_ref ("track_stack") [flags 0x41] <function_decl
+	 *    0x7f7cd3302a80 track_stack>) (expr_list (0) (nil))) (nil))
+	 */
+	for (insn = get_insns(); insn; insn = next) {
+		rtx body;
+
+		next = NEXT_INSN(insn);
+
+		/* Check the expression code of the insn */
+		if (!CALL_P(insn))
+			continue;
+
+		/*
+		 * Check the expression code of the insn body, which is an RTL
+		 * Expression (RTX) describing the side effect performed by
+		 * that insn.
+		 */
+		body = PATTERN(insn);
+		if (GET_CODE(body) != CALL)
+			continue;
+
+		/*
+		 * Check the first operand of the call expression. It should
+		 * be a mem RTX describing the needed subroutine with a
+		 * symbol_ref RTX.
+		 */
+		body = XEXP(body, 0);
+		if (GET_CODE(body) != MEM)
+			continue;
+
+		body = XEXP(body, 0);
+		if (GET_CODE(body) != SYMBOL_REF)
+			continue;
+
+		if (SYMBOL_REF_DECL(body) != track_function_decl)
+			continue;
+
+		/* 2. Delete the track_stack() call */
+		delete_insn_and_edges(insn);
+#if BUILDING_GCC_VERSION >= 4007
+		if (GET_CODE(next) == NOTE &&
+		    NOTE_KIND(next) == NOTE_INSN_CALL_ARG_LOCATION) {
+			insn = next;
+			next = NEXT_INSN(insn);
+			delete_insn_and_edges(insn);
+		}
+#endif
+	}
+
+	/*
+	 * Uncomment the following to see the code which was cleaned at this
+	 * pass. It should not contain check_alloca() and track_stack() calls.
+	 * The stack frame size should be less than track_frame_size.
+	 *
+	 * warning(0, "Instrumentation is removed, stack frame size: %ld",
+	 * 						get_frame_size());
+	 * print_simple_rtl(stderr, get_insns());
+	 */
+
+	return 0;
+}
+
+static bool stackleak_track_stack_gate(void)
+{
+	tree section;
+
+	section = lookup_attribute("section",
+				   DECL_ATTRIBUTES(current_function_decl));
+	if (section && TREE_VALUE(section)) {
+		section = TREE_VALUE(TREE_VALUE(section));
+
+		if (!strncmp(TREE_STRING_POINTER(section), ".init.text", 10))
+			return false;
+		if (!strncmp(TREE_STRING_POINTER(section), ".devinit.text", 13))
+			return false;
+		if (!strncmp(TREE_STRING_POINTER(section), ".cpuinit.text", 13))
+			return false;
+		if (!strncmp(TREE_STRING_POINTER(section), ".meminit.text", 13))
+			return false;
+	}
+
+	return track_frame_size >= 0;
+}
+
+/* Build function declarations for track_stack() and check_alloca() */
+static void stackleak_start_unit(void *gcc_data __unused,
+				 void *user_data __unused)
+{
+	tree fntype;
+
+	/* void track_stack(void) */
+	fntype = build_function_type_list(void_type_node, NULL_TREE);
+	track_function_decl = build_fn_decl(track_function, fntype);
+	DECL_ASSEMBLER_NAME(track_function_decl); /* for LTO */
+	TREE_PUBLIC(track_function_decl) = 1;
+	TREE_USED(track_function_decl) = 1;
+	DECL_EXTERNAL(track_function_decl) = 1;
+	DECL_ARTIFICIAL(track_function_decl) = 1;
+	DECL_PRESERVE_P(track_function_decl) = 1;
+
+	/* void check_alloca(unsigned long) */
+	fntype = build_function_type_list(void_type_node,
+				long_unsigned_type_node, NULL_TREE);
+	check_function_decl = build_fn_decl(check_function, fntype);
+	DECL_ASSEMBLER_NAME(check_function_decl); /* for LTO */
+	TREE_PUBLIC(check_function_decl) = 1;
+	TREE_USED(check_function_decl) = 1;
+	DECL_EXTERNAL(check_function_decl) = 1;
+	DECL_ARTIFICIAL(check_function_decl) = 1;
+	DECL_PRESERVE_P(check_function_decl) = 1;
+}
+
+/*
+ * Pass gate function is a predicate function that gets executed before the
+ * corresponding pass. If the return value is 'true' the pass gets executed,
+ * otherwise, it is skipped.
+ */
+static bool stackleak_tree_instrument_gate(void)
+{
+	return stackleak_track_stack_gate();
+}
+
+#define PASS_NAME stackleak_tree_instrument
+#define PROPERTIES_REQUIRED PROP_gimple_leh | PROP_cfg
+#define TODO_FLAGS_START TODO_verify_ssa | TODO_verify_flow | TODO_verify_stmts
+#define TODO_FLAGS_FINISH TODO_verify_ssa | TODO_verify_stmts | TODO_dump_func \
+			| TODO_update_ssa | TODO_rebuild_cgraph_edges
+#include "gcc-generate-gimple-pass.h"
+
+static bool stackleak_final_gate(void)
+{
+	return stackleak_track_stack_gate();
+}
+
+#define PASS_NAME stackleak_final
+#define TODO_FLAGS_FINISH TODO_dump_func
+#include "gcc-generate-rtl-pass.h"
+
+/*
+ * Every gcc plugin exports a plugin_init() function that is called right
+ * after the plugin is loaded. This function is responsible for registering
+ * the plugin callbacks and doing other required initialization.
+ */
+__visible int plugin_init(struct plugin_name_args *plugin_info,
+			  struct plugin_gcc_version *version)
+{
+	const char * const plugin_name = plugin_info->base_name;
+	const int argc = plugin_info->argc;
+	const struct plugin_argument * const argv = plugin_info->argv;
+	int i;
+
+	/* Extra GGC root tables describing our GTY-ed data */
+	static const struct ggc_root_tab gt_ggc_r_gt_stackleak[] = {
+		{
+			.base = &track_function_decl,
+			.nelt = 1,
+			.stride = sizeof(track_function_decl),
+			.cb = &gt_ggc_mx_tree_node,
+			.pchw = &gt_pch_nx_tree_node
+		},
+		{
+			.base = &check_function_decl,
+			.nelt = 1,
+			.stride = sizeof(check_function_decl),
+			.cb = &gt_ggc_mx_tree_node,
+			.pchw = &gt_pch_nx_tree_node
+		},
+		LAST_GGC_ROOT_TAB
+	};
+
+	/*
+	 * The stackleak_tree_instrument pass should be executed before the
+	 * "optimized" pass, which is the control flow graph cleanup that is
+	 * performed just before expanding gcc trees to the RTL. In former
+	 * versions of the plugin this new pass was inserted before the
+	 * "tree_profile" pass, which is currently called "profile".
+	 */
+	PASS_INFO(stackleak_tree_instrument, "optimized", 1,
+						PASS_POS_INSERT_BEFORE);
+
+	/*
+	 * The stackleak_final pass should be executed before the "final" pass,
+	 * which turns the RTL (Register Transfer Language) into assembly.
+	 */
+	PASS_INFO(stackleak_final, "final", 1, PASS_POS_INSERT_BEFORE);
+
+	if (!plugin_default_version_check(version, &gcc_version)) {
+		error(G_("incompatible gcc/plugin versions"));
+		return 1;
+	}
+
+	/* Parse the plugin arguments */
+	if (argc != 1) {
+		error(G_("bad number of the plugin arguments: %d"), argc);
+		return 1;
+	}
+
+	if (strcmp(argv[i].key, "track-min-size")) {
+		error(G_("unknown option '-fplugin-arg-%s-%s'"),
+				plugin_name, argv[i].key);
+		return 1;
+	}
+
+	if (!argv[i].value) {
+		error(G_("no value supplied for option '-fplugin-arg-%s-%s'"),
+				plugin_name, argv[i].key);
+		return 1;
+	}
+
+	track_frame_size = atoi(argv[i].value);
+	if (track_frame_size < 0) {
+		error(G_("invalid option argument '-fplugin-arg-%s-%s=%s'"),
+				plugin_name, argv[i].key, argv[i].value);
+		return 1;
+	}
+
+	/* Give the information about the plugin */
+	register_callback(plugin_name, PLUGIN_INFO, NULL,
+						&stackleak_plugin_info);
+
+	/* Register to be called before processing a translation unit */
+	register_callback(plugin_name, PLUGIN_START_UNIT,
+					&stackleak_start_unit, NULL);
+
+	/* Register an extra GCC garbage collector (GGC) root table */
+	register_callback(plugin_name, PLUGIN_REGISTER_GGC_ROOTS, NULL,
+					(void *)&gt_ggc_r_gt_stackleak);
+
+	/*
+	 * Hook into the Pass Manager to register new gcc passes.
+	 *
+	 * The stack frame size info is available only at the last RTL pass,
+	 * when it's too late to insert complex code like a function call.
+	 * So we register two gcc passes to instrument every function at first
+	 * and remove the unneeded instrumentation later.
+	 */
+	register_callback(plugin_name, PLUGIN_PASS_MANAGER_SETUP, NULL,
+					&stackleak_tree_instrument_pass_info);
+	register_callback(plugin_name, PLUGIN_PASS_MANAGER_SETUP, NULL,
+					&stackleak_final_pass_info);
+
+	return 0;
+}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [kernel-hardening] [PATCH RFC v5 3/5] lkdtm: Add a test for STACKLEAK
  2017-10-22  0:22 [kernel-hardening] [PATCH RFC v5 0/5] Introduce the STACKLEAK feature and a test for it Alexander Popov
  2017-10-22  0:22 ` [kernel-hardening] [PATCH RFC v5 1/5] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls Alexander Popov
  2017-10-22  0:22 ` [kernel-hardening] [PATCH RFC v5 2/5] gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack Alexander Popov
@ 2017-10-22  0:22 ` Alexander Popov
  2017-10-22  0:22 ` [kernel-hardening] [PATCH RFC v5 4/5] fs/proc: Show STACKLEAK metrics in the /proc file system Alexander Popov
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 24+ messages in thread
From: Alexander Popov @ 2017-10-22  0:22 UTC (permalink / raw)
  To: kernel-hardening, keescook, pageexec, spender, Ingo Molnar,
	Andy Lutomirski, tycho, Laura Abbott, Mark Rutland,
	Ard Biesheuvel, Borislav Petkov, Thomas Gleixner,
	H . Peter Anvin, Peter Zijlstra, x86, alex.popov

Introduce two lkdtm tests for the STACKLEAK feature: STACKLEAK_CHECK_ALLOCA
and STACKLEAK_TRACK_STACK. Both of them check that the current task stack
is properly erased (filled with STACKLEAK_POISON).

In addition, STACKLEAK_CHECK_ALLOCA tests that:
 - check_alloca() allows alloca calls which don't exhaust the kernel stack;
 - alloca calls which exhaust/overflow the kernel stack hit BUG() in
    check_alloca().

And STACKLEAK_TRACK_STACK checks that exhausting the thread stack with a
recursion is detected:
 - it hits the BUG() in track_stack() if CONFIG_VMAP_STACK is not enabled,
 - or hits the guard page otherwise.

Signed-off-by: Tycho Andersen <tycho@docker.com>
Signed-off-by: Alexander Popov <alex.popov@linux.com>
---
 drivers/misc/Makefile          |   3 +
 drivers/misc/lkdtm.h           |   4 ++
 drivers/misc/lkdtm_core.c      |   2 +
 drivers/misc/lkdtm_stackleak.c | 139 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 148 insertions(+)
 create mode 100644 drivers/misc/lkdtm_stackleak.c

diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index d84819d..105e16e 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -63,6 +63,9 @@ lkdtm-$(CONFIG_LKDTM)		+= lkdtm_perms.o
 lkdtm-$(CONFIG_LKDTM)		+= lkdtm_refcount.o
 lkdtm-$(CONFIG_LKDTM)		+= lkdtm_rodata_objcopy.o
 lkdtm-$(CONFIG_LKDTM)		+= lkdtm_usercopy.o
+lkdtm-$(CONFIG_LKDTM)		+= lkdtm_stackleak.o
+
+KASAN_SANITIZE_lkdtm_stackleak.o := n
 
 KCOV_INSTRUMENT_lkdtm_rodata.o	:= n
 
diff --git a/drivers/misc/lkdtm.h b/drivers/misc/lkdtm.h
index bfb6c45..5952339 100644
--- a/drivers/misc/lkdtm.h
+++ b/drivers/misc/lkdtm.h
@@ -82,4 +82,8 @@ void lkdtm_USERCOPY_STACK_FRAME_FROM(void);
 void lkdtm_USERCOPY_STACK_BEYOND(void);
 void lkdtm_USERCOPY_KERNEL(void);
 
+/* lkdtm_stackleak.c */
+void lkdtm_STACKLEAK_CHECK_ALLOCA(void);
+void lkdtm_STACKLEAK_TRACK_STACK(void);
+
 #endif
diff --git a/drivers/misc/lkdtm_core.c b/drivers/misc/lkdtm_core.c
index 981b3ef..264d948 100644
--- a/drivers/misc/lkdtm_core.c
+++ b/drivers/misc/lkdtm_core.c
@@ -251,6 +251,8 @@ struct crashtype crashtypes[] = {
 	CRASHTYPE(USERCOPY_STACK_FRAME_FROM),
 	CRASHTYPE(USERCOPY_STACK_BEYOND),
 	CRASHTYPE(USERCOPY_KERNEL),
+	CRASHTYPE(STACKLEAK_CHECK_ALLOCA),
+	CRASHTYPE(STACKLEAK_TRACK_STACK),
 };
 
 
diff --git a/drivers/misc/lkdtm_stackleak.c b/drivers/misc/lkdtm_stackleak.c
new file mode 100644
index 0000000..1f95ee3
--- /dev/null
+++ b/drivers/misc/lkdtm_stackleak.c
@@ -0,0 +1,139 @@
+/*
+ * This file tests a few aspects of the STACKLEAK feature:
+ *   - the current task stack is properly erased (filled with STACKLEAK_POISON);
+ *   - check_alloca() allows alloca calls which don't exhaust the kernel stack;
+ *   - alloca calls which exhaust/overflow the kernel stack hit BUG() in
+ *    check_alloca();
+ *   - exhausting the thread stack with a recursion is detected: it hits the
+ *    BUG() in track_stack() if CONFIG_VMAP_STACK is not enabled, or hits the
+ *    guard page otherwise.
+ *
+ * Copyright (C) Docker, Inc. 2017
+ *
+ * Authors:
+ *   Tycho Andersen <tycho@docker.com>
+ *   Alexander Popov <alex.popov@linux.com>
+ */
+
+#include "lkdtm.h"
+#include <linux/sched.h>
+#include <linux/compiler.h>
+
+#ifndef CONFIG_GCC_PLUGIN_STACKLEAK
+# define STACKLEAK_POISON -0xBEEF
+# define CONFIG_STACKLEAK_TRACK_MIN_SIZE 100
+#endif
+
+static noinline bool stack_is_erased(void)
+{
+	unsigned long *sp, left, found, i;
+
+	/*
+	 * For the details about the alignment of the poison values, see
+	 * the comment in track_stack().
+	 */
+	sp = PTR_ALIGN(&i, sizeof(unsigned long));
+
+	left = ((unsigned long)sp & (THREAD_SIZE - 1)) / sizeof(unsigned long);
+	sp--;
+
+	/*
+	 * Two unsigned long ints at the bottom of the thread stack are
+	 * reserved and not poisoned.
+	 */
+	if (left <= 2)
+		return false;
+
+	left -= 2;
+	pr_info("checking unused part of the thread stack (%lu bytes)...\n",
+					left * sizeof(unsigned long));
+
+	/* Search for 17 poison values in a row (like erase_kstack() does) */
+	for (i = 0, found = 0; i < left && found < 17; i++) {
+		if (*(sp - i) == STACKLEAK_POISON)
+			found++;
+		else
+			found = 0;
+	}
+
+	if (found < 17) {
+		pr_err("FAIL: thread stack is not erased (checked %lu bytes)\n",
+						i * sizeof(unsigned long));
+		return false;
+	}
+
+	pr_info("first %lu bytes are unpoisoned\n",
+				(i - found) * sizeof(unsigned long));
+
+	/* The rest of thread stack should be erased */
+	for (; i < left; i++) {
+		if (*(sp - i) != STACKLEAK_POISON) {
+			pr_err("FAIL: thread stack is NOT properly erased\n");
+			return false;
+		}
+	}
+
+	pr_info("the rest of the thread stack is properly erased\n");
+	return true;
+}
+
+static noinline void do_alloca(unsigned long size)
+{
+	char buf[size];
+
+	/* So this doesn't get inlined or optimized out */
+	snprintf(buf, size, "hello world\n");
+}
+
+void lkdtm_STACKLEAK_CHECK_ALLOCA(void)
+{
+	unsigned long left = (unsigned long)&left & (THREAD_SIZE - 1);
+
+	if (!stack_is_erased())
+		return;
+
+	/* Try a small alloca to see if it works */
+	pr_info("try a small alloca of 16 bytes...\n");
+	do_alloca(16);
+	pr_info("small alloca is successful\n");
+
+	/* Try to hit the BUG() in check_alloca() */
+	pr_info("try a large alloca of %lu bytes (stack overflow)...\n", left);
+	do_alloca(left);
+	pr_err("FAIL: large alloca overstepped the thread stack boundary\n");
+}
+
+/*
+ * The stack frame size of recursion() is bigger than the
+ * CONFIG_STACKLEAK_TRACK_MIN_SIZE, hence that function is instrumented
+ * by the STACKLEAK gcc plugin and it calls track_stack() at the beginning.
+ */
+static noinline unsigned long recursion(unsigned long prev_sp)
+{
+	char buf[CONFIG_STACKLEAK_TRACK_MIN_SIZE + 42];
+	unsigned long sp = (unsigned long)&sp;
+
+	snprintf(buf, sizeof(buf), "hello world\n");
+
+	if (prev_sp < sp + THREAD_SIZE)
+		sp = recursion(prev_sp);
+
+	return sp;
+}
+
+void lkdtm_STACKLEAK_TRACK_STACK(void)
+{
+	unsigned long sp = (unsigned long)&sp;
+
+	if (!stack_is_erased())
+		return;
+
+	/*
+	 * Exhaust the thread stack with a recursion. It should hit the
+	 * BUG() in track_stack() if CONFIG_VMAP_STACK is not enabled, or
+	 * access the guard page otherwise.
+	 */
+	pr_info("try to exhaust the thread stack with the recursion...\n");
+	pr_err("FAIL: thread stack exhaustion (%lu bytes) is not detected\n",
+							sp - recursion(sp));
+}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [kernel-hardening] [PATCH RFC v5 4/5] fs/proc: Show STACKLEAK metrics in the /proc file system
  2017-10-22  0:22 [kernel-hardening] [PATCH RFC v5 0/5] Introduce the STACKLEAK feature and a test for it Alexander Popov
                   ` (2 preceding siblings ...)
  2017-10-22  0:22 ` [kernel-hardening] [PATCH RFC v5 3/5] lkdtm: Add a test for STACKLEAK Alexander Popov
@ 2017-10-22  0:22 ` Alexander Popov
  2017-10-22  0:22 ` [kernel-hardening] [PATCH RFC v5 5/5] doc: self-protection: Add information about STACKLEAK feature Alexander Popov
  2017-10-22 13:11 ` [kernel-hardening] Re: [PATCH RFC v5 0/5] Introduce the STACKLEAK feature and a test for it Peter Zijlstra
  5 siblings, 0 replies; 24+ messages in thread
From: Alexander Popov @ 2017-10-22  0:22 UTC (permalink / raw)
  To: kernel-hardening, keescook, pageexec, spender, Ingo Molnar,
	Andy Lutomirski, tycho, Laura Abbott, Mark Rutland,
	Ard Biesheuvel, Borislav Petkov, Thomas Gleixner,
	H . Peter Anvin, Peter Zijlstra, x86, alex.popov

Introduce CONFIG_STACKLEAK_METRICS providing STACKLEAK information about
tasks via the /proc file system. In particular, /proc/<pid>/lowest_stack
shows the current lowest_stack value and its final value from the previous
syscall. That information can be useful for estimating the STACKLEAK
performance impact for different workloads.

Signed-off-by: Alexander Popov <alex.popov@linux.com>
---
 arch/Kconfig                     | 11 +++++++++++
 arch/x86/entry/entry_32.S        |  4 ++++
 arch/x86/entry/entry_64.S        |  4 ++++
 arch/x86/include/asm/processor.h |  3 +++
 arch/x86/kernel/asm-offsets.c    |  3 +++
 arch/x86/kernel/process_32.c     |  3 +++
 arch/x86/kernel/process_64.c     |  3 +++
 fs/proc/base.c                   | 14 ++++++++++++++
 8 files changed, 45 insertions(+)

diff --git a/arch/Kconfig b/arch/Kconfig
index f2de598..c48d828 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -555,6 +555,17 @@ config STACKLEAK_TRACK_MIN_SIZE
 	  frame size greater than or equal to this parameter. If unsure,
 	  leave the default value 100.
 
+config STACKLEAK_METRICS
+	bool "Show STACKLEAK metrics in the /proc file system"
+	depends on GCC_PLUGIN_STACKLEAK
+	depends on PROC_FS
+	help
+	  If this is set, STACKLEAK metrics for every task are available in
+	  the /proc file system. In particular, /proc/<pid>/lowest_stack
+	  shows the current lowest_stack value and its final value from the
+	  previous syscall. That information can be useful for estimating
+	  the STACKLEAK performance impact for your workloads.
+
 config HAVE_CC_STACKPROTECTOR
 	bool
 	help
diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index a7b0c52..4f3f2ff 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -115,6 +115,10 @@ ENTRY(erase_kstack)
 	mov	%esp, %ecx
 	sub	%edi, %ecx
 
+#ifdef CONFIG_STACKLEAK_METRICS
+	mov	%edi, TASK_prev_lowest_stack(%ebp)
+#endif
+
 	cmp	$THREAD_SIZE_asm, %ecx
 	jb	3f
 	ud2
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 189d843..fbf7f1c 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -116,6 +116,10 @@ ENTRY(erase_kstack)
 	mov	%esp, %ecx
 	sub	%edi, %ecx
 
+#ifdef CONFIG_STACKLEAK_METRICS
+	mov	%rdi, TASK_prev_lowest_stack(%r11)
+#endif
+
 	/* Check that the counter value is sane. */
 	cmp	$THREAD_SIZE_asm, %rcx
 	jb	3f
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index c6eaf2d..8e3f2ef 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -479,6 +479,9 @@ struct thread_struct {
 
 #ifdef CONFIG_GCC_PLUGIN_STACKLEAK
 	unsigned long		lowest_stack;
+# ifdef CONFIG_STACKLEAK_METRICS
+	unsigned long		prev_lowest_stack;
+# endif
 #endif
 
 	unsigned int		sig_on_uaccess_err:1;
diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index 4ed7451..673495d 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -40,6 +40,9 @@ void common(void) {
 #ifdef CONFIG_GCC_PLUGIN_STACKLEAK
 	OFFSET(TASK_lowest_stack, task_struct, thread.lowest_stack);
 	OFFSET(TASK_thread_sp0, task_struct, thread.sp0);
+# ifdef CONFIG_STACKLEAK_METRICS
+	OFFSET(TASK_prev_lowest_stack, task_struct, thread.prev_lowest_stack);
+# endif
 #endif
 
 	BLANK();
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index c7345d2..0e4fa3c 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -139,6 +139,9 @@ int copy_thread_tls(unsigned long clone_flags, unsigned long sp,
 #ifdef CONFIG_GCC_PLUGIN_STACKLEAK
 	p->thread.lowest_stack = (unsigned long)task_stack_page(p) +
 						2 * sizeof(unsigned long);
+# ifdef CONFIG_STACKLEAK_METRICS
+	p->thread.prev_lowest_stack = p->thread.lowest_stack;
+# endif
 #endif
 
 	if (unlikely(p->flags & PF_KTHREAD)) {
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 65ba73f..50d019c 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -286,6 +286,9 @@ int copy_thread_tls(unsigned long clone_flags, unsigned long sp,
 #ifdef CONFIG_GCC_PLUGIN_STACKLEAK
 	p->thread.lowest_stack = (unsigned long)task_stack_page(p) +
 						2 * sizeof(unsigned long);
+# ifdef CONFIG_STACKLEAK_METRICS
+	p->thread.prev_lowest_stack = p->thread.lowest_stack;
+# endif
 #endif
 
 	savesegment(gs, p->thread.gsindex);
diff --git a/fs/proc/base.c b/fs/proc/base.c
index ad3b076..7c3e127 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2884,6 +2884,17 @@ static int proc_pid_patch_state(struct seq_file *m, struct pid_namespace *ns,
 }
 #endif /* CONFIG_LIVEPATCH */
 
+#ifdef CONFIG_STACKLEAK_METRICS
+static int proc_lowest_stack(struct seq_file *m, struct pid_namespace *ns,
+				struct pid *pid, struct task_struct *task)
+{
+	seq_printf(m, "prev_lowest_stack: %pK\nlowest_stack: %pK\n",
+		   (void *)task->thread.prev_lowest_stack,
+		   (void *)task->thread.lowest_stack);
+	return 0;
+}
+#endif /* CONFIG_STACKLEAK_METRICS */
+
 /*
  * Thread groups
  */
@@ -2988,6 +2999,9 @@ static const struct pid_entry tgid_base_stuff[] = {
 #ifdef CONFIG_LIVEPATCH
 	ONE("patch_state",  S_IRUSR, proc_pid_patch_state),
 #endif
+#ifdef CONFIG_STACKLEAK_METRICS
+	ONE("lowest_stack", S_IRUGO, proc_lowest_stack),
+#endif
 };
 
 static int proc_tgid_base_readdir(struct file *file, struct dir_context *ctx)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [kernel-hardening] [PATCH RFC v5 5/5] doc: self-protection: Add information about STACKLEAK feature
  2017-10-22  0:22 [kernel-hardening] [PATCH RFC v5 0/5] Introduce the STACKLEAK feature and a test for it Alexander Popov
                   ` (3 preceding siblings ...)
  2017-10-22  0:22 ` [kernel-hardening] [PATCH RFC v5 4/5] fs/proc: Show STACKLEAK metrics in the /proc file system Alexander Popov
@ 2017-10-22  0:22 ` Alexander Popov
  2017-10-22 13:11 ` [kernel-hardening] Re: [PATCH RFC v5 0/5] Introduce the STACKLEAK feature and a test for it Peter Zijlstra
  5 siblings, 0 replies; 24+ messages in thread
From: Alexander Popov @ 2017-10-22  0:22 UTC (permalink / raw)
  To: kernel-hardening, keescook, pageexec, spender, Ingo Molnar,
	Andy Lutomirski, tycho, Laura Abbott, Mark Rutland,
	Ard Biesheuvel, Borislav Petkov, Thomas Gleixner,
	H . Peter Anvin, Peter Zijlstra, x86, alex.popov

Add information about STACKLEAK feature to "Stack depth overflow" and
"Memory poisoning" sections of self-protection.rst.

Signed-off-by: Alexander Popov <alex.popov@linux.com>
---
 Documentation/security/self-protection.rst | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/Documentation/security/self-protection.rst b/Documentation/security/self-protection.rst
index 60c8bd8..9693a90 100644
--- a/Documentation/security/self-protection.rst
+++ b/Documentation/security/self-protection.rst
@@ -165,10 +165,15 @@ Stack depth overflow
 A less well understood attack is using a bug that triggers the
 kernel to consume stack memory with deep function calls or large stack
 allocations. With this attack it is possible to write beyond the end of
-the kernel's preallocated stack space and into sensitive structures. Two
-important changes need to be made for better protections: moving the
-sensitive thread_info structure elsewhere, and adding a faulting memory
-hole at the bottom of the stack to catch these overflows.
+the kernel's preallocated stack space and into sensitive structures.
+The combination of the following measures gives better protection:
+
+* moving the sensitive thread_info structure off the stack
+  (``CONFIG_THREAD_INFO_IN_TASK``);
+* adding a faulting memory hole at the bottom of the stack to catch
+  these overflows (``CONFIG_VMAP_STACK``);
+* runtime checking that alloca() calls don't overstep the stack boundary
+  (``CONFIG_GCC_PLUGIN_STACKLEAK``).
 
 Heap memory integrity
 ---------------------
@@ -287,11 +292,11 @@ sure structure holes are cleared.
 Memory poisoning
 ----------------
 
-When releasing memory, it is best to poison the contents (clear stack on
-syscall return, wipe heap memory on a free), to avoid reuse attacks that
-rely on the old contents of memory. This frustrates many uninitialized
-variable attacks, stack content exposures, heap content exposures, and
-use-after-free attacks.
+When releasing memory, it is best to poison the contents, to avoid reuse
+attacks that rely on the old contents of memory. E.g., clear stack on a
+syscall return (``CONFIG_GCC_PLUGIN_STACKLEAK``), wipe heap memory on a
+free. This frustrates many uninitialized variable attacks, stack content
+exposures, heap content exposures, and use-after-free attacks.
 
 Destination tracking
 --------------------
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [kernel-hardening] Re: [PATCH RFC v5 0/5] Introduce the STACKLEAK feature and a test for it
  2017-10-22  0:22 [kernel-hardening] [PATCH RFC v5 0/5] Introduce the STACKLEAK feature and a test for it Alexander Popov
                   ` (4 preceding siblings ...)
  2017-10-22  0:22 ` [kernel-hardening] [PATCH RFC v5 5/5] doc: self-protection: Add information about STACKLEAK feature Alexander Popov
@ 2017-10-22 13:11 ` Peter Zijlstra
  2017-10-23  9:08   ` Mark Rutland
  2017-10-23 11:21   ` Alexander Popov
  5 siblings, 2 replies; 24+ messages in thread
From: Peter Zijlstra @ 2017-10-22 13:11 UTC (permalink / raw)
  To: Alexander Popov
  Cc: kernel-hardening, keescook, pageexec, spender, Ingo Molnar,
	Andy Lutomirski, tycho, Laura Abbott, Mark Rutland,
	Ard Biesheuvel, Borislav Petkov, Thomas Gleixner,
	H . Peter Anvin, x86

On Sun, Oct 22, 2017 at 03:22:48AM +0300, Alexander Popov wrote:
 
> These numbers were obtained for the 4th version of the patch series.
> 
> Size of vmlinux (x86_64_defconfig):
>  file size:
>   - STACKLEAK disabled: 35014784 bytes
>   - STACKLEAK enabled: 35044952 bytes (+0.086%)
>  .text section size (calculated by size utility):
>   - STACKLEAK disabled: 10752983
>   - STACKLEAK enabled: 11062221 (+2.876%)

Why no runtime costs? This should not be hard to measure.

> Further work
> =============
> 
>  - Rewrite erase_kstack() in C (if Ingo Molnar insists).

Aside from legacy, is there any sane reason that stuff is in ASM? That
is, I too will insist it being in C unless you can provide good
arguments on why it needs be asm. All it does is memzero() right? And I
would think architectures already have fairly optimized implementations
of that around.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [kernel-hardening] Re: [PATCH RFC v5 0/5] Introduce the STACKLEAK feature and a test for it
  2017-10-22 13:11 ` [kernel-hardening] Re: [PATCH RFC v5 0/5] Introduce the STACKLEAK feature and a test for it Peter Zijlstra
@ 2017-10-23  9:08   ` Mark Rutland
  2017-10-23 12:11     ` Alexander Popov
  2017-10-23 11:21   ` Alexander Popov
  1 sibling, 1 reply; 24+ messages in thread
From: Mark Rutland @ 2017-10-23  9:08 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Alexander Popov, kernel-hardening, keescook, pageexec, spender,
	Ingo Molnar, Andy Lutomirski, tycho, Laura Abbott,
	Ard Biesheuvel, Borislav Petkov, Thomas Gleixner,
	H . Peter Anvin, x86

On Sun, Oct 22, 2017 at 03:11:57PM +0200, Peter Zijlstra wrote:
> On Sun, Oct 22, 2017 at 03:22:48AM +0300, Alexander Popov wrote:
> > Further work
> > =============
> > 
> >  - Rewrite erase_kstack() in C (if Ingo Molnar insists).
> 
> Aside from legacy, is there any sane reason that stuff is in ASM? That
> is, I too will insist it being in C unless you can provide good
> arguments on why it needs be asm. All it does is memzero() right? And I
> would think architectures already have fairly optimized implementations
> of that around.

One argument for this being in asm is that you can't control the stack
usage of a C implementation of erase_kstack(), which leaves a (small)
portion of the stack potentially not cleared depending on what exactly
the compiler does with the stack.

If it's just a call to memzero(), the asm should be very simple, and I'd
think it's better to err on the side of clearing as much of the stack as
possible.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [kernel-hardening] Re: [PATCH RFC v5 0/5] Introduce the STACKLEAK feature and a test for it
  2017-10-22 13:11 ` [kernel-hardening] Re: [PATCH RFC v5 0/5] Introduce the STACKLEAK feature and a test for it Peter Zijlstra
  2017-10-23  9:08   ` Mark Rutland
@ 2017-10-23 11:21   ` Alexander Popov
  1 sibling, 0 replies; 24+ messages in thread
From: Alexander Popov @ 2017-10-23 11:21 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: kernel-hardening, keescook, pageexec, spender, Ingo Molnar,
	Andy Lutomirski, tycho, Laura Abbott, Mark Rutland,
	Ard Biesheuvel, Borislav Petkov, Thomas Gleixner,
	H . Peter Anvin, x86

On 22.10.2017 16:11, Peter Zijlstra wrote:
> On Sun, Oct 22, 2017 at 03:22:48AM +0300, Alexander Popov wrote:
>  
>> These numbers were obtained for the 4th version of the patch series.
>>
>> Size of vmlinux (x86_64_defconfig):
>>  file size:
>>   - STACKLEAK disabled: 35014784 bytes
>>   - STACKLEAK enabled: 35044952 bytes (+0.086%)
>>  .text section size (calculated by size utility):
>>   - STACKLEAK disabled: 10752983
>>   - STACKLEAK enabled: 11062221 (+2.876%)
> 
> Why no runtime costs? This should not be hard to measure.

Hello Peter,

Here are the results of a brief performance test on x86_64 (for the 2nd version
of the patch). The numbers are very different because the STACKLEAK performance
penalty depends on the workloads.

Hardware: Intel Core i7-4770, 16 GB RAM

Test #1: hackbench -s 4096 -l 2000 -g 15 -f 25 -P
Average result on 4.11-rc8: 8.71s
Average result on 4.11-rc8+stackleak: 9.08s (+4.29%)

Test #2: building the Linux kernel with Ubuntu config (time make -j9)
Result on 4.11-rc8:
  real	32m14.893s
  user	237m30.886s
  sys	11m12.273s
Result on 4.11-rc8+stackleak:
  real	32m26.881s (+0.62%)
  user	238m38.926s (+0.48%)
  sys	11m36.426s (+3.59%)

Test #3: stress-ng --cpu 4 --io 2 --vm 2 --vm-bytes 2G --timeout 300s
 --metrics-brief
Result on 4.11-rc8:
  cpu bogo ops 269955
  iosync bogo ops 9809985
  vm bogo ops 17093608
Result on 4.11-rc8+stackleak:
  cpu bogo ops 270106 (+0.6%)
  iosync bogo ops 9474535 (-3.42%)
  vm bogo ops 17093608 (the same)

So the STACKLEAK description in Kconfig includes:
"The tradeoff is the performance impact: on a single CPU system kernel
compilation sees a 1% slowdown, other systems and workloads may vary and you are
advised to test this feature on your expected workload before deploying it".

Best regards,
Alexander

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [kernel-hardening] Re: [PATCH RFC v5 0/5] Introduce the STACKLEAK feature and a test for it
  2017-10-23  9:08   ` Mark Rutland
@ 2017-10-23 12:11     ` Alexander Popov
  0 siblings, 0 replies; 24+ messages in thread
From: Alexander Popov @ 2017-10-23 12:11 UTC (permalink / raw)
  To: Mark Rutland, Peter Zijlstra
  Cc: kernel-hardening, keescook, pageexec, spender, Ingo Molnar,
	Andy Lutomirski, tycho, Laura Abbott, Ard Biesheuvel,
	Borislav Petkov, Thomas Gleixner, H . Peter Anvin, x86

Hello Peter and Mark,

On 23.10.2017 12:08, Mark Rutland wrote:
> On Sun, Oct 22, 2017 at 03:11:57PM +0200, Peter Zijlstra wrote:
>> On Sun, Oct 22, 2017 at 03:22:48AM +0300, Alexander Popov wrote:
>>> Further work
>>> =============
>>>
>>>  - Rewrite erase_kstack() in C (if Ingo Molnar insists).
>>
>> Aside from legacy, is there any sane reason that stuff is in ASM? That
>> is, I too will insist it being in C unless you can provide good
>> arguments on why it needs be asm. All it does is memzero() right? And I
>> would think architectures already have fairly optimized implementations
>> of that around.

I added very detailed comments describing erase_kstack() on x86_64. It is quite
tricky for being fast.

It starts from the lowest_stack address and goes down to the bottom of the stack
searching for the poison value (-0xBEEF). If it reaches the stack bottom or
finds 17 poison values in a row, it starts to fill the memory between the
current position and the stack pointer with a poison value. Finally it resets
the lowest_stack value. So only used part of the kernel stack is actually erased.

I really like the PaX Team's code in erase_kstack().

> One argument for this being in asm is that you can't control the stack
> usage of a C implementation of erase_kstack(), which leaves a (small)
> portion of the stack potentially not cleared depending on what exactly
> the compiler does with the stack.
> 
> If it's just a call to memzero(), the asm should be very simple, and I'd
> think it's better to err on the side of clearing as much of the stack as
> possible.

Laura Abbott wrote about her issues with erase_kstack() and gcc in the previous
thread: http://www.openwall.com/lists/kernel-hardening/2017/10/10/3

Anyway, I'll obey the final decision.

Best regards,
Alexander

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [kernel-hardening] Re: [PATCH RFC v5 1/5] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
  2017-10-22  0:22 ` [kernel-hardening] [PATCH RFC v5 1/5] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls Alexander Popov
@ 2017-10-23 13:17   ` Tycho Andersen
  2017-10-24 21:30     ` Alexander Popov
  0 siblings, 1 reply; 24+ messages in thread
From: Tycho Andersen @ 2017-10-23 13:17 UTC (permalink / raw)
  To: Alexander Popov
  Cc: kernel-hardening, keescook, pageexec, spender, Ingo Molnar,
	Andy Lutomirski, Laura Abbott, Mark Rutland, Ard Biesheuvel,
	Borislav Petkov, Thomas Gleixner, H . Peter Anvin,
	Peter Zijlstra, x86

On Sun, Oct 22, 2017 at 03:22:49AM +0300, Alexander Popov wrote:
> The STACKLEAK feature erases the kernel stack before returning from
> syscalls. That reduces the information which kernel stack leak bugs can
> reveal and blocks some uninitialized stack variable attacks. Moreover,
> STACKLEAK provides runtime checks for kernel stack overflow detection.
> 
> This commit introduces the architecture-specific code filling the used
> part of the kernel stack with a poison value before returning to the
> userspace. Full STACKLEAK feature also contains the gcc plugin which
> comes in a separate commit.
> 
> The STACKLEAK feature is ported from grsecurity/PaX. More information at:
>   https://grsecurity.net/
>   https://pax.grsecurity.net/
> 
> This code is modified from Brad Spengler/PaX Team's code in the last
> public patch of grsecurity/PaX based on our understanding of the code.
> Changes or omissions from the original code are ours and don't reflect
> the original grsecurity/PaX code.
> 
> Signed-off-by: Alexander Popov <alex.popov@linux.com>
> ---
>  arch/Kconfig                     | 27 ++++++++++++
>  arch/x86/Kconfig                 |  1 +
>  arch/x86/entry/common.c          | 17 +++++--
>  arch/x86/entry/entry_32.S        | 69 +++++++++++++++++++++++++++++
>  arch/x86/entry/entry_64.S        | 95 ++++++++++++++++++++++++++++++++++++++++
>  arch/x86/entry/entry_64_compat.S |  8 ++++
>  arch/x86/include/asm/processor.h |  4 ++
>  arch/x86/kernel/asm-offsets.c    |  9 ++++
>  arch/x86/kernel/process_32.c     |  5 +++
>  arch/x86/kernel/process_64.c     |  5 +++
>  include/linux/compiler.h         |  4 ++
>  11 files changed, 241 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/Kconfig b/arch/Kconfig
> index d789a89..e9ec94c 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -386,6 +386,13 @@ config SECCOMP_FILTER
>  
>  	  See Documentation/prctl/seccomp_filter.txt for details.
>  
> +config HAVE_ARCH_STACKLEAK
> +	bool
> +	help
> +	  An architecture should select this if it has the code which
> +	  fills the used part of the kernel stack with the STACKLEAK_POISON
> +	  value before returning from system calls.
> +
>  config HAVE_GCC_PLUGINS
>  	bool
>  	help
> @@ -516,6 +523,26 @@ config GCC_PLUGIN_RANDSTRUCT_PERFORMANCE
>  	  in structures.  This reduces the performance hit of RANDSTRUCT
>  	  at the cost of weakened randomization.
>  
> +config GCC_PLUGIN_STACKLEAK
> +	bool "Erase the kernel stack before returning from syscalls"
> +	depends on GCC_PLUGINS
> +	depends on HAVE_ARCH_STACKLEAK
> +	help
> +	  This option makes the kernel erase the kernel stack before it
> +	  returns from a system call. That reduces the information which
> +	  kernel stack leak bugs can reveal and blocks some uninitialized
> +	  stack variable attacks. This option also provides runtime checks
> +	  for kernel stack overflow detection.
> +
> +	  The tradeoff is the performance impact: on a single CPU system kernel
> +	  compilation sees a 1% slowdown, other systems and workloads may vary
> +	  and you are advised to test this feature on your expected workload
> +	  before deploying it.
> +
> +	  This plugin was ported from grsecurity/PaX. More information at:
> +	   * https://grsecurity.net/
> +	   * https://pax.grsecurity.net/
> +
>  config HAVE_CC_STACKPROTECTOR
>  	bool
>  	help
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 971feac..b7da58f 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -114,6 +114,7 @@ config X86
>  	select HAVE_ARCH_MMAP_RND_COMPAT_BITS	if MMU && COMPAT
>  	select HAVE_ARCH_COMPAT_MMAP_BASES	if MMU && COMPAT
>  	select HAVE_ARCH_SECCOMP_FILTER
> +	select HAVE_ARCH_STACKLEAK
>  	select HAVE_ARCH_TRACEHOOK
>  	select HAVE_ARCH_TRANSPARENT_HUGEPAGE
>  	select HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD if X86_64
> diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
> index 03505ff..075487e 100644
> --- a/arch/x86/entry/common.c
> +++ b/arch/x86/entry/common.c
> @@ -45,6 +45,12 @@ __visible inline void enter_from_user_mode(void)
>  static inline void enter_from_user_mode(void) {}
>  #endif
>  
> +#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
> +asmlinkage void erase_kstack(void);
> +#else
> +static void erase_kstack(void) {}
> +#endif
> +
>  static void do_audit_syscall_entry(struct pt_regs *regs, u32 arch)
>  {
>  #ifdef CONFIG_X86_64
> @@ -81,8 +87,10 @@ static long syscall_trace_enter(struct pt_regs *regs)
>  		emulated = true;
>  
>  	if ((emulated || (work & _TIF_SYSCALL_TRACE)) &&
> -	    tracehook_report_syscall_entry(regs))
> +	    tracehook_report_syscall_entry(regs)) {
> +		erase_kstack();
>  		return -1L;
> +	}
>  
>  	if (emulated)
>  		return -1L;
> @@ -116,9 +124,11 @@ static long syscall_trace_enter(struct pt_regs *regs)
>  			sd.args[5] = regs->bp;
>  		}
>  
> -		ret = __secure_computing(&sd);
> -		if (ret == -1)
> +		ret = secure_computing(&sd);

Is there any reason to switch this from the __ version? This basically
adds an additional check on the TIF_SECCOMP flag, but I'm not sure
that's intentional with this patch.

Cheers,

Tycho

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [kernel-hardening] Re: [PATCH RFC v5 1/5] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
  2017-10-23 13:17   ` [kernel-hardening] " Tycho Andersen
@ 2017-10-24 21:30     ` Alexander Popov
  2017-10-31 15:20       ` Kees Cook
  0 siblings, 1 reply; 24+ messages in thread
From: Alexander Popov @ 2017-10-24 21:30 UTC (permalink / raw)
  To: Tycho Andersen, keescook, Andy Lutomirski
  Cc: kernel-hardening, pageexec, spender, Ingo Molnar, Laura Abbott,
	Mark Rutland, Ard Biesheuvel, Borislav Petkov, Thomas Gleixner,
	H . Peter Anvin, Peter Zijlstra, x86

On 23.10.2017 16:17, Tycho Andersen wrote:
> On Sun, Oct 22, 2017 at 03:22:49AM +0300, Alexander Popov wrote:
>> The STACKLEAK feature erases the kernel stack before returning from
>> syscalls. That reduces the information which kernel stack leak bugs can
>> reveal and blocks some uninitialized stack variable attacks. Moreover,
>> STACKLEAK provides runtime checks for kernel stack overflow detection.
>>
>> This commit introduces the architecture-specific code filling the used
>> part of the kernel stack with a poison value before returning to the
>> userspace. Full STACKLEAK feature also contains the gcc plugin which
>> comes in a separate commit.
>>
>> The STACKLEAK feature is ported from grsecurity/PaX. More information at:
>>   https://grsecurity.net/
>>   https://pax.grsecurity.net/
>>
>> This code is modified from Brad Spengler/PaX Team's code in the last
>> public patch of grsecurity/PaX based on our understanding of the code.
>> Changes or omissions from the original code are ours and don't reflect
>> the original grsecurity/PaX code.
>>
>> Signed-off-by: Alexander Popov <alex.popov@linux.com>
>> ---

[...]

>> @@ -81,8 +87,10 @@ static long syscall_trace_enter(struct pt_regs *regs)
>>  		emulated = true;
>>  
>>  	if ((emulated || (work & _TIF_SYSCALL_TRACE)) &&
>> -	    tracehook_report_syscall_entry(regs))
>> +	    tracehook_report_syscall_entry(regs)) {
>> +		erase_kstack();
>>  		return -1L;
>> +	}
>>  
>>  	if (emulated)
>>  		return -1L;
>> @@ -116,9 +124,11 @@ static long syscall_trace_enter(struct pt_regs *regs)
>>  			sd.args[5] = regs->bp;
>>  		}
>>  
>> -		ret = __secure_computing(&sd);
>> -		if (ret == -1)
>> +		ret = secure_computing(&sd);
> 
> Is there any reason to switch this from the __ version? This basically
> adds an additional check on the TIF_SECCOMP flag, but I'm not sure
> that's intentional with this patch.

Hello Tycho, thanks for your remark!

Initially I took this change from the grsecurity patch, because it looked
reasonable for me at that time. But now I doubt, thank you.

Kees and Andy (Lutomirski), you are the authors of syscall_trace_enter(). Could
you please have a look at this change?

By the way, it seems that one erase_kstack() call is missing in that function.
Could you please have a glance at the places where erase_kstack() is called?

Thanks in advance.
Best regards,
Alexander

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [kernel-hardening] Re: [PATCH RFC v5 2/5] gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack
  2017-10-22  0:22 ` [kernel-hardening] [PATCH RFC v5 2/5] gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack Alexander Popov
@ 2017-10-30 16:51   ` Alexander Popov
  2017-10-30 17:32     ` Peter Zijlstra
  0 siblings, 1 reply; 24+ messages in thread
From: Alexander Popov @ 2017-10-30 16:51 UTC (permalink / raw)
  To: kernel-hardening, keescook, pageexec, spender, Ingo Molnar,
	Andy Lutomirski, tycho, Laura Abbott, Mark Rutland,
	Ard Biesheuvel, Borislav Petkov, Thomas Gleixner,
	H . Peter Anvin, Peter Zijlstra, x86

On 22.10.2017 03:22, Alexander Popov wrote:
> The STACKLEAK feature erases the kernel stack before returning from
> syscalls. That reduces the information which kernel stack leak bugs can
> reveal and blocks some uninitialized stack variable attacks. Moreover,
> STACKLEAK provides runtime checks for kernel stack overflow detection.
> 
> This commit introduces the STACKLEAK gcc plugin. It is needed for:
>  - tracking the lowest border of the kernel stack, which is important
>     for the code erasing the used part of the kernel stack at the end
>     of syscalls (comes in a separate commit);
>  - checking that alloca calls don't cause stack overflow.
> 
> So this plugin instruments the kernel code inserting:
>  - the check_alloca() call before alloca and the track_stack() call
>     after it;
>  - the track_stack() call for the functions with a stack frame size
>     greater than or equal to CONFIG_STACKLEAK_TRACK_MIN_SIZE.
> 
> The STACKLEAK feature is ported from grsecurity/PaX. More information at:
>   https://grsecurity.net/
>   https://pax.grsecurity.net/
> 
> This code is modified from Brad Spengler/PaX Team's code in the last
> public patch of grsecurity/PaX based on our understanding of the code.
> Changes or omissions from the original code are ours and don't reflect
> the original grsecurity/PaX code.
> 
> Signed-off-by: Alexander Popov <alex.popov@linux.com>
> ---
>  arch/Kconfig                           |  12 +
>  arch/x86/kernel/dumpstack.c            |  15 ++
>  fs/exec.c                              |  30 +++
>  scripts/Makefile.gcc-plugins           |   3 +
>  scripts/gcc-plugins/stackleak_plugin.c | 470 +++++++++++++++++++++++++++++++++
>  5 files changed, 530 insertions(+)
>  create mode 100644 scripts/gcc-plugins/stackleak_plugin.c
> 

[...]

> diff --git a/fs/exec.c b/fs/exec.c
> index 3e14ba2..481ef4b 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1958,3 +1958,33 @@ COMPAT_SYSCALL_DEFINE5(execveat, int, fd,
>  				  argv, envp, flags);
>  }
>  #endif
> +
> +#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
> +void __used track_stack(void)
> +{
> +	/*
> +	 * N.B. The arch-specific part of the STACKLEAK feature fills the
> +	 * kernel stack with the poison value, which has the register width.
> +	 * That code assumes that the value of thread.lowest_stack is aligned
> +	 * on the register width boundary.
> +	 *
> +	 * That is true for x86 and x86_64 because of the kernel stack
> +	 * alignment on these platforms (for details, see cc_stack_align in
> +	 * arch/x86/Makefile). Take care of that when you port STACKLEAK to
> +	 * new platforms.
> +	 */
> +	unsigned long sp = (unsigned long)&sp;
> +
> +	if (sp < current->thread.lowest_stack &&
> +	    sp >= (unsigned long)task_stack_page(current) +
> +					2 * sizeof(unsigned long)) {
> +		current->thread.lowest_stack = sp;
> +	}
> +
> +#ifndef CONFIG_VMAP_STACK
> +	if (unlikely((sp & (THREAD_SIZE - 1)) < (THREAD_SIZE / 16)))
> +		BUG();

Hello, I need your help! I'm trying to solve the problem with the recursive
BUG() here (currently on x86_64).

When the thread stack is exhausted, this BUG() is hit. But do_error_trap(),
which handles the exception, calls track_stack() itself again (since it is
instrumented by the gcc plugin). So this recursion proceeds with exhausting the
thread stack.

Finally the stack pointer oversteps the stack bottom and the bug is not hit
anymore, because (sp & (THREAD_SIZE - 1)) is big again. And we have such an oops
message printed:

[   17.529075] ------------[ cut here ]------------
[   17.529075] kernel BUG at fs/exec.c:1986!
[   17.529075] invalid opcode: 0000 [#1] SMP
[   17.529075] Dumping ftrace buffer:
[   17.529075]    (ftrace buffer empty)
[   17.529075] Modules linked in: lkdtm
[   17.529075] CPU: 0 PID: 2651 Comm: sh Not tainted 4.14.0-rc5+ #10
[   17.529075] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
Ubuntu-1.8.2-1ubuntu1 04/01/2014
[   17.529075] task: ffff880079950040 task.stack: ffff88007a904000
[   17.529075] RIP: 0010:track_stack+0x52/0x60
[   17.529075] RSP: 0018:ffff88007a904078 EFLAGS: 00010193
[   17.529075] RAX: 0000000000000078 RBX: 0000000000000006 RCX: ffff88007a904010
[   17.529075] RDX: ffff880079950040 RSI: ffff88007a904000 RDI: ffff88007a904168
[   17.529075] RBP: ffff88007a904080 R08: 0000000000000004 R09: 000000000000018c
[   17.529075] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88007a904168
[   17.529075] R13: 0000000000000004 R14: ffffffff81bd33c9 R15: 0000000000000000
[   17.529075] FS:  00007f469977e700(0000) GS:ffff88007fc00000(0000)
knlGS:0000000000000000
[   17.529075] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   17.529075] CR2: 00000000021ab618 CR3: 000000007a404000 CR4: 00000000000006f0
[   17.529075] Call Trace:
[   17.529075]  do_error_trap+0x25/0xe0
[   17.529075]  do_invalid_op+0x2a/0x30
[   17.529075]  invalid_op+0x18/0x20
[   17.529075] RIP: 0010:track_stack+0x52/0x60
[   17.529075] RSP: 0018:ffff88007a904218 EFLAGS: 00010193
[   17.529075] RAX: 0000000000000218 RBX: 0000000000000006 RCX: ffff88007a904010
[   17.529075] RDX: ffff880079950040 RSI: ffff88007a904000 RDI: ffff88007a904308
[   17.529075] RBP: ffff88007a904220 R08: 0000000000000004 R09: 000000000000018c
[   17.529075] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88007a904308
[   17.529075] R13: 0000000000000004 R14: ffffffff81bd33c9 R15: 0000000000000000
[   17.529075]  do_error_trap+0x25/0xe0
[   17.529075]  do_invalid_op+0x2a/0x30
[   17.529075]  invalid_op+0x18/0x20
[   17.529075] RIP: 0010:track_stack+0x52/0x60
[   17.529075] RSP: 0018:ffff88007a9043b8 EFLAGS: 00010393
[   17.529075] RAX: 00000000000003b8 RBX: ffff88007a907d90 RCX: ffff88007a904010
[   17.529075] RDX: ffff880079950040 RSI: ffff88007a904000 RDI: ffff88007a907d90
[   17.529075] RBP: ffff88007a9043c0 R08: 0000000000000000 R09: 000000000000018c
[   17.529075] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000000003d0
[   17.529075] R13: ffffffffc0007020 R14: 0000000000000016 R15: 000000000000003d
[   17.529075]  recursion+0x14/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  ? put_dec+0x24/0xb0
[   17.529075]  ? put_dec+0x24/0xb0
[   17.529075]  ? number+0x2c5/0x2e0
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  ? vsnprintf+0xd1/0x4b0
[   17.529075]  ? wait_for_xmitr+0x32/0x90
[   17.529075]  ? serial8250_console_putchar+0x25/0x30
[   17.529075]  ? wait_for_xmitr+0x90/0x90
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  ? up+0x30/0x50
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  ? vprintk_emit+0x24a/0x2d0
[   17.529075]  recursion+0x58/0x70 [lkdtm]
[   17.529075]  ? vprintk_func+0x31/0x80
[   17.529075]  ? printk+0x4a/0x55
[   17.529075]  lkdtm_STACKLEAK_TRACK_STACK+0x39/0x4d [lkdtm]
[   17.529075]  lkdtm_do_action+0x18/0x20 [lkdtm]
[   17.529075]  direct_entry+0xcc/0x130 [lkdtm]
[   17.529075]  full_proxy_write+0x4f/0x90
[   17.529075]  __vfs_write+0x36/0x140
[   17.529075]  ? security_file_permission+0x36/0xb0
[   17.529075]  vfs_write+0xb1/0x1a0
[   17.529075]  SyS_write+0x46/0xa0
[   17.529075]  entry_SYSCALL_64_fastpath+0x1a/0xaa
[   17.529075] RIP: 0033:0x7f46992a1600
[   17.529075] RSP: 002b:00007fffe4466838 EFLAGS: 00000246 ORIG_RAX:
0000000000000001
[   17.529075] RAX: ffffffffffffffda RBX: 000000000000006b RCX: 00007f46992a1600
[   17.529075] RDX: 0000000000000016 RSI: 00000000021a9610 RDI: 0000000000000001
[   17.529075] RBP: 0000000000002710 R08: 0000000000000003 R09: 0000000000002010
[   17.529075] R10: 0000000000000871 R11: 0000000000000246 R12: 00007f469955eb58
[   17.529075] R13: 0000000000002010 R14: 00000000021a9600 R15: 00007f469955eb00
[   17.529075] Code: 8d 4e 10 48 39 c8 73 0f 25 ff 3f 00 00 48 3d ff 03 00 00 76
16 c9 c3 48 89 82 30 0a 00 00 25 ff 3f 00 00 48 3d ff 03 00 00 77 ea <0f> 0b 66
90 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 54 53
[   17.529075] RIP: track_stack+0x52/0x60 RSP: ffff88007a904078
[   17.529075] ---[ end trace 3f0d8f585f4dcc08 ]---


But printing this message spoils the memory below the exhausted stack, right?

Can we somehow handle this BUG() in another stack to avoid the recursion?

Thanks!

> +#endif /* !CONFIG_VMAP_STACK */
> +}
> +EXPORT_SYMBOL(track_stack);
> +#endif /* CONFIG_GCC_PLUGIN_STACKLEAK */

[...]

Best regards,
Alexander

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [kernel-hardening] Re: [PATCH RFC v5 2/5] gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack
  2017-10-30 16:51   ` [kernel-hardening] " Alexander Popov
@ 2017-10-30 17:32     ` Peter Zijlstra
  2017-10-30 18:06       ` Alexander Popov
  0 siblings, 1 reply; 24+ messages in thread
From: Peter Zijlstra @ 2017-10-30 17:32 UTC (permalink / raw)
  To: Alexander Popov
  Cc: kernel-hardening, keescook, pageexec, spender, Ingo Molnar,
	Andy Lutomirski, tycho, Laura Abbott, Mark Rutland,
	Ard Biesheuvel, Borislav Petkov, Thomas Gleixner,
	H . Peter Anvin, x86

On Mon, Oct 30, 2017 at 07:51:33PM +0300, Alexander Popov wrote:
> When the thread stack is exhausted, this BUG() is hit. But do_error_trap(),
> which handles the exception, calls track_stack() itself again (since it is
> instrumented by the gcc plugin). So this recursion proceeds with exhausting the
> thread stack.

Add a __attribute__((nostacktrack)) on it?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [kernel-hardening] Re: [PATCH RFC v5 2/5] gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack
  2017-10-30 17:32     ` Peter Zijlstra
@ 2017-10-30 18:06       ` Alexander Popov
  2017-11-14 15:36         ` Alexander Popov
  0 siblings, 1 reply; 24+ messages in thread
From: Alexander Popov @ 2017-10-30 18:06 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: kernel-hardening, keescook, pageexec, spender, Ingo Molnar,
	Andy Lutomirski, tycho, Laura Abbott, Mark Rutland,
	Ard Biesheuvel, Borislav Petkov, Thomas Gleixner,
	H . Peter Anvin, x86

Hello Peter,

Thanks for your reply.

On 30.10.2017 20:32, Peter Zijlstra wrote:
> On Mon, Oct 30, 2017 at 07:51:33PM +0300, Alexander Popov wrote:
>> When the thread stack is exhausted, this BUG() is hit. But do_error_trap(),
>> which handles the exception, calls track_stack() itself again (since it is
>> instrumented by the gcc plugin). So this recursion proceeds with exhausting the
>> thread stack.
> 
> Add a __attribute__((nostacktrack)) on it?

Yes, I already tried some blacklisting in the plugin, but it didn't really help,
because:

1. there are other (more than 5) instrumented functions, that are called during
BUG() handling too;

2. decreasing CONFIG_STACKLEAK_TRACK_MIN_SIZE would add more instrumented
functions, which should be manually blacklisted (not good).

I guess handling BUG() in another stack would be a solution. For example, Andy
Lutomirski calls handle_stack_overflow in the DOUBLEFAULT_STACK
(arch/x86/mm/fault.c). Should I do something similar?

Thanks!

Best regards,
Alexander

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [kernel-hardening] Re: [PATCH RFC v5 1/5] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
  2017-10-24 21:30     ` Alexander Popov
@ 2017-10-31 15:20       ` Kees Cook
  2017-11-10 16:59         ` Alexander Popov
  0 siblings, 1 reply; 24+ messages in thread
From: Kees Cook @ 2017-10-31 15:20 UTC (permalink / raw)
  To: Alexander Popov
  Cc: Tycho Andersen, Andy Lutomirski, kernel-hardening, PaX Team,
	Brad Spengler, Ingo Molnar, Laura Abbott, Mark Rutland,
	Ard Biesheuvel, Borislav Petkov, Thomas Gleixner,
	H . Peter Anvin, Peter Zijlstra, x86

On Tue, Oct 24, 2017 at 2:30 PM, Alexander Popov <alex.popov@linux.com> wrote:
> On 23.10.2017 16:17, Tycho Andersen wrote:
>> On Sun, Oct 22, 2017 at 03:22:49AM +0300, Alexander Popov wrote:
>>> The STACKLEAK feature erases the kernel stack before returning from
>>> syscalls. That reduces the information which kernel stack leak bugs can
>>> reveal and blocks some uninitialized stack variable attacks. Moreover,
>>> STACKLEAK provides runtime checks for kernel stack overflow detection.
>>>
>>> This commit introduces the architecture-specific code filling the used
>>> part of the kernel stack with a poison value before returning to the
>>> userspace. Full STACKLEAK feature also contains the gcc plugin which
>>> comes in a separate commit.
>>>
>>> The STACKLEAK feature is ported from grsecurity/PaX. More information at:
>>>   https://grsecurity.net/
>>>   https://pax.grsecurity.net/
>>>
>>> This code is modified from Brad Spengler/PaX Team's code in the last
>>> public patch of grsecurity/PaX based on our understanding of the code.
>>> Changes or omissions from the original code are ours and don't reflect
>>> the original grsecurity/PaX code.
>>>
>>> Signed-off-by: Alexander Popov <alex.popov@linux.com>
>>> ---
>
> [...]
>
>>> @@ -81,8 +87,10 @@ static long syscall_trace_enter(struct pt_regs *regs)
>>>              emulated = true;
>>>
>>>      if ((emulated || (work & _TIF_SYSCALL_TRACE)) &&
>>> -        tracehook_report_syscall_entry(regs))
>>> +        tracehook_report_syscall_entry(regs)) {
>>> +            erase_kstack();
>>>              return -1L;
>>> +    }
>>>
>>>      if (emulated)
>>>              return -1L;
>>> @@ -116,9 +124,11 @@ static long syscall_trace_enter(struct pt_regs *regs)
>>>                      sd.args[5] = regs->bp;
>>>              }
>>>
>>> -            ret = __secure_computing(&sd);
>>> -            if (ret == -1)
>>> +            ret = secure_computing(&sd);
>>
>> Is there any reason to switch this from the __ version? This basically
>> adds an additional check on the TIF_SECCOMP flag, but I'm not sure
>> that's intentional with this patch.
>
> Hello Tycho, thanks for your remark!
>
> Initially I took this change from the grsecurity patch, because it looked
> reasonable for me at that time. But now I doubt, thank you.

Yeah, I'd prefer this stay __secure_computing().

> Kees and Andy (Lutomirski), you are the authors of syscall_trace_enter(). Could
> you please have a look at this change?
>
> By the way, it seems that one erase_kstack() call is missing in that function.
> Could you please have a glance at the places where erase_kstack() is called?

Errr, wouldn't erase_kstack() get called outside of seccomp? (i.e. via
syscall_return_slowpath() or something later in the return path?)

Or is there some reason for erasing the stack after seccomp processing
but before running the syscall?

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [kernel-hardening] Re: [PATCH RFC v5 1/5] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
  2017-10-31 15:20       ` Kees Cook
@ 2017-11-10 16:59         ` Alexander Popov
  0 siblings, 0 replies; 24+ messages in thread
From: Alexander Popov @ 2017-11-10 16:59 UTC (permalink / raw)
  To: Kees Cook
  Cc: Tycho Andersen, Andy Lutomirski, kernel-hardening, PaX Team,
	Brad Spengler, Ingo Molnar, Laura Abbott, Mark Rutland,
	Ard Biesheuvel, Borislav Petkov, Thomas Gleixner,
	H . Peter Anvin, Peter Zijlstra, x86

Hello Kees,

On 31.10.2017 18:20, Kees Cook wrote:
> On Tue, Oct 24, 2017 at 2:30 PM, Alexander Popov <alex.popov@linux.com> wrote:
>> On 23.10.2017 16:17, Tycho Andersen wrote:
>>> On Sun, Oct 22, 2017 at 03:22:49AM +0300, Alexander Popov wrote:
>>>> The STACKLEAK feature erases the kernel stack before returning from
>>>> syscalls. That reduces the information which kernel stack leak bugs can
>>>> reveal and blocks some uninitialized stack variable attacks. Moreover,
>>>> STACKLEAK provides runtime checks for kernel stack overflow detection.
>>>>
>>>> This commit introduces the architecture-specific code filling the used
>>>> part of the kernel stack with a poison value before returning to the
>>>> userspace. Full STACKLEAK feature also contains the gcc plugin which
>>>> comes in a separate commit.
>>>>
>>>> The STACKLEAK feature is ported from grsecurity/PaX. More information at:
>>>>   https://grsecurity.net/
>>>>   https://pax.grsecurity.net/
>>>>
>>>> This code is modified from Brad Spengler/PaX Team's code in the last
>>>> public patch of grsecurity/PaX based on our understanding of the code.
>>>> Changes or omissions from the original code are ours and don't reflect
>>>> the original grsecurity/PaX code.
>>>>
>>>> Signed-off-by: Alexander Popov <alex.popov@linux.com>
>>>> ---
>>
>> [...]
>>
>>>> @@ -81,8 +87,10 @@ static long syscall_trace_enter(struct pt_regs *regs)
>>>>              emulated = true;
>>>>
>>>>      if ((emulated || (work & _TIF_SYSCALL_TRACE)) &&
>>>> -        tracehook_report_syscall_entry(regs))
>>>> +        tracehook_report_syscall_entry(regs)) {
>>>> +            erase_kstack();
>>>>              return -1L;
>>>> +    }
>>>>
>>>>      if (emulated)
>>>>              return -1L;
>>>> @@ -116,9 +124,11 @@ static long syscall_trace_enter(struct pt_regs *regs)
>>>>                      sd.args[5] = regs->bp;
>>>>              }
>>>>
>>>> -            ret = __secure_computing(&sd);
>>>> -            if (ret == -1)
>>>> +            ret = secure_computing(&sd);
>>>
>>> Is there any reason to switch this from the __ version? This basically
>>> adds an additional check on the TIF_SECCOMP flag, but I'm not sure
>>> that's intentional with this patch.
>>
>> Hello Tycho, thanks for your remark!
>>
>> Initially I took this change from the grsecurity patch, because it looked
>> reasonable for me at that time. But now I doubt, thank you.
> 
> Yeah, I'd prefer this stay __secure_computing().

Ok.

>> Kees and Andy (Lutomirski), you are the authors of syscall_trace_enter(). Could
>> you please have a look at this change?
>>
>> By the way, it seems that one erase_kstack() call is missing in that function.
>> Could you please have a glance at the places where erase_kstack() is called?
> 
> Errr, wouldn't erase_kstack() get called outside of seccomp? (i.e. via
> syscall_return_slowpath() or something later in the return path?)

Yes, erase_kstack() is called later in entry_SYSCALL_64 just after do_syscall_64.

> Or is there some reason for erasing the stack after seccomp processing
> but before running the syscall?

Ah, it seems that PaX Team is doing that intentionally. I guess I've found the
proof in his slides for H2HC conference (slide 38):
https://pax.grsecurity.net/docs/PaXTeam-H2HC13-PaX-gcc-plugins.pdf

"- Special paths for ptrace/auditing
-- Low-level kernel entry/exit paths can diverge for ptrace/auditing and leave
interesting information on the stack for the actual syscall code"

Do you know which kind of interesting information is mentioned?

And again, erase_kstack() is not called before 1 of 4 return statements in
syscall_trace_enter(). IMHO it is a bug. Could you please verify that? Thanks!

Best regards,
Alexander

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [kernel-hardening] Re: [PATCH RFC v5 2/5] gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack
  2017-10-30 18:06       ` Alexander Popov
@ 2017-11-14 15:36         ` Alexander Popov
  2017-11-14 16:13           ` Andy Lutomirski
  0 siblings, 1 reply; 24+ messages in thread
From: Alexander Popov @ 2017-11-14 15:36 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: kernel-hardening, keescook, pageexec, spender, Ingo Molnar,
	Peter Zijlstra, tycho, Laura Abbott, Mark Rutland,
	Ard Biesheuvel, Borislav Petkov, Thomas Gleixner,
	H . Peter Anvin, x86

On 30.10.2017 21:06, Alexander Popov wrote:
> On 30.10.2017 20:32, Peter Zijlstra wrote:
>> On Mon, Oct 30, 2017 at 07:51:33PM +0300, Alexander Popov wrote:
>>> When the thread stack is exhausted, this BUG() is hit. But do_error_trap(),
>>> which handles the exception, calls track_stack() itself again (since it is
>>> instrumented by the gcc plugin). So this recursion proceeds with exhausting the
>>> thread stack.
>>
>> Add a __attribute__((nostacktrack)) on it?
> 
> Yes, I already tried some blacklisting in the plugin, but it didn't really help,
> because:
> 
> 1. there are other (more than 5) instrumented functions, that are called during
> BUG() handling too;
> 
> 2. decreasing CONFIG_STACKLEAK_TRACK_MIN_SIZE would add more instrumented
> functions, which should be manually blacklisted (not good).
> 
> I guess handling BUG() in another stack would be a solution. For example, Andy
> Lutomirski calls handle_stack_overflow in the DOUBLEFAULT_STACK
> (arch/x86/mm/fault.c). Should I do something similar?

Hello Andy! May I ask your advice?

When CONFIG_VMAP_STACK is disabled and STACKLEAK is enabled (for example, on
x86_32), we need another way to detect stack depth overflow. That is the reason
of having this BUG() in track_stack(). But it turns out to be recursive since
track_stack() will be called again during BUG() handling.

We can avoid that recursion by handling oops in another stack. It looks similar
to the way you call handle_stack_overflow() in arch/x86/mm/fault.c. But it seems
that I can't reuse that code, am I right?

How should I do it properly?

By the way, you wrote that you have some entry code changes which conflict with
STACKLEAK. May I ask for more details?

Best regards,
Alexander

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [kernel-hardening] Re: [PATCH RFC v5 2/5] gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack
  2017-11-14 15:36         ` Alexander Popov
@ 2017-11-14 16:13           ` Andy Lutomirski
  2017-11-14 16:33             ` Mark Rutland
  2017-11-14 21:50             ` Alexander Popov
  0 siblings, 2 replies; 24+ messages in thread
From: Andy Lutomirski @ 2017-11-14 16:13 UTC (permalink / raw)
  To: Alexander Popov
  Cc: Andy Lutomirski, kernel-hardening, Kees Cook, PaX Team,
	Brad Spengler, Ingo Molnar, Peter Zijlstra, Tycho Andersen,
	Laura Abbott, Mark Rutland, Ard Biesheuvel, Borislav Petkov,
	Thomas Gleixner, H . Peter Anvin, X86 ML

On Tue, Nov 14, 2017 at 7:36 AM, Alexander Popov <alex.popov@linux.com> wrote:
> On 30.10.2017 21:06, Alexander Popov wrote:
>> On 30.10.2017 20:32, Peter Zijlstra wrote:
>>> On Mon, Oct 30, 2017 at 07:51:33PM +0300, Alexander Popov wrote:
>>>> When the thread stack is exhausted, this BUG() is hit. But do_error_trap(),
>>>> which handles the exception, calls track_stack() itself again (since it is
>>>> instrumented by the gcc plugin). So this recursion proceeds with exhausting the
>>>> thread stack.
>>>
>>> Add a __attribute__((nostacktrack)) on it?
>>
>> Yes, I already tried some blacklisting in the plugin, but it didn't really help,
>> because:
>>
>> 1. there are other (more than 5) instrumented functions, that are called during
>> BUG() handling too;
>>
>> 2. decreasing CONFIG_STACKLEAK_TRACK_MIN_SIZE would add more instrumented
>> functions, which should be manually blacklisted (not good).
>>
>> I guess handling BUG() in another stack would be a solution. For example, Andy
>> Lutomirski calls handle_stack_overflow in the DOUBLEFAULT_STACK
>> (arch/x86/mm/fault.c). Should I do something similar?
>
> Hello Andy! May I ask your advice?
>
> When CONFIG_VMAP_STACK is disabled and STACKLEAK is enabled (for example, on
> x86_32), we need another way to detect stack depth overflow. That is the reason
> of having this BUG() in track_stack(). But it turns out to be recursive since
> track_stack() will be called again during BUG() handling.

What does the STEAKLACK plugin actually do?  I haven't followed this enough.

>
> We can avoid that recursion by handling oops in another stack. It looks similar
> to the way you call handle_stack_overflow() in arch/x86/mm/fault.c. But it seems
> that I can't reuse that code, am I right?

You'd probably have to make 32-bit compatible, which means making a
32-bit variant of this thingy:

                asm volatile ("movq %[stack], %%rsp\n\t"
                              "call handle_stack_overflow\n\t"
                              "1: jmp 1b"
                              : ASM_CALL_CONSTRAINT
                              : "D" ("kernel stack overflow (page fault)"),
                                "S" (regs), "d" (address),
                                [stack] "rm" (stack));

Or you could force a double-fault.

>
> How should I do it properly?
>
> By the way, you wrote that you have some entry code changes which conflict with
> STACKLEAK. May I ask for more details?

It's this thing:

https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/log/?h=x86/entry_stack.wip

and I'll probably drop the ".wip" from the name shortly.

>
> Best regards,
> Alexander

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [kernel-hardening] Re: [PATCH RFC v5 2/5] gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack
  2017-11-14 16:13           ` Andy Lutomirski
@ 2017-11-14 16:33             ` Mark Rutland
  2017-11-14 21:09               ` Alexander Popov
  2017-11-14 21:50             ` Alexander Popov
  1 sibling, 1 reply; 24+ messages in thread
From: Mark Rutland @ 2017-11-14 16:33 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Alexander Popov, kernel-hardening, Kees Cook, PaX Team,
	Brad Spengler, Ingo Molnar, Peter Zijlstra, Tycho Andersen,
	Laura Abbott, Ard Biesheuvel, Borislav Petkov, Thomas Gleixner,
	H . Peter Anvin, X86 ML

On Tue, Nov 14, 2017 at 08:13:43AM -0800, Andy Lutomirski wrote:
> What does the STEAKLACK plugin actually do?  I haven't followed this enough.

The plugin adds instrumentation to track the maximum stack depth, though only
functions with a sufficiently large stackframe are instrumented.

The basic idea is that this can then be used to lazily zero stack after use
(just before return to userspace), to minimize the risk of problems resulting
from uninitialized variables (either copied to userspace and leaking secrets,
or controlled by userspace and influencing the kernel).

That, and catching some stack overflows in the absence of VMAP'd stacks.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [kernel-hardening] Re: [PATCH RFC v5 2/5] gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack
  2017-11-14 16:33             ` Mark Rutland
@ 2017-11-14 21:09               ` Alexander Popov
  2017-11-14 21:17                 ` Andy Lutomirski
  0 siblings, 1 reply; 24+ messages in thread
From: Alexander Popov @ 2017-11-14 21:09 UTC (permalink / raw)
  To: Mark Rutland, Andy Lutomirski
  Cc: kernel-hardening, Kees Cook, PaX Team, Brad Spengler,
	Ingo Molnar, Peter Zijlstra, Tycho Andersen, Laura Abbott,
	Ard Biesheuvel, Borislav Petkov, Thomas Gleixner,
	H . Peter Anvin, X86 ML

Thanks, Mark!

Please see my comments below.

On 14.11.2017 19:33, Mark Rutland wrote:
> On Tue, Nov 14, 2017 at 08:13:43AM -0800, Andy Lutomirski wrote:
>> What does the STEAKLACK plugin actually do?  I haven't followed this enough.
> 
> The plugin adds instrumentation to track the maximum stack depth, though only
> functions with a sufficiently large stackframe are instrumented.

Yes. Functions with a big stack frame call track_stack() to update the
lowest_stack value. If CONFIG_VMAP_STACK is disabled, track_stack() is compiled
with a check for detecting stack depth overflow. This check is what I'm asking
about.

> The basic idea is that this can then be used to lazily zero stack after use
> (just before return to userspace), to minimize the risk of problems resulting
> from uninitialized variables (either copied to userspace and leaking secrets,
> or controlled by userspace and influencing the kernel).

Yes.

Actually the used part of the kernel stack is not zeroed. It is filled by
STACKLEAK_POISON (-0xBEEF) which points to the unused hole in x86_64 virtual
memory map.

> That, and catching some stack overflows in the absence of VMAP'd stacks.

The STACKLEAK plugin also adds check_alloca() call before each alloca to block
"Stack Clash" attack against kernel stack.

More details and statistics are available in the cover letter and commit
messages in this patch series.

Best regards,
Alexander

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [kernel-hardening] Re: [PATCH RFC v5 2/5] gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack
  2017-11-14 21:09               ` Alexander Popov
@ 2017-11-14 21:17                 ` Andy Lutomirski
  2017-11-14 22:03                   ` Alexander Popov
  0 siblings, 1 reply; 24+ messages in thread
From: Andy Lutomirski @ 2017-11-14 21:17 UTC (permalink / raw)
  To: Alexander Popov
  Cc: Mark Rutland, Andy Lutomirski, kernel-hardening, Kees Cook,
	PaX Team, Brad Spengler, Ingo Molnar, Peter Zijlstra,
	Tycho Andersen, Laura Abbott, Ard Biesheuvel, Borislav Petkov,
	Thomas Gleixner, H . Peter Anvin, X86 ML

On Tue, Nov 14, 2017 at 1:09 PM, Alexander Popov <alex.popov@linux.com> wrote:
> Thanks, Mark!
>
> Please see my comments below.
>
> On 14.11.2017 19:33, Mark Rutland wrote:
>> On Tue, Nov 14, 2017 at 08:13:43AM -0800, Andy Lutomirski wrote:
>>> What does the STEAKLACK plugin actually do?  I haven't followed this enough.
>>
>> The plugin adds instrumentation to track the maximum stack depth, though only
>> functions with a sufficiently large stackframe are instrumented.
>
> Yes. Functions with a big stack frame call track_stack() to update the
> lowest_stack value. If CONFIG_VMAP_STACK is disabled, track_stack() is compiled
> with a check for detecting stack depth overflow. This check is what I'm asking
> about.

Then you'll probably have to do something like what I did in the
VMAP_STACK code.

That being said, I don't entirely see the point.  If you want a
hardened kernel, you're going to enable VMAP_STACK.  Are there really
users of hardened 32-bit kernels?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [kernel-hardening] Re: [PATCH RFC v5 2/5] gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack
  2017-11-14 16:13           ` Andy Lutomirski
  2017-11-14 16:33             ` Mark Rutland
@ 2017-11-14 21:50             ` Alexander Popov
  1 sibling, 0 replies; 24+ messages in thread
From: Alexander Popov @ 2017-11-14 21:50 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: kernel-hardening, Kees Cook, PaX Team, Brad Spengler,
	Ingo Molnar, Peter Zijlstra, Tycho Andersen, Laura Abbott,
	Mark Rutland, Ard Biesheuvel, Borislav Petkov, Thomas Gleixner,
	H . Peter Anvin, X86 ML

Hello Andy,

Thanks for your prompt reply!

On 14.11.2017 19:13, Andy Lutomirski wrote:
> On Tue, Nov 14, 2017 at 7:36 AM, Alexander Popov <alex.popov@linux.com> wrote:
>> On 30.10.2017 21:06, Alexander Popov wrote:
>>> On 30.10.2017 20:32, Peter Zijlstra wrote:
>>>> On Mon, Oct 30, 2017 at 07:51:33PM +0300, Alexander Popov wrote:
>>>>> When the thread stack is exhausted, this BUG() is hit. But do_error_trap(),
>>>>> which handles the exception, calls track_stack() itself again (since it is
>>>>> instrumented by the gcc plugin). So this recursion proceeds with exhausting the
>>>>> thread stack.
>>>>
>>>> Add a __attribute__((nostacktrack)) on it?
>>>
>>> Yes, I already tried some blacklisting in the plugin, but it didn't really help,
>>> because:
>>>
>>> 1. there are other (more than 5) instrumented functions, that are called during
>>> BUG() handling too;
>>>
>>> 2. decreasing CONFIG_STACKLEAK_TRACK_MIN_SIZE would add more instrumented
>>> functions, which should be manually blacklisted (not good).
>>>
>>> I guess handling BUG() in another stack would be a solution. For example, Andy
>>> Lutomirski calls handle_stack_overflow in the DOUBLEFAULT_STACK
>>> (arch/x86/mm/fault.c). Should I do something similar?
>>
>> Hello Andy! May I ask your advice?
>>
>> When CONFIG_VMAP_STACK is disabled and STACKLEAK is enabled (for example, on
>> x86_32), we need another way to detect stack depth overflow. That is the reason
>> of having this BUG() in track_stack(). But it turns out to be recursive since
>> track_stack() will be called again during BUG() handling.
> 
> What does the STEAKLACK plugin actually do?  I haven't followed this enough.

I've just replied to Mark's explanation.

>> We can avoid that recursion by handling oops in another stack. It looks similar
>> to the way you call handle_stack_overflow() in arch/x86/mm/fault.c. But it seems
>> that I can't reuse that code, am I right?
> 
> You'd probably have to make 32-bit compatible, which means making a
> 32-bit variant of this thingy:
> 
>                 asm volatile ("movq %[stack], %%rsp\n\t"
>                               "call handle_stack_overflow\n\t"
>                               "1: jmp 1b"
>                               : ASM_CALL_CONSTRAINT
>                               : "D" ("kernel stack overflow (page fault)"),
>                                 "S" (regs), "d" (address),
>                                 [stack] "rm" (stack));

Hm, I don't have these pt_regs in track_stack(). That is why I think I can't
reuse your handle_stack_overflow(). I guess manually crafting pt_regs will not
be good-looking.

> Or you could force a double-fault.

Could you elaborate on that?

The goal is to have a verbose oops message and kill the offending process (if we
work on behalf of a process). Can I do that?

>> How should I do it properly?
>>
>> By the way, you wrote that you have some entry code changes which conflict with
>> STACKLEAK. May I ask for more details?
> 
> It's this thing:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/log/?h=x86/entry_stack.wip
> 
> and I'll probably drop the ".wip" from the name shortly.

Wow, it's big. I'll look into it and maybe return with questions.

Best regards,
Alexander

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [kernel-hardening] Re: [PATCH RFC v5 2/5] gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack
  2017-11-14 21:17                 ` Andy Lutomirski
@ 2017-11-14 22:03                   ` Alexander Popov
  0 siblings, 0 replies; 24+ messages in thread
From: Alexander Popov @ 2017-11-14 22:03 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Mark Rutland, kernel-hardening, Kees Cook, PaX Team,
	Brad Spengler, Ingo Molnar, Peter Zijlstra, Tycho Andersen,
	Laura Abbott, Ard Biesheuvel, Borislav Petkov, Thomas Gleixner,
	H . Peter Anvin, X86 ML

On 15.11.2017 00:17, Andy Lutomirski wrote:
> On Tue, Nov 14, 2017 at 1:09 PM, Alexander Popov <alex.popov@linux.com> wrote:
>> Thanks, Mark!
>>
>> Please see my comments below.
>>
>> On 14.11.2017 19:33, Mark Rutland wrote:
>>> On Tue, Nov 14, 2017 at 08:13:43AM -0800, Andy Lutomirski wrote:
>>>> What does the STEAKLACK plugin actually do?  I haven't followed this enough.
>>>
>>> The plugin adds instrumentation to track the maximum stack depth, though only
>>> functions with a sufficiently large stackframe are instrumented.
>>
>> Yes. Functions with a big stack frame call track_stack() to update the
>> lowest_stack value. If CONFIG_VMAP_STACK is disabled, track_stack() is compiled
>> with a check for detecting stack depth overflow. This check is what I'm asking
>> about.
> 
> Then you'll probably have to do something like what I did in the
> VMAP_STACK code.

Yes!

> That being said, I don't entirely see the point.  If you want a
> hardened kernel, you're going to enable VMAP_STACK.  Are there really
> users of hardened 32-bit kernels?

You know, STACKLEAK already supports x86_32. It's a pity for me to make
STACKLEAK dependent on VMAP_STACK and hence to drop STACKLEAK support for this
platform.

I hope there is a way to add a good-looking check to track_stack() and have at
least some profit (although it will not catch all overflow cases).

Best regards,
Alexander

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2017-11-14 22:03 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-22  0:22 [kernel-hardening] [PATCH RFC v5 0/5] Introduce the STACKLEAK feature and a test for it Alexander Popov
2017-10-22  0:22 ` [kernel-hardening] [PATCH RFC v5 1/5] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls Alexander Popov
2017-10-23 13:17   ` [kernel-hardening] " Tycho Andersen
2017-10-24 21:30     ` Alexander Popov
2017-10-31 15:20       ` Kees Cook
2017-11-10 16:59         ` Alexander Popov
2017-10-22  0:22 ` [kernel-hardening] [PATCH RFC v5 2/5] gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack Alexander Popov
2017-10-30 16:51   ` [kernel-hardening] " Alexander Popov
2017-10-30 17:32     ` Peter Zijlstra
2017-10-30 18:06       ` Alexander Popov
2017-11-14 15:36         ` Alexander Popov
2017-11-14 16:13           ` Andy Lutomirski
2017-11-14 16:33             ` Mark Rutland
2017-11-14 21:09               ` Alexander Popov
2017-11-14 21:17                 ` Andy Lutomirski
2017-11-14 22:03                   ` Alexander Popov
2017-11-14 21:50             ` Alexander Popov
2017-10-22  0:22 ` [kernel-hardening] [PATCH RFC v5 3/5] lkdtm: Add a test for STACKLEAK Alexander Popov
2017-10-22  0:22 ` [kernel-hardening] [PATCH RFC v5 4/5] fs/proc: Show STACKLEAK metrics in the /proc file system Alexander Popov
2017-10-22  0:22 ` [kernel-hardening] [PATCH RFC v5 5/5] doc: self-protection: Add information about STACKLEAK feature Alexander Popov
2017-10-22 13:11 ` [kernel-hardening] Re: [PATCH RFC v5 0/5] Introduce the STACKLEAK feature and a test for it Peter Zijlstra
2017-10-23  9:08   ` Mark Rutland
2017-10-23 12:11     ` Alexander Popov
2017-10-23 11:21   ` Alexander Popov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.