All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v13 (resend) 0/6] Introduce the STACKLEAK feature and a test for it
@ 2018-06-22 16:58 Alexander Popov
  2018-06-22 16:58 ` [PATCH v13 (resend) 1/6] gcc-plugins: Clean up the cgraph_create_edge* macros Alexander Popov
                   ` (5 more replies)
  0 siblings, 6 replies; 17+ messages in thread
From: Alexander Popov @ 2018-06-22 16:58 UTC (permalink / raw)
  To: kernel-hardening, Kees Cook, PaX Team, Brad Spengler,
	Ingo Molnar, Andy Lutomirski, Tycho Andersen, Laura Abbott,
	Mark Rutland, Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Linus Torvalds,
	Greg Kroah-Hartman, Dan Williams, Dave Hansen, Mathias Krause,
	Vikas Shivappa, Kyle Huey, Dmitry Safonov, Will Deacon,
	Arnd Bergmann, Florian Weimer, Boris Lukashev, Andrey Konovalov,
	x86, linux-kernel, alex.popov

This is the rebased 13th version of the patch series introducing STACKLEAK
to the mainline kernel for x86. Ingo Molnar asked to resend it.
This version comes with style changes according to the feedback from Ingo.
Previous version discussion:
http://www.openwall.com/lists/kernel-hardening/2018/05/16/1

arm64 support will come in a separate patch from Laura Abbott.

Motivation
==========

STACKLEAK (initially developed by PaX Team):

 1. reduces the information that can be revealed through kernel stack leak bugs.
    The idea of erasing the thread stack at the end of syscalls is similar to
    CONFIG_PAGE_POISONING and memzero_explicit() in kernel crypto, which all
    comply with FDP_RIP.2 (Full Residual Information Protection) of the
    Common Criteria standard.

 2. blocks some uninitialized stack variable attacks (e.g. CVE-2017-17712,
    CVE-2010-2963). That kind of bugs should be killed by improving C compilers
    in future, which might take a long time.

 3. blocks stack depth overflow caused by alloca (aka Stack Clash attack).
    That is orthogonal to the mainline kernel VLA cleanup and protects
    un-upstreamed code.

Performance impact
==================

Hardware: Intel Core i7-4770, 16 GB RAM

Test #1: building the Linux kernel on a single core
	0.91% slowdown

Test #2: hackbench -s 4096 -l 2000 -g 15 -f 25 -P
	4.2% slowdown

So the STACKLEAK description in Kconfig includes:
"The tradeoff is the performance impact: on a single CPU system kernel
compilation sees a 1% slowdown, other systems and workloads may vary and you are
advised to test this feature on your expected workload before deploying it".


Alexander Popov (6):
  gcc-plugins: Clean up the cgraph_create_edge* macros
  x86/entry: Add STACKLEAK erasing the kernel stack at the end of
    syscalls
  gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack
  lkdtm: Add a test for STACKLEAK
  fs/proc: Show STACKLEAK metrics in the /proc file system
  doc: self-protection: Add information about STACKLEAK feature

 Documentation/security/self-protection.rst |  23 +-
 Documentation/x86/x86_64/mm.txt            |   2 +
 arch/Kconfig                               |  53 ++++
 arch/x86/Kconfig                           |   1 +
 arch/x86/entry/calling.h                   |  14 +
 arch/x86/entry/entry_32.S                  |   7 +
 arch/x86/entry/entry_64.S                  |   3 +
 arch/x86/entry/entry_64_compat.S           |   5 +
 arch/x86/kernel/dumpstack.c                |  31 ++
 drivers/misc/lkdtm/Makefile                |   2 +
 drivers/misc/lkdtm/core.c                  |   3 +
 drivers/misc/lkdtm/lkdtm.h                 |   5 +
 drivers/misc/lkdtm/stackleak.c             | 146 +++++++++
 fs/proc/base.c                             |  18 ++
 include/linux/sched.h                      |   5 +
 include/linux/stackleak.h                  |  27 ++
 kernel/Makefile                            |   4 +
 kernel/fork.c                              |   3 +
 kernel/stackleak.c                         |  96 ++++++
 scripts/Makefile.gcc-plugins               |   3 +
 scripts/gcc-plugins/gcc-common.h           |  26 +-
 scripts/gcc-plugins/stackleak_plugin.c     | 480 +++++++++++++++++++++++++++++
 22 files changed, 938 insertions(+), 19 deletions(-)
 create mode 100644 drivers/misc/lkdtm/stackleak.c
 create mode 100644 include/linux/stackleak.h
 create mode 100644 kernel/stackleak.c
 create mode 100644 scripts/gcc-plugins/stackleak_plugin.c

-- 
2.7.4

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v13 (resend) 1/6] gcc-plugins: Clean up the cgraph_create_edge* macros
  2018-06-22 16:58 [PATCH v13 (resend) 0/6] Introduce the STACKLEAK feature and a test for it Alexander Popov
@ 2018-06-22 16:58 ` Alexander Popov
  2018-06-22 16:58 ` [PATCH v13 (resend) 2/6] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls Alexander Popov
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: Alexander Popov @ 2018-06-22 16:58 UTC (permalink / raw)
  To: kernel-hardening, Kees Cook, PaX Team, Brad Spengler,
	Ingo Molnar, Andy Lutomirski, Tycho Andersen, Laura Abbott,
	Mark Rutland, Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Linus Torvalds,
	Greg Kroah-Hartman, Dan Williams, Dave Hansen, Mathias Krause,
	Vikas Shivappa, Kyle Huey, Dmitry Safonov, Will Deacon,
	Arnd Bergmann, Florian Weimer, Boris Lukashev, Andrey Konovalov,
	x86, linux-kernel, alex.popov

Drop useless redefinitions of cgraph_create_edge* macros. Drop the unused
nest argument. Also support gcc-8, which doesn't have freq argument.

Signed-off-by: Alexander Popov <alex.popov@linux.com>
---
 scripts/gcc-plugins/gcc-common.h | 26 ++++++++++++++++----------
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/scripts/gcc-plugins/gcc-common.h b/scripts/gcc-plugins/gcc-common.h
index f467500..552d5ef 100644
--- a/scripts/gcc-plugins/gcc-common.h
+++ b/scripts/gcc-plugins/gcc-common.h
@@ -392,13 +392,6 @@ static inline struct cgraph_node *cgraph_alias_target(struct cgraph_node *n)
 }
 #endif
 
-#if BUILDING_GCC_VERSION >= 4007 && BUILDING_GCC_VERSION <= 4009
-#define cgraph_create_edge(caller, callee, call_stmt, count, freq, nest) \
-	cgraph_create_edge((caller), (callee), (call_stmt), (count), (freq))
-#define cgraph_create_edge_including_clones(caller, callee, old_call_stmt, call_stmt, count, freq, nest, reason) \
-	cgraph_create_edge_including_clones((caller), (callee), (old_call_stmt), (call_stmt), (count), (freq), (reason))
-#endif
-
 #if BUILDING_GCC_VERSION <= 4008
 #define ENTRY_BLOCK_PTR_FOR_FN(FN)	ENTRY_BLOCK_PTR_FOR_FUNCTION(FN)
 #define EXIT_BLOCK_PTR_FOR_FN(FN)	EXIT_BLOCK_PTR_FOR_FUNCTION(FN)
@@ -723,10 +716,23 @@ static inline const char *get_decl_section_name(const_tree decl)
 #define varpool_get_node(decl) varpool_node::get(decl)
 #define dump_varpool_node(file, node) (node)->dump(file)
 
-#define cgraph_create_edge(caller, callee, call_stmt, count, freq, nest) \
+#if BUILDING_GCC_VERSION >= 8000
+#define cgraph_create_edge(caller, callee, call_stmt, count, freq) \
+	(caller)->create_edge((callee), (call_stmt), (count))
+
+#define cgraph_create_edge_including_clones(caller, callee,	\
+		old_call_stmt, call_stmt, count, freq, reason)	\
+	(caller)->create_edge_including_clones((callee),	\
+		(old_call_stmt), (call_stmt), (count), (reason))
+#else
+#define cgraph_create_edge(caller, callee, call_stmt, count, freq) \
 	(caller)->create_edge((callee), (call_stmt), (count), (freq))
-#define cgraph_create_edge_including_clones(caller, callee, old_call_stmt, call_stmt, count, freq, nest, reason) \
-	(caller)->create_edge_including_clones((callee), (old_call_stmt), (call_stmt), (count), (freq), (reason))
+
+#define cgraph_create_edge_including_clones(caller, callee,	\
+		old_call_stmt, call_stmt, count, freq, reason)	\
+	(caller)->create_edge_including_clones((callee),	\
+		(old_call_stmt), (call_stmt), (count), (freq), (reason))
+#endif
 
 typedef struct cgraph_node *cgraph_node_ptr;
 typedef struct cgraph_edge *cgraph_edge_p;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v13 (resend) 2/6] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
  2018-06-22 16:58 [PATCH v13 (resend) 0/6] Introduce the STACKLEAK feature and a test for it Alexander Popov
  2018-06-22 16:58 ` [PATCH v13 (resend) 1/6] gcc-plugins: Clean up the cgraph_create_edge* macros Alexander Popov
@ 2018-06-22 16:58 ` Alexander Popov
  2018-06-22 18:58   ` Laura Abbott
  2018-07-05  8:12   ` Ingo Molnar
  2018-06-22 16:58 ` [PATCH v13 (resend) 3/6] gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack Alexander Popov
                   ` (3 subsequent siblings)
  5 siblings, 2 replies; 17+ messages in thread
From: Alexander Popov @ 2018-06-22 16:58 UTC (permalink / raw)
  To: kernel-hardening, Kees Cook, PaX Team, Brad Spengler,
	Ingo Molnar, Andy Lutomirski, Tycho Andersen, Laura Abbott,
	Mark Rutland, Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Linus Torvalds,
	Greg Kroah-Hartman, Dan Williams, Dave Hansen, Mathias Krause,
	Vikas Shivappa, Kyle Huey, Dmitry Safonov, Will Deacon,
	Arnd Bergmann, Florian Weimer, Boris Lukashev, Andrey Konovalov,
	x86, linux-kernel, alex.popov

The STACKLEAK feature erases the kernel stack before returning from
syscalls. That reduces the information which kernel stack leak bugs can
reveal and blocks some uninitialized stack variable attacks. Moreover,
STACKLEAK blocks kernel stack depth overflow caused by alloca (aka
Stack Clash attack).

This commit introduces the code filling the used part of the kernel
stack with a poison value before returning to the userspace. Full
STACKLEAK feature also contains the gcc plugin which comes in a
separate commit.

The STACKLEAK feature is ported from grsecurity/PaX. More information at:
  https://grsecurity.net/
  https://pax.grsecurity.net/

This code is modified from Brad Spengler/PaX Team's code in the last
public patch of grsecurity/PaX based on our understanding of the code.
Changes or omissions from the original code are ours and don't reflect
the original grsecurity/PaX code.

Signed-off-by: Alexander Popov <alex.popov@linux.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
---
 Documentation/x86/x86_64/mm.txt  |  2 ++
 arch/Kconfig                     | 27 +++++++++++++++++
 arch/x86/Kconfig                 |  1 +
 arch/x86/entry/calling.h         | 14 +++++++++
 arch/x86/entry/entry_32.S        |  7 +++++
 arch/x86/entry/entry_64.S        |  3 ++
 arch/x86/entry/entry_64_compat.S |  5 ++++
 include/linux/sched.h            |  4 +++
 include/linux/stackleak.h        | 24 +++++++++++++++
 kernel/Makefile                  |  4 +++
 kernel/fork.c                    |  3 ++
 kernel/stackleak.c               | 63 ++++++++++++++++++++++++++++++++++++++++
 12 files changed, 157 insertions(+)
 create mode 100644 include/linux/stackleak.h
 create mode 100644 kernel/stackleak.c

diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt
index 5432a96..600bc2a 100644
--- a/Documentation/x86/x86_64/mm.txt
+++ b/Documentation/x86/x86_64/mm.txt
@@ -24,6 +24,7 @@ ffffffffa0000000 - fffffffffeffffff (1520 MB) module mapping space
 [fixmap start]   - ffffffffff5fffff kernel-internal fixmap range
 ffffffffff600000 - ffffffffff600fff (=4 kB) legacy vsyscall ABI
 ffffffffffe00000 - ffffffffffffffff (=2 MB) unused hole
+STACKLEAK_POISON value in this last hole: ffffffffffff4111
 
 Virtual memory map with 5 level page tables:
 
@@ -50,6 +51,7 @@ ffffffffa0000000 - fffffffffeffffff (1520 MB) module mapping space
 [fixmap start]   - ffffffffff5fffff kernel-internal fixmap range
 ffffffffff600000 - ffffffffff600fff (=4 kB) legacy vsyscall ABI
 ffffffffffe00000 - ffffffffffffffff (=2 MB) unused hole
+STACKLEAK_POISON value in this last hole: ffffffffffff4111
 
 Architecture defines a 64-bit virtual address. Implementations can support
 less. Currently supported are 48- and 57-bit virtual addresses. Bits 63
diff --git a/arch/Kconfig b/arch/Kconfig
index 1aa5906..57817f0 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -414,6 +414,13 @@ config PLUGIN_HOSTCC
 	  Host compiler used to build GCC plugins.  This can be $(HOSTCXX),
 	  $(HOSTCC), or a null string if GCC plugin is unsupported.
 
+config HAVE_ARCH_STACKLEAK
+	bool
+	help
+	  An architecture should select this if it has the code which
+	  fills the used part of the kernel stack with the STACKLEAK_POISON
+	  value before returning from system calls.
+
 config HAVE_GCC_PLUGINS
 	bool
 	help
@@ -549,6 +556,26 @@ config GCC_PLUGIN_RANDSTRUCT_PERFORMANCE
 	  in structures.  This reduces the performance hit of RANDSTRUCT
 	  at the cost of weakened randomization.
 
+config GCC_PLUGIN_STACKLEAK
+	bool "Erase the kernel stack before returning from syscalls"
+	depends on GCC_PLUGINS
+	depends on HAVE_ARCH_STACKLEAK
+	help
+	  This option makes the kernel erase the kernel stack before
+	  returning from system calls. That reduces the information which
+	  kernel stack leak bugs can reveal and blocks some uninitialized
+	  stack variable attacks. This option also blocks kernel stack depth
+	  overflow caused by alloca (aka Stack Clash attack).
+
+	  The tradeoff is the performance impact: on a single CPU system kernel
+	  compilation sees a 1% slowdown, other systems and workloads may vary
+	  and you are advised to test this feature on your expected workload
+	  before deploying it.
+
+	  This plugin was ported from grsecurity/PaX. More information at:
+	   * https://grsecurity.net/
+	   * https://pax.grsecurity.net/
+
 config HAVE_STACKPROTECTOR
 	bool
 	help
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index f1dbb4e..3718d70 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -125,6 +125,7 @@ config X86
 	select HAVE_ARCH_COMPAT_MMAP_BASES	if MMU && COMPAT
 	select HAVE_ARCH_SECCOMP_FILTER
 	select HAVE_ARCH_THREAD_STRUCT_WHITELIST
+	select HAVE_ARCH_STACKLEAK
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD if X86_64
diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index 352e70c..ba001e2 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -329,8 +329,22 @@ For 32-bit we have the following conventions - kernel is built with
 
 #endif
 
+.macro STACKLEAK_ERASE_NOCLOBBER
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+	PUSH_AND_CLEAR_REGS
+	call stackleak_erase_kstack
+	POP_REGS
+#endif
+.endm
+
 #endif /* CONFIG_X86_64 */
 
+.macro STACKLEAK_ERASE
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+	call stackleak_erase_kstack
+#endif
+.endm
+
 /*
  * This does 'call enter_from_user_mode' unless we can avoid it based on
  * kernel config or using the static jump infrastructure.
diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index 2582881..72ce401 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -46,6 +46,8 @@
 #include <asm/frame.h>
 #include <asm/nospec-branch.h>
 
+#include "calling.h"
+
 	.section .entry.text, "ax"
 
 /*
@@ -298,6 +300,7 @@ ENTRY(ret_from_fork)
 	/* When we fork, we trace the syscall return in the child, too. */
 	movl    %esp, %eax
 	call    syscall_return_slowpath
+	STACKLEAK_ERASE
 	jmp     restore_all
 
 	/* kernel thread */
@@ -458,6 +461,8 @@ ENTRY(entry_SYSENTER_32)
 	ALTERNATIVE "testl %eax, %eax; jz .Lsyscall_32_done", \
 		    "jmp .Lsyscall_32_done", X86_FEATURE_XENPV
 
+	STACKLEAK_ERASE
+
 /* Opportunistic SYSEXIT */
 	TRACE_IRQS_ON			/* User mode traces as IRQs on. */
 	movl	PT_EIP(%esp), %edx	/* pt_regs->ip */
@@ -544,6 +549,8 @@ ENTRY(entry_INT80_32)
 	call	do_int80_syscall_32
 .Lsyscall_32_done:
 
+	STACKLEAK_ERASE
+
 restore_all:
 	TRACE_IRQS_IRET
 .Lrestore_all_notrace:
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 73a522d..408101b 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -329,6 +329,8 @@ syscall_return_via_sysret:
 	 * We are on the trampoline stack.  All regs except RDI are live.
 	 * We can do future final exit work right here.
 	 */
+	STACKLEAK_ERASE_NOCLOBBER
+
 	SWITCH_TO_USER_CR3_STACK scratch_reg=%rdi
 
 	popq	%rdi
@@ -687,6 +689,7 @@ GLOBAL(swapgs_restore_regs_and_return_to_usermode)
 	 * We are on the trampoline stack.  All regs except RDI are live.
 	 * We can do future final exit work right here.
 	 */
+	STACKLEAK_ERASE_NOCLOBBER
 
 	SWITCH_TO_USER_CR3_STACK scratch_reg=%rdi
 
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index 9de7f1e..193c7e6 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -261,6 +261,11 @@ GLOBAL(entry_SYSCALL_compat_after_hwframe)
 
 	/* Opportunistic SYSRET */
 sysret32_from_system_call:
+	/*
+	 * We are not going to return to the userspace from the trampoline
+	 * stack. So let's erase the thread stack right now.
+	 */
+	STACKLEAK_ERASE
 	TRACE_IRQS_ON			/* User mode traces as IRQs on. */
 	movq	RBX(%rsp), %rbx		/* pt_regs->rbx */
 	movq	RBP(%rsp), %rbp		/* pt_regs->rbp */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 87bf02d..89c4cba 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1180,6 +1180,10 @@ struct task_struct {
 	void				*security;
 #endif
 
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+	unsigned long			lowest_stack;
+#endif
+
 	/*
 	 * New fields for task_struct should be added above here, so that
 	 * they are included in the randomized portion of task_struct.
diff --git a/include/linux/stackleak.h b/include/linux/stackleak.h
new file mode 100644
index 0000000..743c911
--- /dev/null
+++ b/include/linux/stackleak.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_STACKLEAK_H
+#define _LINUX_STACKLEAK_H
+
+#include <linux/sched.h>
+#include <linux/sched/task_stack.h>
+
+/*
+ * Check that the poison value points to the unused hole in the
+ * virtual memory map for your platform.
+ */
+#define STACKLEAK_POISON -0xBEEF
+
+#define STACKLEAK_POISON_CHECK_DEPTH 128
+
+static inline void stackleak_task_init(struct task_struct *task)
+{
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+	task->lowest_stack = (unsigned long)end_of_stack(task) +
+						sizeof(unsigned long);
+#endif
+}
+
+#endif
diff --git a/kernel/Makefile b/kernel/Makefile
index 04bc07c..e5dc293 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -117,6 +117,10 @@ obj-$(CONFIG_HAS_IOMEM) += iomem.o
 obj-$(CONFIG_ZONE_DEVICE) += memremap.o
 obj-$(CONFIG_RSEQ) += rseq.o
 
+obj-$(CONFIG_GCC_PLUGIN_STACKLEAK) += stackleak.o
+KASAN_SANITIZE_stackleak.o := n
+KCOV_INSTRUMENT_stackleak.o := n
+
 $(obj)/configs.o: $(obj)/config_data.h
 
 targets += config_data.gz
diff --git a/kernel/fork.c b/kernel/fork.c
index 9440d61..1bc86ba 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -91,6 +91,7 @@
 #include <linux/kcov.h>
 #include <linux/livepatch.h>
 #include <linux/thread_info.h>
+#include <linux/stackleak.h>
 
 #include <asm/pgtable.h>
 #include <asm/pgalloc.h>
@@ -1813,6 +1814,8 @@ static __latent_entropy struct task_struct *copy_process(
 	if (retval)
 		goto bad_fork_cleanup_io;
 
+	stackleak_task_init(p);
+
 	if (pid != &init_struct_pid) {
 		pid = alloc_pid(p->nsproxy->pid_ns_for_children);
 		if (IS_ERR(pid)) {
diff --git a/kernel/stackleak.c b/kernel/stackleak.c
new file mode 100644
index 0000000..a84a161
--- /dev/null
+++ b/kernel/stackleak.c
@@ -0,0 +1,63 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * This code fills the used part of the kernel stack with a poison value
+ * before returning to the userspace. It's a part of the STACKLEAK feature
+ * ported from grsecurity/PaX.
+ *
+ * Author: Alexander Popov <alex.popov@linux.com>
+ *
+ * STACKLEAK reduces the information which kernel stack leak bugs can
+ * reveal and blocks some uninitialized stack variable attacks. Moreover,
+ * STACKLEAK blocks stack depth overflow caused by alloca() (aka Stack Clash
+ * attack).
+ */
+
+#include <linux/stackleak.h>
+
+asmlinkage void stackleak_erase_kstack(void)
+{
+	/* It would be nice not to have 'kstack_ptr' and 'boundary' on stack */
+	unsigned long kstack_ptr = current->lowest_stack;
+	unsigned long boundary = kstack_ptr & ~(THREAD_SIZE - 1);
+	unsigned int poison_count = 0;
+	const unsigned int check_depth =
+			STACKLEAK_POISON_CHECK_DEPTH / sizeof(unsigned long);
+
+	/* Search for the poison value in the kernel stack */
+	while (kstack_ptr > boundary && poison_count <= check_depth) {
+		if (*(unsigned long *)kstack_ptr == STACKLEAK_POISON)
+			poison_count++;
+		else
+			poison_count = 0;
+
+		kstack_ptr -= sizeof(unsigned long);
+	}
+
+	/*
+	 * One 'long int' at the bottom of the thread stack is reserved and
+	 * should not be poisoned (see CONFIG_SCHED_STACK_END_CHECK).
+	 */
+	if (kstack_ptr == boundary)
+		kstack_ptr += sizeof(unsigned long);
+
+	/*
+	 * Now write the poison value to the kernel stack. Start from
+	 * 'kstack_ptr' and move up till the new 'boundary'. We assume that
+	 * the stack pointer doesn't change when we write poison.
+	 */
+	if (on_thread_stack())
+		boundary = current_stack_pointer;
+	else
+		boundary = current_top_of_stack();
+
+	BUG_ON(boundary - kstack_ptr >= THREAD_SIZE);
+
+	while (kstack_ptr < boundary) {
+		*(unsigned long *)kstack_ptr = STACKLEAK_POISON;
+		kstack_ptr += sizeof(unsigned long);
+	}
+
+	/* Reset the 'lowest_stack' value for the next syscall */
+	current->lowest_stack = current_top_of_stack() - THREAD_SIZE / 64;
+}
+
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v13 (resend) 3/6] gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack
  2018-06-22 16:58 [PATCH v13 (resend) 0/6] Introduce the STACKLEAK feature and a test for it Alexander Popov
  2018-06-22 16:58 ` [PATCH v13 (resend) 1/6] gcc-plugins: Clean up the cgraph_create_edge* macros Alexander Popov
  2018-06-22 16:58 ` [PATCH v13 (resend) 2/6] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls Alexander Popov
@ 2018-06-22 16:58 ` Alexander Popov
  2018-06-22 16:58 ` [PATCH v13 (resend) 4/6] lkdtm: Add a test for STACKLEAK Alexander Popov
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: Alexander Popov @ 2018-06-22 16:58 UTC (permalink / raw)
  To: kernel-hardening, Kees Cook, PaX Team, Brad Spengler,
	Ingo Molnar, Andy Lutomirski, Tycho Andersen, Laura Abbott,
	Mark Rutland, Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Linus Torvalds,
	Greg Kroah-Hartman, Dan Williams, Dave Hansen, Mathias Krause,
	Vikas Shivappa, Kyle Huey, Dmitry Safonov, Will Deacon,
	Arnd Bergmann, Florian Weimer, Boris Lukashev, Andrey Konovalov,
	x86, linux-kernel, alex.popov

The STACKLEAK feature erases the kernel stack before returning from
syscalls. That reduces the information which kernel stack leak bugs can
reveal and blocks some uninitialized stack variable attacks. Moreover,
STACKLEAK blocks kernel stack depth overflow caused by alloca (aka
Stack Clash attack).

This commit introduces the STACKLEAK gcc plugin. It is needed for:
 - tracking the lowest border of the kernel stack, which is important
    for the code erasing the used part of the kernel stack at the end
    of syscalls (comes in a separate commit);
 - checking that alloca calls don't cause stack overflow.

So this plugin instruments the kernel code inserting:
 - the stackleak_check_alloca() call before alloca and the
    stackleak_track_stack() call after it;
 - the stackleak_track_stack() call for the functions with a stack frame
    size greater than or equal to CONFIG_STACKLEAK_TRACK_MIN_SIZE.

The STACKLEAK feature is ported from grsecurity/PaX. More information at:
  https://grsecurity.net/
  https://pax.grsecurity.net/

This code is modified from Brad Spengler/PaX Team's code in the last
public patch of grsecurity/PaX based on our understanding of the code.
Changes or omissions from the original code are ours and don't reflect
the original grsecurity/PaX code.

Signed-off-by: Alexander Popov <alex.popov@linux.com>
---
 arch/Kconfig                           |  14 +
 arch/x86/kernel/dumpstack.c            |  31 +++
 kernel/stackleak.c                     |  29 ++
 scripts/Makefile.gcc-plugins           |   3 +
 scripts/gcc-plugins/stackleak_plugin.c | 480 +++++++++++++++++++++++++++++++++
 5 files changed, 557 insertions(+)
 create mode 100644 scripts/gcc-plugins/stackleak_plugin.c

diff --git a/arch/Kconfig b/arch/Kconfig
index 57817f0..fe5303f 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -560,6 +560,8 @@ config GCC_PLUGIN_STACKLEAK
 	bool "Erase the kernel stack before returning from syscalls"
 	depends on GCC_PLUGINS
 	depends on HAVE_ARCH_STACKLEAK
+	imply VMAP_STACK
+	imply SCHED_STACK_END_CHECK
 	help
 	  This option makes the kernel erase the kernel stack before
 	  returning from system calls. That reduces the information which
@@ -576,6 +578,18 @@ config GCC_PLUGIN_STACKLEAK
 	   * https://grsecurity.net/
 	   * https://pax.grsecurity.net/
 
+config STACKLEAK_TRACK_MIN_SIZE
+	int "Minimum stack frame size of functions tracked by STACKLEAK"
+	default 100
+	range 0 4096
+	depends on GCC_PLUGIN_STACKLEAK
+	help
+	  The STACKLEAK gcc plugin instruments the kernel code for tracking
+	  the lowest border of the kernel stack (and for some other purposes).
+	  It inserts the stackleak_track_stack() call for the functions with
+	  a stack frame size greater than or equal to this parameter.
+	  If unsure, leave the default value 100.
+
 config HAVE_STACKPROTECTOR
 	bool
 	help
diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 666a284..c3d70e0 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -418,3 +418,34 @@ void show_regs(struct pt_regs *regs)
 	if (!user_mode(regs))
 		show_trace_log_lvl(current, regs, NULL, KERN_DEFAULT);
 }
+
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+void __used stackleak_check_alloca(unsigned long size)
+{
+	unsigned long sp = (unsigned long)&sp;
+	struct stack_info stack_info = {0};
+	unsigned long visit_mask = 0;
+	unsigned long stack_left;
+
+	BUG_ON(get_stack_info(&sp, current, &stack_info, &visit_mask));
+
+	stack_left = sp - (unsigned long)stack_info.begin;
+
+	if (size >= stack_left) {
+		/*
+		 * Kernel stack depth overflow is detected, let's report that.
+		 * If CONFIG_VMAP_STACK is enabled, we can safely use BUG().
+		 * If CONFIG_VMAP_STACK is disabled, BUG() handling can corrupt
+		 * the neighbour memory. CONFIG_SCHED_STACK_END_CHECK calls
+		 * panic() in a similar situation, so let's do the same if that
+		 * option is on. Otherwise just use BUG() and hope for the best.
+		 */
+#if !defined(CONFIG_VMAP_STACK) && defined(CONFIG_SCHED_STACK_END_CHECK)
+		panic("alloca over the kernel stack boundary\n");
+#else
+		BUG();
+#endif
+	}
+}
+EXPORT_SYMBOL(stackleak_check_alloca);
+#endif
diff --git a/kernel/stackleak.c b/kernel/stackleak.c
index a84a161..7343f41 100644
--- a/kernel/stackleak.c
+++ b/kernel/stackleak.c
@@ -61,3 +61,32 @@ asmlinkage void stackleak_erase_kstack(void)
 	current->lowest_stack = current_top_of_stack() - THREAD_SIZE / 64;
 }
 
+void __used stackleak_track_stack(void)
+{
+	/*
+	 * N.B. stackleak_erase_kstack() fills the kernel stack with the poison
+	 * value, which has the register width. That code assumes that
+	 * the value of lowest_stack is aligned on the register width boundary.
+	 *
+	 * That is true for x86 and x86_64 because of the kernel stack
+	 * alignment on these platforms (for details, see cc_stack_align in
+	 * arch/x86/Makefile). Take care of that when you port STACKLEAK to
+	 * new platforms.
+	 */
+	unsigned long sp = (unsigned long)&sp;
+
+	/*
+	 * Having CONFIG_STACKLEAK_TRACK_MIN_SIZE larger than
+	 * STACKLEAK_POISON_CHECK_DEPTH makes the poison search in
+	 * stackleak_erase_kstack() unreliable. Let's prevent that.
+	 */
+	BUILD_BUG_ON(CONFIG_STACKLEAK_TRACK_MIN_SIZE >
+						STACKLEAK_POISON_CHECK_DEPTH);
+
+	if (sp < current->lowest_stack &&
+	    sp >= (unsigned long)task_stack_page(current) +
+						sizeof(unsigned long)) {
+		current->lowest_stack = sp;
+	}
+}
+EXPORT_SYMBOL(stackleak_track_stack);
diff --git a/scripts/Makefile.gcc-plugins b/scripts/Makefile.gcc-plugins
index c961b9a..868356d 100644
--- a/scripts/Makefile.gcc-plugins
+++ b/scripts/Makefile.gcc-plugins
@@ -17,6 +17,9 @@ gcc-plugin-$(CONFIG_GCC_PLUGIN_RANDSTRUCT)	+= randomize_layout_plugin.so
 gcc-plugin-cflags-$(CONFIG_GCC_PLUGIN_RANDSTRUCT)	+= -DRANDSTRUCT_PLUGIN
 gcc-plugin-cflags-$(CONFIG_GCC_PLUGIN_RANDSTRUCT_PERFORMANCE)	+= -fplugin-arg-randomize_layout_plugin-performance-mode
 
+gcc-plugin-$(CONFIG_GCC_PLUGIN_STACKLEAK)	+= stackleak_plugin.so
+gcc-plugin-cflags-$(CONFIG_GCC_PLUGIN_STACKLEAK)	+= -DSTACKLEAK_PLUGIN -fplugin-arg-stackleak_plugin-track-min-size=$(CONFIG_STACKLEAK_TRACK_MIN_SIZE)
+
 GCC_PLUGINS_CFLAGS := $(strip $(addprefix -fplugin=$(objtree)/scripts/gcc-plugins/, $(gcc-plugin-y)) $(gcc-plugin-cflags-y))
 
 export GCC_PLUGINS_CFLAGS GCC_PLUGIN GCC_PLUGIN_SUBDIR
diff --git a/scripts/gcc-plugins/stackleak_plugin.c b/scripts/gcc-plugins/stackleak_plugin.c
new file mode 100644
index 0000000..05c3719
--- /dev/null
+++ b/scripts/gcc-plugins/stackleak_plugin.c
@@ -0,0 +1,480 @@
+/*
+ * Copyright 2011-2017 by the PaX Team <pageexec@freemail.hu>
+ * Modified by Alexander Popov <alex.popov@linux.com>
+ * Licensed under the GPL v2
+ *
+ * Note: the choice of the license means that the compilation process is
+ * NOT 'eligible' as defined by gcc's library exception to the GPL v3,
+ * but for the kernel it doesn't matter since it doesn't link against
+ * any of the gcc libraries
+ *
+ * This gcc plugin is needed for tracking the lowest border of the kernel stack
+ * and checking that alloca calls don't cause stack overflow. It instruments
+ * the kernel code inserting:
+ *  - the stackleak_check_alloca() call before alloca and the
+ *     stackleak_track_stack() call after it;
+ *  - the stackleak_track_stack() call for the functions with a stack frame
+ *     size greater than or equal to the "track-min-size" plugin parameter.
+ *
+ * This plugin is ported from grsecurity/PaX. For more information see:
+ *   https://grsecurity.net/
+ *   https://pax.grsecurity.net/
+ *
+ * Debugging:
+ *  - use fprintf() to stderr, debug_generic_expr(), debug_gimple_stmt(),
+ *     print_rtl() and print_simple_rtl();
+ *  - add "-fdump-tree-all -fdump-rtl-all" to the plugin CFLAGS in
+ *     Makefile.gcc-plugins to see the verbose dumps of the gcc passes;
+ *  - use gcc -E to understand the preprocessing shenanigans;
+ *  - use gcc with enabled CFG/GIMPLE/SSA verification (--enable-checking).
+ */
+
+#include "gcc-common.h"
+
+__visible int plugin_is_GPL_compatible;
+
+static int track_frame_size = -1;
+static const char track_function[] = "stackleak_track_stack";
+static const char check_function[] = "stackleak_check_alloca";
+
+/*
+ * Mark these global variables (roots) for gcc garbage collector since
+ * they point to the garbage-collected memory.
+ */
+static GTY(()) tree track_function_decl;
+static GTY(()) tree check_function_decl;
+
+static struct plugin_info stackleak_plugin_info = {
+	.version = "201707101337",
+	.help = "track-min-size=nn\ttrack stack for functions with a stack frame size >= nn bytes\n"
+		"disable\t\tdo not activate the plugin\n"
+};
+
+static void stackleak_add_check_alloca(gimple_stmt_iterator *gsi)
+{
+	gimple stmt;
+	gcall *stackleak_check_alloca;
+	tree alloca_size;
+	cgraph_node_ptr node;
+	int frequency;
+	basic_block bb;
+
+	/* Insert call to void stackleak_check_alloca(unsigned long size) */
+	alloca_size = gimple_call_arg(gsi_stmt(*gsi), 0);
+	stmt = gimple_build_call(check_function_decl, 1, alloca_size);
+	stackleak_check_alloca = as_a_gcall(stmt);
+	gsi_insert_before(gsi, stackleak_check_alloca, GSI_SAME_STMT);
+
+	/* Update the cgraph */
+	bb = gimple_bb(stackleak_check_alloca);
+	node = cgraph_get_create_node(check_function_decl);
+	gcc_assert(node);
+	frequency = compute_call_stmt_bb_frequency(current_function_decl, bb);
+	cgraph_create_edge(cgraph_get_node(current_function_decl), node,
+			stackleak_check_alloca, bb->count, frequency);
+}
+
+static void stackleak_add_track_stack(gimple_stmt_iterator *gsi, bool after)
+{
+	gimple stmt;
+	gcall *stackleak_track_stack;
+	cgraph_node_ptr node;
+	int frequency;
+	basic_block bb;
+
+	/* Insert call to void stackleak_track_stack(void) */
+	stmt = gimple_build_call(track_function_decl, 0);
+	stackleak_track_stack = as_a_gcall(stmt);
+	if (after) {
+		gsi_insert_after(gsi, stackleak_track_stack,
+						GSI_CONTINUE_LINKING);
+	} else {
+		gsi_insert_before(gsi, stackleak_track_stack, GSI_SAME_STMT);
+	}
+
+	/* Update the cgraph */
+	bb = gimple_bb(stackleak_track_stack);
+	node = cgraph_get_create_node(track_function_decl);
+	gcc_assert(node);
+	frequency = compute_call_stmt_bb_frequency(current_function_decl, bb);
+	cgraph_create_edge(cgraph_get_node(current_function_decl), node,
+			stackleak_track_stack, bb->count, frequency);
+}
+
+static bool is_alloca(gimple stmt)
+{
+	if (gimple_call_builtin_p(stmt, BUILT_IN_ALLOCA))
+		return true;
+
+#if BUILDING_GCC_VERSION >= 4007
+	if (gimple_call_builtin_p(stmt, BUILT_IN_ALLOCA_WITH_ALIGN))
+		return true;
+#endif
+
+	return false;
+}
+
+/*
+ * Work with the GIMPLE representation of the code. Insert the
+ * stackleak_check_alloca() call before alloca and stackleak_track_stack() call
+ * after it. Also insert stackleak_track_stack() call into the beginning of
+ * the function if it is not instrumented.
+ */
+static unsigned int stackleak_instrument_execute(void)
+{
+	basic_block bb, entry_bb;
+	bool prologue_instrumented = false, is_leaf = true;
+	gimple_stmt_iterator gsi;
+
+	/*
+	 * ENTRY_BLOCK_PTR is a basic block which represents possible entry
+	 * point of a function. This block does not contain any code and
+	 * has a CFG edge to its successor.
+	 */
+	gcc_assert(single_succ_p(ENTRY_BLOCK_PTR_FOR_FN(cfun)));
+	entry_bb = single_succ(ENTRY_BLOCK_PTR_FOR_FN(cfun));
+
+	/*
+	 * Loop through the GIMPLE statements in each of cfun basic blocks.
+	 * cfun is a global variable which represents the function that is
+	 * currently processed.
+	 */
+	FOR_EACH_BB_FN(bb, cfun) {
+		for (gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next(&gsi)) {
+			gimple stmt;
+
+			stmt = gsi_stmt(gsi);
+
+			/* Leaf function is a function which makes no calls */
+			if (is_gimple_call(stmt))
+				is_leaf = false;
+
+			if (!is_alloca(stmt))
+				continue;
+
+			/* Insert stack overflow check before alloca */
+			stackleak_add_check_alloca(&gsi);
+
+			/* Insert stackleak_track_stack() call after alloca */
+			stackleak_add_track_stack(&gsi, true);
+			if (bb == entry_bb)
+				prologue_instrumented = true;
+		}
+	}
+
+	if (prologue_instrumented)
+		return 0;
+
+	/*
+	 * Special cases to skip the instrumentation.
+	 *
+	 * Taking the address of static inline functions materializes them,
+	 * but we mustn't instrument some of them as the resulting stack
+	 * alignment required by the function call ABI will break other
+	 * assumptions regarding the expected (but not otherwise enforced)
+	 * register clobbering ABI.
+	 *
+	 * Case in point: native_save_fl on amd64 when optimized for size
+	 * clobbers rdx if it were instrumented here.
+	 *
+	 * TODO: any more special cases?
+	 */
+	if (is_leaf &&
+	    !TREE_PUBLIC(current_function_decl) &&
+	    DECL_DECLARED_INLINE_P(current_function_decl)) {
+		return 0;
+	}
+
+	if (is_leaf &&
+	    !strncmp(IDENTIFIER_POINTER(DECL_NAME(current_function_decl)),
+		     "_paravirt_", 10)) {
+		return 0;
+	}
+
+	/* Insert stackleak_track_stack() call at the function beginning */
+	bb = entry_bb;
+	if (!single_pred_p(bb)) {
+		/* gcc_assert(bb_loop_depth(bb) ||
+				(bb->flags & BB_IRREDUCIBLE_LOOP)); */
+		split_edge(single_succ_edge(ENTRY_BLOCK_PTR_FOR_FN(cfun)));
+		gcc_assert(single_succ_p(ENTRY_BLOCK_PTR_FOR_FN(cfun)));
+		bb = single_succ(ENTRY_BLOCK_PTR_FOR_FN(cfun));
+	}
+	gsi = gsi_after_labels(bb);
+	stackleak_add_track_stack(&gsi, false);
+
+	return 0;
+}
+
+static bool large_stack_frame(void)
+{
+#if BUILDING_GCC_VERSION >= 8000
+	return maybe_ge(get_frame_size(), track_frame_size);
+#else
+	return (get_frame_size() >= track_frame_size);
+#endif
+}
+
+/*
+ * Work with the RTL representation of the code.
+ * Remove the unneeded stackleak_track_stack() calls from the functions
+ * which don't call alloca and don't have a large enough stack frame size.
+ */
+static unsigned int stackleak_cleanup_execute(void)
+{
+	rtx_insn *insn, *next;
+
+	if (cfun->calls_alloca)
+		return 0;
+
+	if (large_stack_frame())
+		return 0;
+
+	/*
+	 * Find stackleak_track_stack() calls. Loop through the chain of insns,
+	 * which is an RTL representation of the code for a function.
+	 *
+	 * The example of a matching insn:
+	 *  (call_insn 8 4 10 2 (call (mem (symbol_ref ("stackleak_track_stack")
+	 *  [flags 0x41] <function_decl 0x7f7cd3302a80 stackleak_track_stack>)
+	 *  [0 stackleak_track_stack S1 A8]) (0)) 675 {*call} (expr_list
+	 *  (symbol_ref ("stackleak_track_stack") [flags 0x41] <function_decl
+	 *  0x7f7cd3302a80 stackleak_track_stack>) (expr_list (0) (nil))) (nil))
+	 */
+	for (insn = get_insns(); insn; insn = next) {
+		rtx body;
+
+		next = NEXT_INSN(insn);
+
+		/* Check the expression code of the insn */
+		if (!CALL_P(insn))
+			continue;
+
+		/*
+		 * Check the expression code of the insn body, which is an RTL
+		 * Expression (RTX) describing the side effect performed by
+		 * that insn.
+		 */
+		body = PATTERN(insn);
+
+		if (GET_CODE(body) == PARALLEL)
+			body = XVECEXP(body, 0, 0);
+
+		if (GET_CODE(body) != CALL)
+			continue;
+
+		/*
+		 * Check the first operand of the call expression. It should
+		 * be a mem RTX describing the needed subroutine with a
+		 * symbol_ref RTX.
+		 */
+		body = XEXP(body, 0);
+		if (GET_CODE(body) != MEM)
+			continue;
+
+		body = XEXP(body, 0);
+		if (GET_CODE(body) != SYMBOL_REF)
+			continue;
+
+		if (SYMBOL_REF_DECL(body) != track_function_decl)
+			continue;
+
+		/* Delete the stackleak_track_stack() call */
+		delete_insn_and_edges(insn);
+#if BUILDING_GCC_VERSION >= 4007 && BUILDING_GCC_VERSION < 8000
+		if (GET_CODE(next) == NOTE &&
+		    NOTE_KIND(next) == NOTE_INSN_CALL_ARG_LOCATION) {
+			insn = next;
+			next = NEXT_INSN(insn);
+			delete_insn_and_edges(insn);
+		}
+#endif
+	}
+
+	return 0;
+}
+
+static bool stackleak_gate(void)
+{
+	tree section;
+
+	section = lookup_attribute("section",
+				   DECL_ATTRIBUTES(current_function_decl));
+	if (section && TREE_VALUE(section)) {
+		section = TREE_VALUE(TREE_VALUE(section));
+
+		if (!strncmp(TREE_STRING_POINTER(section), ".init.text", 10))
+			return false;
+		if (!strncmp(TREE_STRING_POINTER(section), ".devinit.text", 13))
+			return false;
+		if (!strncmp(TREE_STRING_POINTER(section), ".cpuinit.text", 13))
+			return false;
+		if (!strncmp(TREE_STRING_POINTER(section), ".meminit.text", 13))
+			return false;
+	}
+
+	return track_frame_size >= 0;
+}
+
+/*
+ * Build function declarations for stackleak_track_stack() and
+ * stackleak_check_alloca().
+ */
+static void stackleak_start_unit(void *gcc_data __unused,
+				 void *user_data __unused)
+{
+	tree fntype;
+
+	/* void stackleak_track_stack(void) */
+	fntype = build_function_type_list(void_type_node, NULL_TREE);
+	track_function_decl = build_fn_decl(track_function, fntype);
+	DECL_ASSEMBLER_NAME(track_function_decl); /* for LTO */
+	TREE_PUBLIC(track_function_decl) = 1;
+	TREE_USED(track_function_decl) = 1;
+	DECL_EXTERNAL(track_function_decl) = 1;
+	DECL_ARTIFICIAL(track_function_decl) = 1;
+	DECL_PRESERVE_P(track_function_decl) = 1;
+
+	/* void stackleak_check_alloca(unsigned long) */
+	fntype = build_function_type_list(void_type_node,
+				long_unsigned_type_node, NULL_TREE);
+	check_function_decl = build_fn_decl(check_function, fntype);
+	DECL_ASSEMBLER_NAME(check_function_decl); /* for LTO */
+	TREE_PUBLIC(check_function_decl) = 1;
+	TREE_USED(check_function_decl) = 1;
+	DECL_EXTERNAL(check_function_decl) = 1;
+	DECL_ARTIFICIAL(check_function_decl) = 1;
+	DECL_PRESERVE_P(check_function_decl) = 1;
+}
+
+/*
+ * Pass gate function is a predicate function that gets executed before the
+ * corresponding pass. If the return value is 'true' the pass gets executed,
+ * otherwise, it is skipped.
+ */
+static bool stackleak_instrument_gate(void)
+{
+	return stackleak_gate();
+}
+
+#define PASS_NAME stackleak_instrument
+#define PROPERTIES_REQUIRED PROP_gimple_leh | PROP_cfg
+#define TODO_FLAGS_START TODO_verify_ssa | TODO_verify_flow | TODO_verify_stmts
+#define TODO_FLAGS_FINISH TODO_verify_ssa | TODO_verify_stmts | TODO_dump_func \
+			| TODO_update_ssa | TODO_rebuild_cgraph_edges
+#include "gcc-generate-gimple-pass.h"
+
+static bool stackleak_cleanup_gate(void)
+{
+	return stackleak_gate();
+}
+
+#define PASS_NAME stackleak_cleanup
+#define TODO_FLAGS_FINISH TODO_dump_func
+#include "gcc-generate-rtl-pass.h"
+
+/*
+ * Every gcc plugin exports a plugin_init() function that is called right
+ * after the plugin is loaded. This function is responsible for registering
+ * the plugin callbacks and doing other required initialization.
+ */
+__visible int plugin_init(struct plugin_name_args *plugin_info,
+			  struct plugin_gcc_version *version)
+{
+	const char * const plugin_name = plugin_info->base_name;
+	const int argc = plugin_info->argc;
+	const struct plugin_argument * const argv = plugin_info->argv;
+	int i = 0;
+
+	/* Extra GGC root tables describing our GTY-ed data */
+	static const struct ggc_root_tab gt_ggc_r_gt_stackleak[] = {
+		{
+			.base = &track_function_decl,
+			.nelt = 1,
+			.stride = sizeof(track_function_decl),
+			.cb = &gt_ggc_mx_tree_node,
+			.pchw = &gt_pch_nx_tree_node
+		},
+		{
+			.base = &check_function_decl,
+			.nelt = 1,
+			.stride = sizeof(check_function_decl),
+			.cb = &gt_ggc_mx_tree_node,
+			.pchw = &gt_pch_nx_tree_node
+		},
+		LAST_GGC_ROOT_TAB
+	};
+
+	/*
+	 * The stackleak_instrument pass should be executed before the
+	 * "optimized" pass, which is the control flow graph cleanup that is
+	 * performed just before expanding gcc trees to the RTL. In former
+	 * versions of the plugin this new pass was inserted before the
+	 * "tree_profile" pass, which is currently called "profile".
+	 */
+	PASS_INFO(stackleak_instrument, "optimized", 1,
+						PASS_POS_INSERT_BEFORE);
+
+	/*
+	 * The stackleak_cleanup pass should be executed after the
+	 * "reload" pass, when the stack frame size is final.
+	 */
+	PASS_INFO(stackleak_cleanup, "reload", 1, PASS_POS_INSERT_AFTER);
+
+	if (!plugin_default_version_check(version, &gcc_version)) {
+		error(G_("incompatible gcc/plugin versions"));
+		return 1;
+	}
+
+	/* Parse the plugin arguments */
+	for (i = 0; i < argc; i++) {
+		if (!strcmp(argv[i].key, "disable"))
+			return 0;
+
+		if (!strcmp(argv[i].key, "track-min-size")) {
+			if (!argv[i].value) {
+				error(G_("no value supplied for option '-fplugin-arg-%s-%s'"),
+					plugin_name, argv[i].key);
+				return 1;
+			}
+
+			track_frame_size = atoi(argv[i].value);
+			if (track_frame_size < 0) {
+				error(G_("invalid option argument '-fplugin-arg-%s-%s=%s'"),
+					plugin_name, argv[i].key, argv[i].value);
+				return 1;
+			}
+		} else {
+			error(G_("unknown option '-fplugin-arg-%s-%s'"),
+					plugin_name, argv[i].key);
+			return 1;
+		}
+	}
+
+	/* Give the information about the plugin */
+	register_callback(plugin_name, PLUGIN_INFO, NULL,
+						&stackleak_plugin_info);
+
+	/* Register to be called before processing a translation unit */
+	register_callback(plugin_name, PLUGIN_START_UNIT,
+					&stackleak_start_unit, NULL);
+
+	/* Register an extra GCC garbage collector (GGC) root table */
+	register_callback(plugin_name, PLUGIN_REGISTER_GGC_ROOTS, NULL,
+					(void *)&gt_ggc_r_gt_stackleak);
+
+	/*
+	 * Hook into the Pass Manager to register new gcc passes.
+	 *
+	 * The stack frame size info is available only at the last RTL pass,
+	 * when it's too late to insert complex code like a function call.
+	 * So we register two gcc passes to instrument every function at first
+	 * and remove the unneeded instrumentation later.
+	 */
+	register_callback(plugin_name, PLUGIN_PASS_MANAGER_SETUP, NULL,
+					&stackleak_instrument_pass_info);
+	register_callback(plugin_name, PLUGIN_PASS_MANAGER_SETUP, NULL,
+					&stackleak_cleanup_pass_info);
+
+	return 0;
+}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v13 (resend) 4/6] lkdtm: Add a test for STACKLEAK
  2018-06-22 16:58 [PATCH v13 (resend) 0/6] Introduce the STACKLEAK feature and a test for it Alexander Popov
                   ` (2 preceding siblings ...)
  2018-06-22 16:58 ` [PATCH v13 (resend) 3/6] gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack Alexander Popov
@ 2018-06-22 16:58 ` Alexander Popov
  2018-06-22 16:58 ` [PATCH v13 (resend) 5/6] fs/proc: Show STACKLEAK metrics in the /proc file system Alexander Popov
  2018-06-22 16:58 ` [PATCH v13 (resend) 6/6] doc: self-protection: Add information about STACKLEAK feature Alexander Popov
  5 siblings, 0 replies; 17+ messages in thread
From: Alexander Popov @ 2018-06-22 16:58 UTC (permalink / raw)
  To: kernel-hardening, Kees Cook, PaX Team, Brad Spengler,
	Ingo Molnar, Andy Lutomirski, Tycho Andersen, Laura Abbott,
	Mark Rutland, Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Linus Torvalds,
	Greg Kroah-Hartman, Dan Williams, Dave Hansen, Mathias Krause,
	Vikas Shivappa, Kyle Huey, Dmitry Safonov, Will Deacon,
	Arnd Bergmann, Florian Weimer, Boris Lukashev, Andrey Konovalov,
	x86, linux-kernel, alex.popov

Introduce lkdtm tests for the STACKLEAK feature.

First, all of them check that the current task stack is properly erased
(filled with STACKLEAK_POISON).

STACKLEAK_DEEP_RECURSION tests that exhausting the current task stack
with deep recursion is detected by CONFIG_VMAP_STACK (which is implied
by CONFIG_GCC_PLUGIN_STACKLEAK).

STACKLEAK_BIG_ALLOCA and STACKLEAK_RECURSION_WITH_ALLOCA test that
alloca calls which overflow the kernel stack hit BUG()/panic() in
stackleak_check_alloca().

Signed-off-by: Alexander Popov <alex.popov@linux.com>
Signed-off-by: Tycho Andersen <tycho@tycho.ws>
---
 drivers/misc/lkdtm/Makefile    |   2 +
 drivers/misc/lkdtm/core.c      |   3 +
 drivers/misc/lkdtm/lkdtm.h     |   5 ++
 drivers/misc/lkdtm/stackleak.c | 146 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 156 insertions(+)
 create mode 100644 drivers/misc/lkdtm/stackleak.c

diff --git a/drivers/misc/lkdtm/Makefile b/drivers/misc/lkdtm/Makefile
index 3370a41..951c984 100644
--- a/drivers/misc/lkdtm/Makefile
+++ b/drivers/misc/lkdtm/Makefile
@@ -8,7 +8,9 @@ lkdtm-$(CONFIG_LKDTM)		+= perms.o
 lkdtm-$(CONFIG_LKDTM)		+= refcount.o
 lkdtm-$(CONFIG_LKDTM)		+= rodata_objcopy.o
 lkdtm-$(CONFIG_LKDTM)		+= usercopy.o
+lkdtm-$(CONFIG_LKDTM)		+= stackleak.o
 
+KASAN_SANITIZE_stackleak.o	:= n
 KCOV_INSTRUMENT_rodata.o	:= n
 
 OBJCOPYFLAGS :=
diff --git a/drivers/misc/lkdtm/core.c b/drivers/misc/lkdtm/core.c
index 2154d1b..9d0324a 100644
--- a/drivers/misc/lkdtm/core.c
+++ b/drivers/misc/lkdtm/core.c
@@ -183,6 +183,9 @@ static const struct crashtype crashtypes[] = {
 	CRASHTYPE(USERCOPY_STACK_FRAME_FROM),
 	CRASHTYPE(USERCOPY_STACK_BEYOND),
 	CRASHTYPE(USERCOPY_KERNEL),
+	CRASHTYPE(STACKLEAK_BIG_ALLOCA),
+	CRASHTYPE(STACKLEAK_DEEP_RECURSION),
+	CRASHTYPE(STACKLEAK_RECURSION_WITH_ALLOCA),
 };
 
 
diff --git a/drivers/misc/lkdtm/lkdtm.h b/drivers/misc/lkdtm/lkdtm.h
index 9e513dc..865a6c3 100644
--- a/drivers/misc/lkdtm/lkdtm.h
+++ b/drivers/misc/lkdtm/lkdtm.h
@@ -83,4 +83,9 @@ void lkdtm_USERCOPY_STACK_FRAME_FROM(void);
 void lkdtm_USERCOPY_STACK_BEYOND(void);
 void lkdtm_USERCOPY_KERNEL(void);
 
+/* lkdtm_stackleak.c */
+void lkdtm_STACKLEAK_BIG_ALLOCA(void);
+void lkdtm_STACKLEAK_DEEP_RECURSION(void);
+void lkdtm_STACKLEAK_RECURSION_WITH_ALLOCA(void);
+
 #endif
diff --git a/drivers/misc/lkdtm/stackleak.c b/drivers/misc/lkdtm/stackleak.c
new file mode 100644
index 0000000..38963ff
--- /dev/null
+++ b/drivers/misc/lkdtm/stackleak.c
@@ -0,0 +1,146 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * This code tests several aspects of the STACKLEAK feature:
+ *  - the current task stack is properly erased (filled with STACKLEAK_POISON);
+ *  - exhausting the current task stack with deep recursion is detected by
+ *     CONFIG_VMAP_STACK (which is implied by CONFIG_GCC_PLUGIN_STACKLEAK);
+ *  - alloca calls which overflow the kernel stack hit BUG()/panic() in
+ *     stackleak_check_alloca().
+ *
+ * Authors:
+ *   Alexander Popov <alex.popov@linux.com>
+ *   Tycho Andersen <tycho@tycho.ws>
+ */
+
+#include "lkdtm.h"
+#include <linux/stackleak.h>
+
+static noinline bool stack_is_erased(void)
+{
+	unsigned long *sp, left, found, i;
+	const unsigned long check_depth = STACKLEAK_POISON_CHECK_DEPTH /
+							sizeof(unsigned long);
+
+	/*
+	 * For the details about the alignment of the poison values, see
+	 * the comment in stackleak_track_stack().
+	 */
+	sp = PTR_ALIGN(&i, sizeof(unsigned long));
+
+	left = ((unsigned long)sp & (THREAD_SIZE - 1)) / sizeof(unsigned long);
+	sp--;
+
+	/*
+	 * One 'long int' at the bottom of the thread stack is reserved
+	 * and not poisoned.
+	 */
+	if (left > 1)
+		left--;
+	else
+		return false;
+
+	pr_info("checking unused part of the thread stack (%lu bytes)...\n",
+					left * sizeof(unsigned long));
+
+	/*
+	 * Search for 'check_depth' poison values in a row (just like
+	 * stackleak_erase_kstack() does).
+	 */
+	for (i = 0, found = 0; i < left && found <= check_depth; i++) {
+		if (*(sp - i) == STACKLEAK_POISON)
+			found++;
+		else
+			found = 0;
+	}
+
+	if (found <= check_depth) {
+		pr_err("FAIL: thread stack is not erased (checked %lu bytes)\n",
+						i * sizeof(unsigned long));
+		return false;
+	}
+
+	pr_info("first %lu bytes are unpoisoned\n",
+				(i - found) * sizeof(unsigned long));
+
+	/* The rest of thread stack should be erased */
+	for (; i < left; i++) {
+		if (*(sp - i) != STACKLEAK_POISON) {
+			pr_err("FAIL: thread stack is NOT properly erased\n");
+			return false;
+		}
+	}
+
+	pr_info("the rest of the thread stack is properly erased\n");
+	return true;
+}
+
+static noinline void do_alloca(unsigned long size)
+{
+	char buf[size];
+
+	/* So this doesn't get inlined or optimized out */
+	snprintf(buf, size, "testing alloca...\n");
+}
+
+void lkdtm_STACKLEAK_BIG_ALLOCA(void)
+{
+	if (!stack_is_erased())
+		return;
+
+	pr_info("try a small alloca of 16 bytes...\n");
+	do_alloca(16);
+	pr_info("small alloca is successful\n");
+
+	pr_info("try alloca over the thread stack boundary...\n");
+	do_alloca(THREAD_SIZE);
+	pr_err("FAIL: alloca over the thread stack boundary is not detected\n");
+}
+
+static noinline unsigned long recursion(unsigned long prev_sp, bool with_alloca)
+{
+	char buf[400];
+	unsigned long sp = (unsigned long)&sp;
+
+	snprintf(buf, sizeof(buf), "testing deep recursion...\n");
+
+	if (with_alloca)
+		do_alloca(400);
+
+	if (prev_sp < sp + THREAD_SIZE)
+		sp = recursion(prev_sp, with_alloca);
+
+	return sp;
+}
+
+void lkdtm_STACKLEAK_DEEP_RECURSION(void)
+{
+	unsigned long sp = (unsigned long)&sp;
+
+	if (!stack_is_erased())
+		return;
+
+	/*
+	 * Overflow the thread stack using deep recursion. It should hit the
+	 * guard page provided by CONFIG_VMAP_STACK (which is implied by
+	 * CONFIG_GCC_PLUGIN_STACKLEAK).
+	 */
+	pr_info("try to overflow the thread stack using deep recursion...\n");
+	pr_err("FAIL: stack depth overflow (%lu bytes) is not detected\n",
+							sp - recursion(sp, 0));
+}
+
+void lkdtm_STACKLEAK_RECURSION_WITH_ALLOCA(void)
+{
+	unsigned long sp = (unsigned long)&sp;
+
+	if (!stack_is_erased())
+		return;
+
+	/*
+	 * Overflow the thread stack using deep recursion with alloca.
+	 * It should hit BUG()/panic() in stackleak_check_alloca().
+	 */
+	pr_info("try to overflow the thread stack using recursion & alloca\n");
+	recursion(sp, 1);
+	pr_err("FAIL: stack depth overflow is not detected\n");
+}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v13 (resend) 5/6] fs/proc: Show STACKLEAK metrics in the /proc file system
  2018-06-22 16:58 [PATCH v13 (resend) 0/6] Introduce the STACKLEAK feature and a test for it Alexander Popov
                   ` (3 preceding siblings ...)
  2018-06-22 16:58 ` [PATCH v13 (resend) 4/6] lkdtm: Add a test for STACKLEAK Alexander Popov
@ 2018-06-22 16:58 ` Alexander Popov
  2018-06-22 16:58 ` [PATCH v13 (resend) 6/6] doc: self-protection: Add information about STACKLEAK feature Alexander Popov
  5 siblings, 0 replies; 17+ messages in thread
From: Alexander Popov @ 2018-06-22 16:58 UTC (permalink / raw)
  To: kernel-hardening, Kees Cook, PaX Team, Brad Spengler,
	Ingo Molnar, Andy Lutomirski, Tycho Andersen, Laura Abbott,
	Mark Rutland, Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Linus Torvalds,
	Greg Kroah-Hartman, Dan Williams, Dave Hansen, Mathias Krause,
	Vikas Shivappa, Kyle Huey, Dmitry Safonov, Will Deacon,
	Arnd Bergmann, Florian Weimer, Boris Lukashev, Andrey Konovalov,
	x86, linux-kernel, alex.popov

Introduce CONFIG_STACKLEAK_METRICS providing STACKLEAK information about
tasks via the /proc file system. In particular, /proc/<pid>/stack_depth
shows the maximum kernel stack consumption for the current and previous
syscalls. Although this information is not precise, it can be useful for
estimating the STACKLEAK performance impact for your workloads.

Signed-off-by: Alexander Popov <alex.popov@linux.com>
---
 arch/Kconfig              | 12 ++++++++++++
 fs/proc/base.c            | 18 ++++++++++++++++++
 include/linux/sched.h     |  1 +
 include/linux/stackleak.h |  3 +++
 kernel/stackleak.c        |  4 ++++
 5 files changed, 38 insertions(+)

diff --git a/arch/Kconfig b/arch/Kconfig
index fe5303f..cdffe30 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -590,6 +590,18 @@ config STACKLEAK_TRACK_MIN_SIZE
 	  a stack frame size greater than or equal to this parameter.
 	  If unsure, leave the default value 100.
 
+config STACKLEAK_METRICS
+	bool "Show STACKLEAK metrics in the /proc file system"
+	depends on GCC_PLUGIN_STACKLEAK
+	depends on PROC_FS
+	help
+	  If this is set, STACKLEAK metrics for every task are available in
+	  the /proc file system. In particular, /proc/<pid>/stack_depth
+	  shows the maximum kernel stack consumption for the current and
+	  previous syscalls. Although this information is not precise, it
+	  can be useful for estimating the STACKLEAK performance impact for
+	  your workloads.
+
 config HAVE_STACKPROTECTOR
 	bool
 	help
diff --git a/fs/proc/base.c b/fs/proc/base.c
index aaffc0c..66ee5b7 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2893,6 +2893,21 @@ static int proc_pid_patch_state(struct seq_file *m, struct pid_namespace *ns,
 }
 #endif /* CONFIG_LIVEPATCH */
 
+#ifdef CONFIG_STACKLEAK_METRICS
+static int proc_stack_depth(struct seq_file *m, struct pid_namespace *ns,
+				struct pid *pid, struct task_struct *task)
+{
+	unsigned long prev_depth = THREAD_SIZE -
+				(task->prev_lowest_stack & (THREAD_SIZE - 1));
+	unsigned long depth = THREAD_SIZE -
+				(task->lowest_stack & (THREAD_SIZE - 1));
+
+	seq_printf(m, "previous stack depth: %lu\nstack depth: %lu\n",
+							prev_depth, depth);
+	return 0;
+}
+#endif /* CONFIG_STACKLEAK_METRICS */
+
 /*
  * Thread groups
  */
@@ -2994,6 +3009,9 @@ static const struct pid_entry tgid_base_stuff[] = {
 #ifdef CONFIG_LIVEPATCH
 	ONE("patch_state",  S_IRUSR, proc_pid_patch_state),
 #endif
+#ifdef CONFIG_STACKLEAK_METRICS
+	ONE("stack_depth", S_IRUGO, proc_stack_depth),
+#endif
 };
 
 static int proc_tgid_base_readdir(struct file *file, struct dir_context *ctx)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 89c4cba..b9960b6 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1182,6 +1182,7 @@ struct task_struct {
 
 #ifdef CONFIG_GCC_PLUGIN_STACKLEAK
 	unsigned long			lowest_stack;
+	unsigned long			prev_lowest_stack;
 #endif
 
 	/*
diff --git a/include/linux/stackleak.h b/include/linux/stackleak.h
index 743c911..e2da99b 100644
--- a/include/linux/stackleak.h
+++ b/include/linux/stackleak.h
@@ -18,6 +18,9 @@ static inline void stackleak_task_init(struct task_struct *task)
 #ifdef CONFIG_GCC_PLUGIN_STACKLEAK
 	task->lowest_stack = (unsigned long)end_of_stack(task) +
 						sizeof(unsigned long);
+# ifdef CONFIG_STACKLEAK_METRICS
+	task->prev_lowest_stack = task->lowest_stack;
+# endif
 #endif
 }
 
diff --git a/kernel/stackleak.c b/kernel/stackleak.c
index 7343f41..8cb29fb 100644
--- a/kernel/stackleak.c
+++ b/kernel/stackleak.c
@@ -40,6 +40,10 @@ asmlinkage void stackleak_erase_kstack(void)
 	if (kstack_ptr == boundary)
 		kstack_ptr += sizeof(unsigned long);
 
+#ifdef CONFIG_STACKLEAK_METRICS
+	current->prev_lowest_stack = kstack_ptr;
+#endif
+
 	/*
 	 * Now write the poison value to the kernel stack. Start from
 	 * 'kstack_ptr' and move up till the new 'boundary'. We assume that
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v13 (resend) 6/6] doc: self-protection: Add information about STACKLEAK feature
  2018-06-22 16:58 [PATCH v13 (resend) 0/6] Introduce the STACKLEAK feature and a test for it Alexander Popov
                   ` (4 preceding siblings ...)
  2018-06-22 16:58 ` [PATCH v13 (resend) 5/6] fs/proc: Show STACKLEAK metrics in the /proc file system Alexander Popov
@ 2018-06-22 16:58 ` Alexander Popov
  5 siblings, 0 replies; 17+ messages in thread
From: Alexander Popov @ 2018-06-22 16:58 UTC (permalink / raw)
  To: kernel-hardening, Kees Cook, PaX Team, Brad Spengler,
	Ingo Molnar, Andy Lutomirski, Tycho Andersen, Laura Abbott,
	Mark Rutland, Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Linus Torvalds,
	Greg Kroah-Hartman, Dan Williams, Dave Hansen, Mathias Krause,
	Vikas Shivappa, Kyle Huey, Dmitry Safonov, Will Deacon,
	Arnd Bergmann, Florian Weimer, Boris Lukashev, Andrey Konovalov,
	x86, linux-kernel, alex.popov

Add information about STACKLEAK feature to "Stack depth overflow" and
"Memory poisoning" sections of self-protection.rst.

Signed-off-by: Alexander Popov <alex.popov@linux.com>
---
 Documentation/security/self-protection.rst | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/Documentation/security/self-protection.rst b/Documentation/security/self-protection.rst
index e1ca698..452bc75 100644
--- a/Documentation/security/self-protection.rst
+++ b/Documentation/security/self-protection.rst
@@ -165,10 +165,15 @@ Stack depth overflow
 A less well understood attack is using a bug that triggers the
 kernel to consume stack memory with deep function calls or large stack
 allocations. With this attack it is possible to write beyond the end of
-the kernel's preallocated stack space and into sensitive structures. Two
-important changes need to be made for better protections: moving the
-sensitive thread_info structure elsewhere, and adding a faulting memory
-hole at the bottom of the stack to catch these overflows.
+the kernel's preallocated stack space and into sensitive structures.
+The combination of the following measures gives better protection:
+
+* moving the sensitive thread_info structure off the stack
+  (``CONFIG_THREAD_INFO_IN_TASK``);
+* adding a faulting memory hole at the bottom of the stack to catch
+  these overflows (``CONFIG_VMAP_STACK``);
+* runtime checking that alloca() calls don't overstep the stack boundary
+  (``CONFIG_GCC_PLUGIN_STACKLEAK``).
 
 Heap memory integrity
 ---------------------
@@ -302,11 +307,11 @@ sure structure holes are cleared.
 Memory poisoning
 ----------------
 
-When releasing memory, it is best to poison the contents (clear stack on
-syscall return, wipe heap memory on a free), to avoid reuse attacks that
-rely on the old contents of memory. This frustrates many uninitialized
-variable attacks, stack content exposures, heap content exposures, and
-use-after-free attacks.
+When releasing memory, it is best to poison the contents, to avoid reuse
+attacks that rely on the old contents of memory. E.g., clear stack on a
+syscall return (``CONFIG_GCC_PLUGIN_STACKLEAK``), wipe heap memory on a
+free. This frustrates many uninitialized variable attacks, stack content
+exposures, heap content exposures, and use-after-free attacks.
 
 Destination tracking
 --------------------
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v13 (resend) 2/6] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
  2018-06-22 16:58 ` [PATCH v13 (resend) 2/6] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls Alexander Popov
@ 2018-06-22 18:58   ` Laura Abbott
  2018-06-27  0:13     ` Kees Cook
  2018-07-05  8:12   ` Ingo Molnar
  1 sibling, 1 reply; 17+ messages in thread
From: Laura Abbott @ 2018-06-22 18:58 UTC (permalink / raw)
  To: Alexander Popov, kernel-hardening, Kees Cook, PaX Team,
	Brad Spengler, Ingo Molnar, Andy Lutomirski, Tycho Andersen,
	Mark Rutland, Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Linus Torvalds,
	Greg Kroah-Hartman, Dan Williams, Dave Hansen, Mathias Krause,
	Vikas Shivappa, Kyle Huey, Dmitry Safonov, Will Deacon,
	Arnd Bergmann, Florian Weimer, Boris Lukashev, Andrey Konovalov,
	x86, linux-kernel

On 06/22/2018 09:58 AM, Alexander Popov wrote:
> +	/*
> +	 * Now write the poison value to the kernel stack. Start from
> +	 * 'kstack_ptr' and move up till the new 'boundary'. We assume that
> +	 * the stack pointer doesn't change when we write poison.
> +	 */
> +	if (on_thread_stack())
> +		boundary = current_stack_pointer;
> +	else
> +		boundary = current_top_of_stack();
> +
> +	BUG_ON(boundary - kstack_ptr >= THREAD_SIZE);
> +
> +	while (kstack_ptr < boundary) {
> +		*(unsigned long *)kstack_ptr = STACKLEAK_POISON;
> +		kstack_ptr += sizeof(unsigned long);
> +	}
> +
> +	/* Reset the 'lowest_stack' value for the next syscall */
> +	current->lowest_stack = current_top_of_stack() - THREAD_SIZE / 64;

on_thread_stack() and current_top_of_stack() are x86 only
functions currently defined in asm/processor.h. There are
similar functions in arch/arm64/include/asm/stacktrace.h
which I think I can use to build those functions. Should
I just throw the arm64 versions in processor.h or do
we want to consider abstracting these into something like
asm/stackleak.h? I'd like to know a direction before I
start ripping apart stacktrace.h.

Thanks,
Laura

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v13 (resend) 2/6] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
  2018-06-22 18:58   ` Laura Abbott
@ 2018-06-27  0:13     ` Kees Cook
  2018-06-27  0:21       ` Kees Cook
  0 siblings, 1 reply; 17+ messages in thread
From: Kees Cook @ 2018-06-27  0:13 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Alexander Popov, Kernel Hardening, PaX Team, Brad Spengler,
	Ingo Molnar, Andy Lutomirski, Tycho Andersen, Mark Rutland,
	Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Linus Torvalds,
	Greg Kroah-Hartman, Dan Williams, Dave Hansen, Mathias Krause,
	Vikas Shivappa, Kyle Huey, Dmitry Safonov, Will Deacon,
	Arnd Bergmann, Florian Weimer, Boris Lukashev, Andrey Konovalov,
	X86 ML, LKML

On Fri, Jun 22, 2018 at 11:58 AM, Laura Abbott <labbott@redhat.com> wrote:
> On 06/22/2018 09:58 AM, Alexander Popov wrote:
>>
>> +       /*
>> +        * Now write the poison value to the kernel stack. Start from
>> +        * 'kstack_ptr' and move up till the new 'boundary'. We assume
>> that
>> +        * the stack pointer doesn't change when we write poison.
>> +        */
>> +       if (on_thread_stack())
>> +               boundary = current_stack_pointer;
>> +       else
>> +               boundary = current_top_of_stack();
>> +
>> +       BUG_ON(boundary - kstack_ptr >= THREAD_SIZE);
>> +
>> +       while (kstack_ptr < boundary) {
>> +               *(unsigned long *)kstack_ptr = STACKLEAK_POISON;
>> +               kstack_ptr += sizeof(unsigned long);
>> +       }
>> +
>> +       /* Reset the 'lowest_stack' value for the next syscall */
>> +       current->lowest_stack = current_top_of_stack() - THREAD_SIZE / 64;
>
>
> on_thread_stack() and current_top_of_stack() are x86 only
> functions currently defined in asm/processor.h. There are
> similar functions in arch/arm64/include/asm/stacktrace.h
> which I think I can use to build those functions. Should
> I just throw the arm64 versions in processor.h or do
> we want to consider abstracting these into something like
> asm/stackleak.h? I'd like to know a direction before I
> start ripping apart stacktrace.h.

static inline unsigned long current_top_of_stack(void)
{
        /*
         *  We can't read directly from tss.sp0: sp0 on x86_32 is special in
         *  and around vm86 mode and sp0 on x86_64 is special because of the
         *  entry trampoline.
         */
        return this_cpu_read_stable(cpu_current_top_of_stack);
}

static inline bool on_thread_stack(void)
{
        return (unsigned long)(current_top_of_stack() -
                               current_stack_pointer) < THREAD_SIZE;
}

The only arch-specific part is current_top_of_stack(), so it seems
like just solving current_top_of_stack() and copying on_thread_stack()
directly should work, yes? I would think sticking those directly into
asm/processor.h on arm64 would be okay?

FWIW, I don't think a separate asm/stackleak.h is needed.

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v13 (resend) 2/6] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
  2018-06-27  0:13     ` Kees Cook
@ 2018-06-27  0:21       ` Kees Cook
  2018-06-27  0:55         ` Laura Abbott
  0 siblings, 1 reply; 17+ messages in thread
From: Kees Cook @ 2018-06-27  0:21 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Alexander Popov, Kernel Hardening, PaX Team, Brad Spengler,
	Ingo Molnar, Andy Lutomirski, Tycho Andersen, Mark Rutland,
	Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Linus Torvalds,
	Greg Kroah-Hartman, Dan Williams, Dave Hansen, Mathias Krause,
	Vikas Shivappa, Kyle Huey, Dmitry Safonov, Will Deacon,
	Arnd Bergmann, Florian Weimer, Boris Lukashev, Andrey Konovalov,
	X86 ML, LKML

On Tue, Jun 26, 2018 at 5:13 PM, Kees Cook <keescook@chromium.org> wrote:
> The only arch-specific part is current_top_of_stack(), so it seems
> like just solving current_top_of_stack() and copying on_thread_stack()
> directly should work, yes? I would think sticking those directly into
> asm/processor.h on arm64 would be okay?

The difference in the earlier x86 and arm64 erase_kstack was:

x86:
        if (on_thread_stack())
                boundary = current_stack_pointer;
        else
                boundary = current_top_of_stack();

arm64:
        boundary = current_stack_pointer;

If a sane current_top_of_stack() isn't possible on arm64, I'm not
opposed to a small bit of arch-specific ifdef in there:

        boundary = get_stack_boundary();

static inline unsigned long get_stack_boundary(void)
{
#ifdef CONFIG_X86
        if (!on_thread_stack())
                return current_top_of_stack();
#else
         return current_stack_pointer;
}

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v13 (resend) 2/6] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
  2018-06-27  0:21       ` Kees Cook
@ 2018-06-27  0:55         ` Laura Abbott
  0 siblings, 0 replies; 17+ messages in thread
From: Laura Abbott @ 2018-06-27  0:55 UTC (permalink / raw)
  To: Kees Cook
  Cc: Alexander Popov, Kernel Hardening, PaX Team, Brad Spengler,
	Ingo Molnar, Andy Lutomirski, Tycho Andersen, Mark Rutland,
	Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Linus Torvalds,
	Greg Kroah-Hartman, Dan Williams, Dave Hansen, Mathias Krause,
	Vikas Shivappa, Kyle Huey, Dmitry Safonov, Will Deacon,
	Arnd Bergmann, Florian Weimer, Boris Lukashev, Andrey Konovalov,
	X86 ML, LKML

On 06/26/2018 05:21 PM, Kees Cook wrote:
> On Tue, Jun 26, 2018 at 5:13 PM, Kees Cook <keescook@chromium.org> wrote:
>> The only arch-specific part is current_top_of_stack(), so it seems
>> like just solving current_top_of_stack() and copying on_thread_stack()
>> directly should work, yes? I would think sticking those directly into
>> asm/processor.h on arm64 would be okay?
> 
> The difference in the earlier x86 and arm64 erase_kstack was:
> 
> x86:
>          if (on_thread_stack())
>                  boundary = current_stack_pointer;
>          else
>                  boundary = current_top_of_stack();
> 
> arm64:
>          boundary = current_stack_pointer;
> 
> If a sane current_top_of_stack() isn't possible on arm64, I'm not
> opposed to a small bit of arch-specific ifdef in there:
> 
>          boundary = get_stack_boundary();
> 
> static inline unsigned long get_stack_boundary(void)
> {
> #ifdef CONFIG_X86
>          if (!on_thread_stack())
>                  return current_top_of_stack();
> #else
>           return current_stack_pointer;
> }
> 
> -Kees
> 

I _think_ I have reasonable implementations for both functions,
I was asking about pulling stuff out into a new header because
I wanted to see about reusing on_task_stack which was defined
in asm/stacktrace.h and I was concerned about include nightmares
with asm/processor.h . I'll see what I come up with.

Thanks,
Laura

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v13 (resend) 2/6] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
  2018-06-22 16:58 ` [PATCH v13 (resend) 2/6] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls Alexander Popov
  2018-06-22 18:58   ` Laura Abbott
@ 2018-07-05  8:12   ` Ingo Molnar
  2018-07-05 10:00     ` Peter Zijlstra
  2018-07-05 21:55     ` Alexander Popov
  1 sibling, 2 replies; 17+ messages in thread
From: Ingo Molnar @ 2018-07-05  8:12 UTC (permalink / raw)
  To: Alexander Popov
  Cc: kernel-hardening, Kees Cook, PaX Team, Brad Spengler,
	Andy Lutomirski, Tycho Andersen, Laura Abbott, Mark Rutland,
	Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Linus Torvalds,
	Greg Kroah-Hartman, Dan Williams, Dave Hansen, Mathias Krause,
	Vikas Shivappa, Kyle Huey, Dmitry Safonov, Will Deacon,
	Arnd Bergmann, Florian Weimer, Boris Lukashev, Andrey Konovalov,
	x86, linux-kernel


* Alexander Popov <alex.popov@linux.com> wrote:

> The STACKLEAK feature erases the kernel stack before returning from
> syscalls. That reduces the information which kernel stack leak bugs can
> reveal and blocks some uninitialized stack variable attacks. Moreover,
> STACKLEAK blocks kernel stack depth overflow caused by alloca (aka
> Stack Clash attack).
> 
> This commit introduces the code filling the used part of the kernel
> stack with a poison value before returning to the userspace. Full
> STACKLEAK feature also contains the gcc plugin which comes in a
> separate commit.
> 
> The STACKLEAK feature is ported from grsecurity/PaX. More information at:
>   https://grsecurity.net/
>   https://pax.grsecurity.net/
> 
> This code is modified from Brad Spengler/PaX Team's code in the last
> public patch of grsecurity/PaX based on our understanding of the code.
> Changes or omissions from the original code are ours and don't reflect
> the original grsecurity/PaX code.
> 
> Signed-off-by: Alexander Popov <alex.popov@linux.com>
> Acked-by: Thomas Gleixner <tglx@linutronix.de>
> Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
> ---
>  Documentation/x86/x86_64/mm.txt  |  2 ++
>  arch/Kconfig                     | 27 +++++++++++++++++
>  arch/x86/Kconfig                 |  1 +
>  arch/x86/entry/calling.h         | 14 +++++++++
>  arch/x86/entry/entry_32.S        |  7 +++++
>  arch/x86/entry/entry_64.S        |  3 ++
>  arch/x86/entry/entry_64_compat.S |  5 ++++
>  include/linux/sched.h            |  4 +++
>  include/linux/stackleak.h        | 24 +++++++++++++++
>  kernel/Makefile                  |  4 +++
>  kernel/fork.c                    |  3 ++
>  kernel/stackleak.c               | 63 ++++++++++++++++++++++++++++++++++++++++
>  12 files changed, 157 insertions(+)
>  create mode 100644 include/linux/stackleak.h
>  create mode 100644 kernel/stackleak.c
> 
> diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt
> index 5432a96..600bc2a 100644
> --- a/Documentation/x86/x86_64/mm.txt
> +++ b/Documentation/x86/x86_64/mm.txt
> @@ -24,6 +24,7 @@ ffffffffa0000000 - fffffffffeffffff (1520 MB) module mapping space
>  [fixmap start]   - ffffffffff5fffff kernel-internal fixmap range
>  ffffffffff600000 - ffffffffff600fff (=4 kB) legacy vsyscall ABI
>  ffffffffffe00000 - ffffffffffffffff (=2 MB) unused hole
> +STACKLEAK_POISON value in this last hole: ffffffffffff4111
>  
>  Virtual memory map with 5 level page tables:
>  
> @@ -50,6 +51,7 @@ ffffffffa0000000 - fffffffffeffffff (1520 MB) module mapping space
>  [fixmap start]   - ffffffffff5fffff kernel-internal fixmap range
>  ffffffffff600000 - ffffffffff600fff (=4 kB) legacy vsyscall ABI
>  ffffffffffe00000 - ffffffffffffffff (=2 MB) unused hole
> +STACKLEAK_POISON value in this last hole: ffffffffffff4111
>  
>  Architecture defines a 64-bit virtual address. Implementations can support
>  less. Currently supported are 48- and 57-bit virtual addresses. Bits 63
> diff --git a/arch/Kconfig b/arch/Kconfig
> index 1aa5906..57817f0 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -414,6 +414,13 @@ config PLUGIN_HOSTCC
>  	  Host compiler used to build GCC plugins.  This can be $(HOSTCXX),
>  	  $(HOSTCC), or a null string if GCC plugin is unsupported.
>  
> +config HAVE_ARCH_STACKLEAK
> +	bool
> +	help
> +	  An architecture should select this if it has the code which
> +	  fills the used part of the kernel stack with the STACKLEAK_POISON
> +	  value before returning from system calls.
> +
>  config HAVE_GCC_PLUGINS
>  	bool
>  	help
> @@ -549,6 +556,26 @@ config GCC_PLUGIN_RANDSTRUCT_PERFORMANCE
>  	  in structures.  This reduces the performance hit of RANDSTRUCT
>  	  at the cost of weakened randomization.
>  
> +config GCC_PLUGIN_STACKLEAK
> +	bool "Erase the kernel stack before returning from syscalls"
> +	depends on GCC_PLUGINS
> +	depends on HAVE_ARCH_STACKLEAK
> +	help
> +	  This option makes the kernel erase the kernel stack before
> +	  returning from system calls. That reduces the information which
> +	  kernel stack leak bugs can reveal and blocks some uninitialized
> +	  stack variable attacks. This option also blocks kernel stack depth
> +	  overflow caused by alloca (aka Stack Clash attack).

Nit, please pick one of these variants to refer to library functions:

	  overflow caused by 'alloca' (aka Stack Clash attack).
	  overflow caused by alloca() (aka Stack Clash attack).

Like you correctly did later on in a C comment block:

> + * STACKLEAK blocks stack depth overflow caused by alloca() (aka Stack Clash
> + * attack).
> + */


> +	  The tradeoff is the performance impact: on a single CPU system kernel
> +	  compilation sees a 1% slowdown, other systems and workloads may vary
> +	  and you are advised to test this feature on your expected workload
> +	  before deploying it.

Is there a way to patch this out runtime? I.e. if a distro enabled it, is there an 
easy way to disable much of the overhead without rebooting the kernel?

> --- /dev/null
> +++ b/include/linux/stackleak.h
> @@ -0,0 +1,24 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _LINUX_STACKLEAK_H
> +#define _LINUX_STACKLEAK_H
> +
> +#include <linux/sched.h>
> +#include <linux/sched/task_stack.h>
> +
> +/*
> + * Check that the poison value points to the unused hole in the
> + * virtual memory map for your platform.
> + */
> +#define STACKLEAK_POISON -0xBEEF
> +
> +#define STACKLEAK_POISON_CHECK_DEPTH 128
> +
> +static inline void stackleak_task_init(struct task_struct *task)
> +{
> +#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
> +	task->lowest_stack = (unsigned long)end_of_stack(task) +
> +						sizeof(unsigned long);
> +#endif

Please don't break the line in such an ugly fashion - just keep the line long and 
ignore checkpatch, because the cure is worse than the disease.

> --- /dev/null
> +++ b/kernel/stackleak.c
> @@ -0,0 +1,63 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * This code fills the used part of the kernel stack with a poison value
> + * before returning to the userspace. It's a part of the STACKLEAK feature
> + * ported from grsecurity/PaX.

s/returning to the userspace
 /returning to userspace

s/It's a part of the STACKLEAK feature
 /It's part of the STACKLEAK feature

> + * Author: Alexander Popov <alex.popov@linux.com>
> + *
> + * STACKLEAK reduces the information which kernel stack leak bugs can
> + * reveal and blocks some uninitialized stack variable attacks. Moreover,
> + * STACKLEAK blocks stack depth overflow caused by alloca() (aka Stack Clash
> + * attack).
> + */
> +
> +#include <linux/stackleak.h>
> +
> +asmlinkage void stackleak_erase_kstack(void)

s/stackleak_erase_kstack
 /stackleak_erase

?

> +{
> +	/* It would be nice not to have 'kstack_ptr' and 'boundary' on stack */
> +	unsigned long kstack_ptr = current->lowest_stack;
> +	unsigned long boundary = kstack_ptr & ~(THREAD_SIZE - 1);
> +	unsigned int poison_count = 0;
> +	const unsigned int check_depth =
> +			STACKLEAK_POISON_CHECK_DEPTH / sizeof(unsigned long);

ugly linebreak.

> +
> +	/* Search for the poison value in the kernel stack */
> +	while (kstack_ptr > boundary && poison_count <= check_depth) {
> +		if (*(unsigned long *)kstack_ptr == STACKLEAK_POISON)
> +			poison_count++;
> +		else
> +			poison_count = 0;
> +
> +		kstack_ptr -= sizeof(unsigned long);
> +	}
> +
> +	/*
> +	 * One 'long int' at the bottom of the thread stack is reserved and
> +	 * should not be poisoned (see CONFIG_SCHED_STACK_END_CHECK).

Nit:

  s/CONFIG_SCHED_STACK_END_CHECK
   /CONFIG_SCHED_STACK_END_CHECK=y

> +	 */
> +	if (kstack_ptr == boundary)
> +		kstack_ptr += sizeof(unsigned long);
> +
> +	/*
> +	 * Now write the poison value to the kernel stack. Start from
> +	 * 'kstack_ptr' and move up till the new 'boundary'. We assume that
> +	 * the stack pointer doesn't change when we write poison.
> +	 */
> +	if (on_thread_stack())
> +		boundary = current_stack_pointer;
> +	else
> +		boundary = current_top_of_stack();
> +
> +	BUG_ON(boundary - kstack_ptr >= THREAD_SIZE);

Should never happen, right? If so then please make this:

	if (WARN_ON(boundary - kstack_ptr >= THREAD_SIZE))
		return;

or so, to make it non-fatal and to allow users to report it, should it trigger 
against all expectations.

> +
> +	while (kstack_ptr < boundary) {
> +		*(unsigned long *)kstack_ptr = STACKLEAK_POISON;
> +		kstack_ptr += sizeof(unsigned long);
> +	}
> +
> +	/* Reset the 'lowest_stack' value for the next syscall */
> +	current->lowest_stack = current_top_of_stack() - THREAD_SIZE / 64;
> +}

Nit, I'd write this as:

	current->lowest_stack = current_top_of_stack() - THREAD_SIZE/64;

to make it group better visually. (Again, ignore checkpatch if it complains, it's 
wrong.)

Overall I like the interface cleanups in v13, so if these nits are addressed and 
it becomes possible for (root users) to disable the checking then I suppose this 
is fine.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v13 (resend) 2/6] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
  2018-07-05  8:12   ` Ingo Molnar
@ 2018-07-05 10:00     ` Peter Zijlstra
  2018-07-05 10:35       ` Ingo Molnar
  2018-07-05 21:55     ` Alexander Popov
  1 sibling, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2018-07-05 10:00 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Alexander Popov, kernel-hardening, Kees Cook, PaX Team,
	Brad Spengler, Andy Lutomirski, Tycho Andersen, Laura Abbott,
	Mark Rutland, Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Dmitry V . Levin, Emese Revfy,
	Jonathan Corbet, Andrey Ryabinin, Kirill A . Shutemov,
	Thomas Garnier, Andrew Morton, Alexei Starovoitov, Josef Bacik,
	Masami Hiramatsu, Nicholas Piggin, Al Viro, David S . Miller,
	Ding Tianhong, David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Linus Torvalds,
	Greg Kroah-Hartman, Dan Williams, Dave Hansen, Mathias Krause,
	Vikas Shivappa, Kyle Huey, Dmitry Safonov, Will Deacon,
	Arnd Bergmann, Florian Weimer, Boris Lukashev, Andrey Konovalov,
	x86, linux-kernel

On Thu, Jul 05, 2018 at 10:12:37AM +0200, Ingo Molnar wrote:
> * Alexander Popov <alex.popov@linux.com> wrote:
> > +static inline void stackleak_task_init(struct task_struct *task)
> > +{
> > +#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
> > +	task->lowest_stack = (unsigned long)end_of_stack(task) +
> > +						sizeof(unsigned long);
> > +#endif
> 
> Please don't break the line in such an ugly fashion - just keep the line long and 
> ignore checkpatch, because the cure is worse than the disease.

FWIW I don't totally hate on:

	task->lowest_stack =
		(unsigned long)end_of_stack(task) + sizeof(unsigned long);

which is a much more readable split.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v13 (resend) 2/6] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
  2018-07-05 10:00     ` Peter Zijlstra
@ 2018-07-05 10:35       ` Ingo Molnar
  0 siblings, 0 replies; 17+ messages in thread
From: Ingo Molnar @ 2018-07-05 10:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Alexander Popov, kernel-hardening, Kees Cook, PaX Team,
	Brad Spengler, Andy Lutomirski, Tycho Andersen, Laura Abbott,
	Mark Rutland, Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Dmitry V . Levin, Emese Revfy,
	Jonathan Corbet, Andrey Ryabinin, Kirill A . Shutemov,
	Thomas Garnier, Andrew Morton, Alexei Starovoitov, Josef Bacik,
	Masami Hiramatsu, Nicholas Piggin, Al Viro, David S . Miller,
	Ding Tianhong, David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Linus Torvalds,
	Greg Kroah-Hartman, Dan Williams, Dave Hansen, Mathias Krause,
	Vikas Shivappa, Kyle Huey, Dmitry Safonov, Will Deacon,
	Arnd Bergmann, Florian Weimer, Boris Lukashev, Andrey Konovalov,
	x86, linux-kernel


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Thu, Jul 05, 2018 at 10:12:37AM +0200, Ingo Molnar wrote:
> > * Alexander Popov <alex.popov@linux.com> wrote:
> > > +static inline void stackleak_task_init(struct task_struct *task)
> > > +{
> > > +#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
> > > +	task->lowest_stack = (unsigned long)end_of_stack(task) +
> > > +						sizeof(unsigned long);
> > > +#endif
> > 
> > Please don't break the line in such an ugly fashion - just keep the line long and 
> > ignore checkpatch, because the cure is worse than the disease.
> 
> FWIW I don't totally hate on:
> 
> 	task->lowest_stack =
> 		(unsigned long)end_of_stack(task) + sizeof(unsigned long);
> 
> which is a much more readable split.

Yeah, or rename 'task' to the regular 't' or 'p' and do:

	t->lowest_stack = (long)end_of_stack(t) + sizeof(long);

;-)

	Ingo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v13 (resend) 2/6] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
  2018-07-05  8:12   ` Ingo Molnar
  2018-07-05 10:00     ` Peter Zijlstra
@ 2018-07-05 21:55     ` Alexander Popov
  2018-07-05 22:20       ` Ingo Molnar
  1 sibling, 1 reply; 17+ messages in thread
From: Alexander Popov @ 2018-07-05 21:55 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: kernel-hardening, Kees Cook, PaX Team, Brad Spengler,
	Andy Lutomirski, Tycho Andersen, Laura Abbott, Mark Rutland,
	Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Linus Torvalds,
	Greg Kroah-Hartman, Dan Williams, Dave Hansen, Mathias Krause,
	Vikas Shivappa, Kyle Huey, Dmitry Safonov, Will Deacon,
	Arnd Bergmann, Florian Weimer, Boris Lukashev, Andrey Konovalov,
	x86, linux-kernel

Hello Ingo,

Thanks for your review! I'll fix the style issues you pointed at.

Please also see my answers below.

On 05.07.2018 11:12, Ingo Molnar wrote:
>> +	  The tradeoff is the performance impact: on a single CPU system kernel
>> +	  compilation sees a 1% slowdown, other systems and workloads may vary
>> +	  and you are advised to test this feature on your expected workload
>> +	  before deploying it.
> 
> Is there a way to patch this out runtime? I.e. if a distro enabled it, is there an 
> easy way to disable much of the overhead without rebooting the kernel?

Hm. We can't completely disable STACKLEAK in runtime, since STACKLEAK gcc plugin
performs compile-time instrumentation of the kernel code. So we can only chop
off a part of functionality, for example, by introducing some variable and
checking it before every stack erasing (additional performance impact), but the
kernel will stay uselessly instrumented. It doesn't look reasonable to me.


>> +	BUG_ON(boundary - kstack_ptr >= THREAD_SIZE);
> 
> Should never happen, right? 

Yes. It can happen only if 'current->lowest_stack' was corrupted.

> If so then please make this:
> 
> 	if (WARN_ON(boundary - kstack_ptr >= THREAD_SIZE))
> 		return;
> 
> or so, to make it non-fatal and to allow users to report it, should it trigger 
> against all expectations.

I've made an experiment. The results:
 1. BUG_ON() here doesn't freeze the kernel output - I see a full 'PANIC: double
fault' report;
 2. WARN_ON() here gives absolutely same 'PANIC: double fault' here.

So there is no reason to introduce the surplus "WARN_ON() + return" logic here.


> Overall I like the interface cleanups in v13, so if these nits are addressed and 
> it becomes possible for (root users) to disable the checking then I suppose this 
> is fine.

Thanks a lot for your positive attitude.

Best regards,
Alexander

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v13 (resend) 2/6] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
  2018-07-05 21:55     ` Alexander Popov
@ 2018-07-05 22:20       ` Ingo Molnar
  2018-07-06 11:46         ` Alexander Popov
  0 siblings, 1 reply; 17+ messages in thread
From: Ingo Molnar @ 2018-07-05 22:20 UTC (permalink / raw)
  To: Alexander Popov
  Cc: kernel-hardening, Kees Cook, PaX Team, Brad Spengler,
	Andy Lutomirski, Tycho Andersen, Laura Abbott, Mark Rutland,
	Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Linus Torvalds,
	Greg Kroah-Hartman, Dan Williams, Dave Hansen, Mathias Krause,
	Vikas Shivappa, Kyle Huey, Dmitry Safonov, Will Deacon,
	Arnd Bergmann, Florian Weimer, Boris Lukashev, Andrey Konovalov,
	x86, linux-kernel


* Alexander Popov <alex.popov@linux.com> wrote:

> Hello Ingo,
> 
> Thanks for your review! I'll fix the style issues you pointed at.
> 
> Please also see my answers below.
> 
> On 05.07.2018 11:12, Ingo Molnar wrote:
> >> +	  The tradeoff is the performance impact: on a single CPU system kernel
> >> +	  compilation sees a 1% slowdown, other systems and workloads may vary
> >> +	  and you are advised to test this feature on your expected workload
> >> +	  before deploying it.
> > 
> > Is there a way to patch this out runtime? I.e. if a distro enabled it, is there an 
> > easy way to disable much of the overhead without rebooting the kernel?
> 
> Hm. We can't completely disable STACKLEAK in runtime, since STACKLEAK gcc plugin
> performs compile-time instrumentation of the kernel code. So we can only chop
> off a part of functionality, for example, by introducing some variable and
> checking it before every stack erasing (additional performance impact), but the
> kernel will stay uselessly instrumented. It doesn't look reasonable to me.

Or we could use what every other performance critical instrumentation feature uses 
to reduce overhead (ftrace, perf): kernel patching.

> > If so then please make this:
> > 
> > 	if (WARN_ON(boundary - kstack_ptr >= THREAD_SIZE))
> > 		return;
> > 
> > or so, to make it non-fatal and to allow users to report it, should it trigger 
> > against all expectations.
> 
> I've made an experiment. The results:
>  1. BUG_ON() here doesn't freeze the kernel output - I see a full 'PANIC: double
> fault' report;

Only in text mode - very few users are using text mode.

>  2. WARN_ON() here gives absolutely same 'PANIC: double fault' here.

that should only happen if the kernel is otherwise already fatally corrupted, 
right?

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v13 (resend) 2/6] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
  2018-07-05 22:20       ` Ingo Molnar
@ 2018-07-06 11:46         ` Alexander Popov
  0 siblings, 0 replies; 17+ messages in thread
From: Alexander Popov @ 2018-07-06 11:46 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: kernel-hardening, Kees Cook, PaX Team, Brad Spengler,
	Andy Lutomirski, Tycho Andersen, Laura Abbott, Mark Rutland,
	Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Linus Torvalds,
	Greg Kroah-Hartman, Dan Williams, Dave Hansen, Mathias Krause,
	Vikas Shivappa, Kyle Huey, Dmitry Safonov, Will Deacon,
	Arnd Bergmann, Florian Weimer, Boris Lukashev, Andrey Konovalov,
	x86, linux-kernel

On 06.07.2018 01:20, Ingo Molnar wrote:
> 
> * Alexander Popov <alex.popov@linux.com> wrote:
> 
>> Hello Ingo,
>>
>> Thanks for your review! I'll fix the style issues you pointed at.
>>
>> Please also see my answers below.
>>
>> On 05.07.2018 11:12, Ingo Molnar wrote:
>>>> +	  The tradeoff is the performance impact: on a single CPU system kernel
>>>> +	  compilation sees a 1% slowdown, other systems and workloads may vary
>>>> +	  and you are advised to test this feature on your expected workload
>>>> +	  before deploying it.
>>>
>>> Is there a way to patch this out runtime? I.e. if a distro enabled it, is there an 
>>> easy way to disable much of the overhead without rebooting the kernel?
>>
>> Hm. We can't completely disable STACKLEAK in runtime, since STACKLEAK gcc plugin
>> performs compile-time instrumentation of the kernel code. So we can only chop
>> off a part of functionality, for example, by introducing some variable and
>> checking it before every stack erasing (additional performance impact), but the
>> kernel will stay uselessly instrumented. It doesn't look reasonable to me.
> 
> Or we could use what every other performance critical instrumentation feature uses 
> to reduce overhead (ftrace, perf): kernel patching.

I see. It would be a big separate research - how to combine those different
kinds of instrumentation. I would propose to postpone it until we have a request
for STACKLEAK runtime disabling.

>>> If so then please make this:
>>>
>>> 	if (WARN_ON(boundary - kstack_ptr >= THREAD_SIZE))
>>> 		return;
>>>
>>> or so, to make it non-fatal and to allow users to report it, should it trigger 
>>> against all expectations.
>>
>> I've made an experiment. The results:
>>  1. BUG_ON() here doesn't freeze the kernel output - I see a full 'PANIC: double
>> fault' report;
> 
> Only in text mode - very few users are using text mode.
> 
>>  2. WARN_ON() here gives absolutely same 'PANIC: double fault' here.
> 
> that should only happen if the kernel is otherwise already fatally corrupted, 
> right?

No, I mean WARN_ON() in stackleak_erase_kstack() gives the double fault just
like BUG_ON() (without any corruption). In my experiment I've made the following
change:

-       BUG_ON(boundary - kstack_ptr >= THREAD_SIZE);
+//     BUG_ON(boundary - kstack_ptr >= THREAD_SIZE);
+       WARN_ON(1);

It might be caused by the fact, that stackleak_erase_kstack() is called from the
trampoline stack just before returning to the userspace.

So I mean 'WARN_ON() + return' here wouldn't give any profit over a single
BUG_ON() check.

Best regards,
Alexander

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2018-07-06 11:46 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-22 16:58 [PATCH v13 (resend) 0/6] Introduce the STACKLEAK feature and a test for it Alexander Popov
2018-06-22 16:58 ` [PATCH v13 (resend) 1/6] gcc-plugins: Clean up the cgraph_create_edge* macros Alexander Popov
2018-06-22 16:58 ` [PATCH v13 (resend) 2/6] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls Alexander Popov
2018-06-22 18:58   ` Laura Abbott
2018-06-27  0:13     ` Kees Cook
2018-06-27  0:21       ` Kees Cook
2018-06-27  0:55         ` Laura Abbott
2018-07-05  8:12   ` Ingo Molnar
2018-07-05 10:00     ` Peter Zijlstra
2018-07-05 10:35       ` Ingo Molnar
2018-07-05 21:55     ` Alexander Popov
2018-07-05 22:20       ` Ingo Molnar
2018-07-06 11:46         ` Alexander Popov
2018-06-22 16:58 ` [PATCH v13 (resend) 3/6] gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack Alexander Popov
2018-06-22 16:58 ` [PATCH v13 (resend) 4/6] lkdtm: Add a test for STACKLEAK Alexander Popov
2018-06-22 16:58 ` [PATCH v13 (resend) 5/6] fs/proc: Show STACKLEAK metrics in the /proc file system Alexander Popov
2018-06-22 16:58 ` [PATCH v13 (resend) 6/6] doc: self-protection: Add information about STACKLEAK feature Alexander Popov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.