linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RFC v9 0/7] Introduce the STACKLEAK feature and a test for it
@ 2018-03-03 20:00 Alexander Popov
  2018-03-03 20:00 ` [PATCH RFC v9 1/7] gcc-plugins: Clean up the cgraph_create_edge* macros Alexander Popov
                   ` (7 more replies)
  0 siblings, 8 replies; 59+ messages in thread
From: Alexander Popov @ 2018-03-03 20:00 UTC (permalink / raw)
  To: kernel-hardening, Kees Cook, PaX Team, Brad Spengler,
	Ingo Molnar, Andy Lutomirski, Tycho Andersen, Laura Abbott,
	Mark Rutland, Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Dave Hansen, Mathias Krause, Vikas Shivappa,
	Kyle Huey, Dmitry Safonov, Will Deacon, Arnd Bergmann, x86,
	linux-kernel, alex.popov

This is the 9th version of the patch series introducing STACKLEAK to the
mainline kernel. STACKLEAK is a security feature developed by Grsecurity/PaX
(kudos to them), which:
 - reduces the information that can be revealed through kernel stack leak bugs;
 - blocks some uninitialized stack variable attacks (e.g. CVE-2017-17712,
    CVE-2010-2963);
 - introduces some runtime checks for kernel stack overflow detection.

Changes in v9
=============

1. Code style improvements (according to the feedback from Borislav Petkov):
   capitalized the ERASE_KSTACK macro, changed numeric labels to symbolic ones,
   improved the comments and documented STACKLEAK_POISON in mm.txt.

2. STACKLEAK gcc plugin improvements (special thanks to Laura Abbott
   and Richard Sandiford):
    - cleaned up the cgraph_create_edge* macros (in a separate commit);
    - introduced large_stack_frame() to handle get_frame_size() change in gcc-8;
    - considered the NOTE_INSN_CALL_ARG_LOCATION absence in gcc-8;
    - put the STACKLEAK RTL pass after the "reload" pass.

STACKLEAK instrumentation statistics
====================================

These numbers were obtained for the 4th version of the patch series.

Size of vmlinux (x86_64_defconfig):
 file size:
  - STACKLEAK disabled: 35014784 bytes
  - STACKLEAK enabled: 35044952 bytes (+0.086%)
 .text section size (calculated by size utility):
  - STACKLEAK disabled: 10752983
  - STACKLEAK enabled: 11062221 (+2.876%)

The readelf utility shows 45602 functions in vmlinux. The STACKLEAK gcc plugin
inserted 36 check_alloca() calls and 1265 track_stack() calls (42274 calls are
inserted during GIMPLE pass and 41009 calls are deleted during RTL pass).
So 2.853% of functions are instrumented.

STACKLEAK performance impact
============================

The STACKLEAK description in Kconfig includes:
"The tradeoff is the performance impact: on a single CPU system kernel
compilation sees a 1% slowdown, other systems and workloads may vary and you are
advised to test this feature on your expected workload before deploying it".

Here are the results of a brief performance test on x86_64 (for the 2nd version
of the patch). The numbers are very different because the STACKLEAK performance
penalty depends on the workloads.

Hardware: Intel Core i7-4770, 16 GB RAM

Test #1: building the Linux kernel with Ubuntu config (time make -j9)
Result on 4.11-rc8:
  real	32m14.893s
  user	237m30.886s
  sys	11m12.273s
Result on 4.11-rc8+stackleak:
  real	32m26.881s (+0.62%)
  user	238m38.926s (+0.48%)
  sys	11m36.426s (+3.59%)

Test #2: hackbench -s 4096 -l 2000 -g 15 -f 25 -P
Average result on 4.11-rc8: 8.71s
Average result on 4.11-rc8+stackleak: 9.08s (+4.29%)

Changelog
=========

Changes in v8

1. Renamed the STACKLEAK tests to STACKLEAK_ALLOCA and STACKLEAK_DEEP_RECURSION.
   Fixed some obsolete comments there.

2. Added the comments describing stack erasing on x86_32, because it differs
   a lot from one on x86_64 since v7.

Changes in v7

1. Improved erase_kstack() on x86_64. Now it detects which stack we are
   currently using. If we are on the thread stack, erase_kstack() writes the
   poison value up to the stack pointer. If we are not on the thread stack (we
   are on the trampoline stack introduced by Andy Lutomirski), erase_kstack()
   writes the poison value up to the cpu_current_top_of_stack.

   N.B. Right now there are two situations when erase_kstack() is called
   on the thread stack:
    - from sysret32_from_system_call in entry_64_compat.S;
    - from syscall_trace_enter() (see the 3rd patch of this series).

2. Introduced STACKLEAK_POISON_CHECK_DEPTH macro and BUILD_BUG_ON() checking
   its consistency with CONFIG_STACKLEAK_TRACK_MIN_SIZE.

3. Added "disable" option for the STACKLEAK gcc plugin (asked by Laura Abbott).

4. Improved CONFIG_STACKLEAK_METRICS. Now /proc/<pid>/stack_depth shows
   the maximum kernel stack consumption for the current and previous syscalls.
   Although this information is not precise, it can be useful for estimating
   the STACKLEAK performance impact for different workloads.

5. Removed redundant erase_kstack() calls from syscall_trace_enter().
   There is no need to erase the stack in syscall_trace_enter() when it
   returns -1 (thanks to Dmitry Levin for noticing that).

6. Introduced MIN_STACK_LEFT to get rid of hardcoded numbers in check_alloca().

Changes in v6

1. Examined syscall entry/exit paths.
    - Added missing erase_kstack() call at ret_from_fork() for x86_32.
    - Added missing erase_kstack() call at syscall_trace_enter().
    - Solved questions previously marked with TODO.

2. Rebased onto v4.15-rc2, which includes Andy Lutomirski's entry changes.
   Andy removed sp0 from thread_struct for x86_64, which was the only issue
   during rebasing.

3. Removed the recursive BUG() in track_stack() that was caused by the code
   instrumentation. Instead, CONFIG_GCC_PLUGIN_STACKLEAK now implies
   CONFIG_VMAP_STACK and CONFIG_SCHED_STACK_END_CHECK, which seems to be
   an optimal solution.

4. Put stack erasing in syscall_trace_enter() into a separate patch and
   fixed my mistake with secure_computing() (found by Tycho Andersen).

5. After some experiments, kept the asm implementation of erase_kstack(),
   because it gives a full control over the stack for clearing it neatly
   and doesn't offend KASAN.

6. Improved the comments describing STACKLEAK.

Changes in v5 (mostly according to the feedback from Ingo Molnar)

1. Introduced the CONFIG_STACKLEAK_METRICS providing STACKLEAK information
   about tasks via the /proc file system. That information can be useful for
   estimating the STACKLEAK performance impact for different workloads.
   In particular, /proc/<pid>/lowest_stack shows the current lowest_stack
   value and its final value from the previous syscall.

2. Introduced a single check_alloca() implementation working for both
   x86_64 and x86_32.

3. Fixed coding style issues and did some refactoring in the STACKLEAK
   gcc plugin.

4. Put the gcc plugin and the kernel stack erasing into separate (working)
   patches.

Changes in v4

1. Introduced the CONFIG_STACKLEAK_TRACK_MIN_SIZE parameter instead of
   hard-coded track-lowest-sp.

2. Carefully looked into the assertions in track_stack() and check_alloca().
    - Fixed the incorrect BUG() condition in track_stack(), thanks to Tycho
       Andersen. Also disabled that check if CONFIG_VMAP_STACK is enabled.
    - Fixed the surplus and erroneous code for calculating stack_left in
       check_alloca() on x86_64. That code repeats the work which is already
       done in get_stack_info() and it misses the fact that different
       exception stacks on x86_64 have different size.

3. Introduced two lkdtm tests for the STACKLEAK feature (developed together
   with Tycho Andersen).

4. Added info about STACKLEAK to Documentation/security/self-protection.rst.

5. Improved the comments.

6. Decided not to change "unsigned long sp = (unsigned long)&sp" to
   current_stack_pointer. The original variant is more platform independent
   since current_stack_pointer has different type on x86 and arm.

Changes in v3

1. Added the detailed comments describing the STACKLEAK gcc plugin.
   Read the plugin from the bottom up, like you do for Linux kernel drivers.
   Additional information:
    - gcc internals documentation available from gcc sources;
    - wonderful slides by Diego Novillo
       https://www.airs.com/dnovillo/200711-GCC-Internals/
    - nice introduction to gcc plugins at LWN
       https://lwn.net/Articles/457543/

2. Improved the commit message and Kconfig description according to the
   feedback from Kees Cook. Also added the original notice describing
   the performance impact of the STACKLEAK feature.

3. Removed the arch-specific ix86_cmodel check in stackleak_track_stack_gate().
   It caused skipping the kernel code instrumentation for i386.

4. Fixed a minor mistake in stackleak_tree_instrument_execute().
   First versions of the plugin used ENTRY_BLOCK_PTR_FOR_FN(cfun)->next_bb
   to get the basic block with the function prologue. That was not correct
   since the control flow graph edge from ENTRY_BLOCK_PTR doesn't always
   go to that basic block.

   So later it was changed to single_succ(ENTRY_BLOCK_PTR_FOR_FN(cfun)),
   but not completely. next_bb was still used for entry_bb assignment,
   which could cause the wrong value of the prologue_instrumented variable.

   I've reported this issue to Grsecurity and proposed the fix for it, but
   unfortunately didn't get any reply.

5. Introduced the STACKLEAK_POISON macro and renamed the config option
   according to the feedback from Kees Cook.

Ideas for further work
======================

 - Try to port STACKLEAK to arm64 (Laura Abbott is working on it).
 - Think of erasing stack on the way out of exception handlers (idea by
   Andy Lutomirski).
 - Think of erasing the kernel stack after invoking EFI runtime services
   (idea by Mark Rutland).


Alexander Popov (7):
  gcc-plugins: Clean up the cgraph_create_edge* macros
  x86/entry: Add STACKLEAK erasing the kernel stack at the end of
    syscalls
  gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack
  x86/entry: Erase kernel stack in syscall_trace_enter()
  lkdtm: Add a test for STACKLEAK
  fs/proc: Show STACKLEAK metrics in the /proc file system
  doc: self-protection: Add information about STACKLEAK feature

 Documentation/security/self-protection.rst |  23 +-
 Documentation/x86/x86_64/mm.txt            |   2 +
 arch/Kconfig                               |  53 ++++
 arch/x86/Kconfig                           |   1 +
 arch/x86/entry/common.c                    |   7 +
 arch/x86/entry/entry_32.S                  |  92 ++++++
 arch/x86/entry/entry_64.S                  | 112 +++++++
 arch/x86/entry/entry_64_compat.S           |  11 +
 arch/x86/include/asm/processor.h           |   7 +
 arch/x86/kernel/asm-offsets.c              |  11 +
 arch/x86/kernel/dumpstack.c                |  20 ++
 arch/x86/kernel/process_32.c               |   8 +
 arch/x86/kernel/process_64.c               |   8 +
 drivers/misc/Makefile                      |   3 +
 drivers/misc/lkdtm.h                       |   4 +
 drivers/misc/lkdtm_core.c                  |   2 +
 drivers/misc/lkdtm_stackleak.c             | 136 +++++++++
 fs/exec.c                                  |  33 ++
 fs/proc/base.c                             |  18 ++
 include/linux/compiler.h                   |   6 +
 scripts/Makefile.gcc-plugins               |   3 +
 scripts/gcc-plugins/gcc-common.h           |  26 +-
 scripts/gcc-plugins/stackleak_plugin.c     | 470 +++++++++++++++++++++++++++++
 23 files changed, 1037 insertions(+), 19 deletions(-)
 create mode 100644 drivers/misc/lkdtm_stackleak.c
 create mode 100644 scripts/gcc-plugins/stackleak_plugin.c

-- 
2.7.4

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH RFC v9 1/7] gcc-plugins: Clean up the cgraph_create_edge* macros
  2018-03-03 20:00 [PATCH RFC v9 0/7] Introduce the STACKLEAK feature and a test for it Alexander Popov
@ 2018-03-03 20:00 ` Alexander Popov
  2018-03-03 20:00 ` [PATCH RFC v9 2/7] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls Alexander Popov
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 59+ messages in thread
From: Alexander Popov @ 2018-03-03 20:00 UTC (permalink / raw)
  To: kernel-hardening, Kees Cook, PaX Team, Brad Spengler,
	Ingo Molnar, Andy Lutomirski, Tycho Andersen, Laura Abbott,
	Mark Rutland, Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Dave Hansen, Mathias Krause, Vikas Shivappa,
	Kyle Huey, Dmitry Safonov, Will Deacon, Arnd Bergmann, x86,
	linux-kernel, alex.popov

Drop useless redefinitions of cgraph_create_edge* macros. Drop the unused
nest argument. Also support gcc-8, which doesn't have freq argument.

Signed-off-by: Alexander Popov <alex.popov@linux.com>
---
 scripts/gcc-plugins/gcc-common.h | 26 ++++++++++++++++----------
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/scripts/gcc-plugins/gcc-common.h b/scripts/gcc-plugins/gcc-common.h
index f467500..552d5ef 100644
--- a/scripts/gcc-plugins/gcc-common.h
+++ b/scripts/gcc-plugins/gcc-common.h
@@ -392,13 +392,6 @@ static inline struct cgraph_node *cgraph_alias_target(struct cgraph_node *n)
 }
 #endif
 
-#if BUILDING_GCC_VERSION >= 4007 && BUILDING_GCC_VERSION <= 4009
-#define cgraph_create_edge(caller, callee, call_stmt, count, freq, nest) \
-	cgraph_create_edge((caller), (callee), (call_stmt), (count), (freq))
-#define cgraph_create_edge_including_clones(caller, callee, old_call_stmt, call_stmt, count, freq, nest, reason) \
-	cgraph_create_edge_including_clones((caller), (callee), (old_call_stmt), (call_stmt), (count), (freq), (reason))
-#endif
-
 #if BUILDING_GCC_VERSION <= 4008
 #define ENTRY_BLOCK_PTR_FOR_FN(FN)	ENTRY_BLOCK_PTR_FOR_FUNCTION(FN)
 #define EXIT_BLOCK_PTR_FOR_FN(FN)	EXIT_BLOCK_PTR_FOR_FUNCTION(FN)
@@ -723,10 +716,23 @@ static inline const char *get_decl_section_name(const_tree decl)
 #define varpool_get_node(decl) varpool_node::get(decl)
 #define dump_varpool_node(file, node) (node)->dump(file)
 
-#define cgraph_create_edge(caller, callee, call_stmt, count, freq, nest) \
+#if BUILDING_GCC_VERSION >= 8000
+#define cgraph_create_edge(caller, callee, call_stmt, count, freq) \
+	(caller)->create_edge((callee), (call_stmt), (count))
+
+#define cgraph_create_edge_including_clones(caller, callee,	\
+		old_call_stmt, call_stmt, count, freq, reason)	\
+	(caller)->create_edge_including_clones((callee),	\
+		(old_call_stmt), (call_stmt), (count), (reason))
+#else
+#define cgraph_create_edge(caller, callee, call_stmt, count, freq) \
 	(caller)->create_edge((callee), (call_stmt), (count), (freq))
-#define cgraph_create_edge_including_clones(caller, callee, old_call_stmt, call_stmt, count, freq, nest, reason) \
-	(caller)->create_edge_including_clones((callee), (old_call_stmt), (call_stmt), (count), (freq), (reason))
+
+#define cgraph_create_edge_including_clones(caller, callee,	\
+		old_call_stmt, call_stmt, count, freq, reason)	\
+	(caller)->create_edge_including_clones((callee),	\
+		(old_call_stmt), (call_stmt), (count), (freq), (reason))
+#endif
 
 typedef struct cgraph_node *cgraph_node_ptr;
 typedef struct cgraph_edge *cgraph_edge_p;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH RFC v9 2/7] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
  2018-03-03 20:00 [PATCH RFC v9 0/7] Introduce the STACKLEAK feature and a test for it Alexander Popov
  2018-03-03 20:00 ` [PATCH RFC v9 1/7] gcc-plugins: Clean up the cgraph_create_edge* macros Alexander Popov
@ 2018-03-03 20:00 ` Alexander Popov
  2018-03-05 16:41   ` Dave Hansen
  2018-03-03 20:00 ` [PATCH RFC v9 3/7] gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack Alexander Popov
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 59+ messages in thread
From: Alexander Popov @ 2018-03-03 20:00 UTC (permalink / raw)
  To: kernel-hardening, Kees Cook, PaX Team, Brad Spengler,
	Ingo Molnar, Andy Lutomirski, Tycho Andersen, Laura Abbott,
	Mark Rutland, Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Dave Hansen, Mathias Krause, Vikas Shivappa,
	Kyle Huey, Dmitry Safonov, Will Deacon, Arnd Bergmann, x86,
	linux-kernel, alex.popov

The STACKLEAK feature erases the kernel stack before returning from
syscalls. That reduces the information which kernel stack leak bugs can
reveal and blocks some uninitialized stack variable attacks. Moreover,
STACKLEAK provides runtime checks for kernel stack overflow detection.

This commit introduces the architecture-specific code filling the used
part of the kernel stack with a poison value before returning to the
userspace. Full STACKLEAK feature also contains the gcc plugin which
comes in a separate commit.

The STACKLEAK feature is ported from grsecurity/PaX. More information at:
  https://grsecurity.net/
  https://pax.grsecurity.net/

This code is modified from Brad Spengler/PaX Team's code in the last
public patch of grsecurity/PaX based on our understanding of the code.
Changes or omissions from the original code are ours and don't reflect
the original grsecurity/PaX code.

Signed-off-by: Alexander Popov <alex.popov@linux.com>
---
 Documentation/x86/x86_64/mm.txt  |   2 +
 arch/Kconfig                     |  27 ++++++++++
 arch/x86/Kconfig                 |   1 +
 arch/x86/entry/entry_32.S        |  88 +++++++++++++++++++++++++++++++
 arch/x86/entry/entry_64.S        | 108 +++++++++++++++++++++++++++++++++++++++
 arch/x86/entry/entry_64_compat.S |  11 ++++
 arch/x86/include/asm/processor.h |   4 ++
 arch/x86/kernel/asm-offsets.c    |   8 +++
 arch/x86/kernel/process_32.c     |   5 ++
 arch/x86/kernel/process_64.c     |   5 ++
 include/linux/compiler.h         |   6 +++
 11 files changed, 265 insertions(+)

diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt
index ea91cb6..21ee7c5 100644
--- a/Documentation/x86/x86_64/mm.txt
+++ b/Documentation/x86/x86_64/mm.txt
@@ -24,6 +24,7 @@ ffffffffa0000000 - [fixmap start]   (~1526 MB) module mapping space (variable)
 [fixmap start]   - ffffffffff5fffff kernel-internal fixmap range
 ffffffffff600000 - ffffffffff600fff (=4 kB) legacy vsyscall ABI
 ffffffffffe00000 - ffffffffffffffff (=2 MB) unused hole
+STACKLEAK_POISON value in this last hole: ffffffffffff4111
 
 Virtual memory map with 5 level page tables:
 
@@ -50,6 +51,7 @@ ffffffffa0000000 - fffffffffeffffff (1520 MB) module mapping space
 [fixmap start]   - ffffffffff5fffff kernel-internal fixmap range
 ffffffffff600000 - ffffffffff600fff (=4 kB) legacy vsyscall ABI
 ffffffffffe00000 - ffffffffffffffff (=2 MB) unused hole
+STACKLEAK_POISON value in this last hole: ffffffffffff4111
 
 Architecture defines a 64-bit virtual address. Implementations can support
 less. Currently supported are 48- and 57-bit virtual addresses. Bits 63
diff --git a/arch/Kconfig b/arch/Kconfig
index 76c0b54..368e2fb 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -401,6 +401,13 @@ config SECCOMP_FILTER
 
 	  See Documentation/prctl/seccomp_filter.txt for details.
 
+config HAVE_ARCH_STACKLEAK
+	bool
+	help
+	  An architecture should select this if it has the code which
+	  fills the used part of the kernel stack with the STACKLEAK_POISON
+	  value before returning from system calls.
+
 config HAVE_GCC_PLUGINS
 	bool
 	help
@@ -531,6 +538,26 @@ config GCC_PLUGIN_RANDSTRUCT_PERFORMANCE
 	  in structures.  This reduces the performance hit of RANDSTRUCT
 	  at the cost of weakened randomization.
 
+config GCC_PLUGIN_STACKLEAK
+	bool "Erase the kernel stack before returning from syscalls"
+	depends on GCC_PLUGINS
+	depends on HAVE_ARCH_STACKLEAK
+	help
+	  This option makes the kernel erase the kernel stack before it
+	  returns from a system call. That reduces the information which
+	  kernel stack leak bugs can reveal and blocks some uninitialized
+	  stack variable attacks. This option also provides runtime checks
+	  for kernel stack overflow detection.
+
+	  The tradeoff is the performance impact: on a single CPU system kernel
+	  compilation sees a 1% slowdown, other systems and workloads may vary
+	  and you are advised to test this feature on your expected workload
+	  before deploying it.
+
+	  This plugin was ported from grsecurity/PaX. More information at:
+	   * https://grsecurity.net/
+	   * https://pax.grsecurity.net/
+
 config HAVE_CC_STACKPROTECTOR
 	bool
 	help
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index eb7f43f..715b5bd 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -119,6 +119,7 @@ config X86
 	select HAVE_ARCH_COMPAT_MMAP_BASES	if MMU && COMPAT
 	select HAVE_ARCH_SECCOMP_FILTER
 	select HAVE_ARCH_THREAD_STRUCT_WHITELIST
+	select HAVE_ARCH_STACKLEAK
 	select HAVE_ARCH_TRACEHOOK
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD if X86_64
diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index 6ad064c..068dde6 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -77,6 +77,89 @@
 #endif
 .endm
 
+.macro ERASE_KSTACK
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+	call erase_kstack
+#endif
+.endm
+
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+ENTRY(erase_kstack)
+	pushl	%edi
+	pushl	%ecx
+	pushl	%eax
+	pushl	%ebp
+
+	movl	PER_CPU_VAR(current_task), %ebp
+	mov	TASK_lowest_stack(%ebp), %edi
+	mov	$STACKLEAK_POISON, %eax
+	std
+
+	/*
+	 * Let's search for the poison value in the stack.
+	 * Start from the lowest_stack and go to the bottom (see STD above).
+	 */
+.Lpoison_search:
+	mov	%edi, %ecx
+	and	$THREAD_SIZE_asm - 1, %ecx
+	shr	$2, %ecx
+	repne	scasl
+	jecxz	.Lpoisoning	/* Didn't find it */
+
+	/*
+	 * Found the poison value in the stack. Go to poisoning if there is
+	 * not enough space left for the poison check.
+	 */
+	cmp	$STACKLEAK_POISON_CHECK_DEPTH / 4, %ecx
+	jc	.Lpoisoning
+
+	/*
+	 * Check that some further dwords contain poison. If so, the part
+	 * of the stack below the address in %edi is likely to be poisoned.
+	 * Otherwise we need to search deeper.
+	 */
+	mov	$STACKLEAK_POISON_CHECK_DEPTH / 4, %ecx
+	repe	scasl
+	jecxz	.Lpoisoning
+	jne	.Lpoison_search
+
+.Lpoisoning:
+	/*
+	 * Prepare the counter for poisoning the kernel stack between
+	 * %edi and %esp. Two dwords at the bottom of the stack are reserved
+	 * and should not be poisoned (see CONFIG_SCHED_STACK_END_CHECK).
+	 */
+	or	$2 * 4, %edi
+	cld
+	mov	%esp, %ecx
+	sub	%edi, %ecx
+
+	cmp	$THREAD_SIZE_asm, %ecx
+	jb	.Lgood_counter
+	ud2
+
+.Lgood_counter:
+	/*
+	 * So let's write the poison value to the kernel stack. Start from the
+	 * address in %edi and move up (see CLD above) to the address in %esp
+	 * (not included, used memory).
+	 */
+	shr	$2, %ecx
+	rep	stosl
+
+	/* Set the lowest_stack value to the top_of_stack - 128 */
+	movl	PER_CPU_VAR(cpu_current_top_of_stack), %edi
+	sub	$128, %edi
+	mov	%edi, TASK_lowest_stack(%ebp)
+
+	popl	%ebp
+	popl	%eax
+	popl	%ecx
+	popl	%edi
+	ret
+ENDPROC(erase_kstack)
+#endif
+
 /*
  * User gs save/restore
  *
@@ -298,6 +381,7 @@ ENTRY(ret_from_fork)
 	/* When we fork, we trace the syscall return in the child, too. */
 	movl    %esp, %eax
 	call    syscall_return_slowpath
+	ERASE_KSTACK
 	jmp     restore_all
 
 	/* kernel thread */
@@ -458,6 +542,8 @@ ENTRY(entry_SYSENTER_32)
 	ALTERNATIVE "testl %eax, %eax; jz .Lsyscall_32_done", \
 		    "jmp .Lsyscall_32_done", X86_FEATURE_XENPV
 
+	ERASE_KSTACK
+
 /* Opportunistic SYSEXIT */
 	TRACE_IRQS_ON			/* User mode traces as IRQs on. */
 	movl	PT_EIP(%esp), %edx	/* pt_regs->ip */
@@ -544,6 +630,8 @@ ENTRY(entry_INT80_32)
 	call	do_int80_syscall_32
 .Lsyscall_32_done:
 
+	ERASE_KSTACK
+
 restore_all:
 	TRACE_IRQS_IRET
 .Lrestore_all_notrace:
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index d5c7f18..9b360f8 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -66,6 +66,111 @@ END(native_usergs_sysret64)
 	TRACE_IRQS_FLAGS EFLAGS(%rsp)
 .endm
 
+.macro ERASE_KSTACK
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+	call erase_kstack
+#endif
+.endm
+
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+ENTRY(erase_kstack)
+	pushq	%rdi
+	pushq	%rcx
+	pushq	%rax
+	pushq	%r11
+
+	mov	PER_CPU_VAR(current_task), %r11
+	mov	TASK_lowest_stack(%r11), %rdi
+	mov	$STACKLEAK_POISON, %rax
+	std
+
+	/*
+	 * Let's search for the poison value in the stack.
+	 * Start from the lowest_stack and go to the bottom (see STD above).
+	 */
+.Lpoison_search:
+	mov	%edi, %ecx
+	and	$THREAD_SIZE_asm - 1, %ecx
+	shr	$3, %ecx
+	repne	scasq
+	jecxz	.Lpoisoning	/* Didn't find it */
+
+	/*
+	 * Found the poison value in the stack. Go to poisoning if there is
+	 * not enough space left for the poison check.
+	 */
+	cmp	$STACKLEAK_POISON_CHECK_DEPTH / 8, %ecx
+	jb	.Lpoisoning
+
+	/*
+	 * Check that some further qwords contain poison. If so, the part
+	 * of the stack below the address in %rdi is likely to be poisoned.
+	 * Otherwise we need to search deeper.
+	 */
+	mov	$STACKLEAK_POISON_CHECK_DEPTH / 8, %ecx
+	repe	scasq
+	jecxz	.Lpoisoning
+	jne	.Lpoison_search
+
+.Lpoisoning:
+	/*
+	 * Two qwords at the bottom of the thread stack are reserved and
+	 * should not be poisoned (see CONFIG_SCHED_STACK_END_CHECK).
+	 */
+	or	$2 * 8, %rdi
+
+	/*
+	 * Check whether we are on the thread stack to prepare the counter
+	 * for stack poisoning.
+	 */
+	mov	PER_CPU_VAR(cpu_current_top_of_stack), %rcx
+	sub	%rsp, %rcx
+	cmp	$THREAD_SIZE_asm, %rcx
+	jb	.Lon_thread_stack
+
+	/*
+	 * We are not on the thread stack, so we can write poison between
+	 * the address in %rdi and the stack top.
+	 */
+	mov	PER_CPU_VAR(cpu_current_top_of_stack), %rcx
+	sub	%rdi, %rcx
+	jmp	.Lcounter_check
+
+.Lon_thread_stack:
+	/*
+	 * We can write poison between the address in %rdi and the address
+	 * in %rsp (not included, used memory).
+	 */
+	mov	%rsp, %rcx
+	sub	%rdi, %rcx
+
+.Lcounter_check:
+	cmp	$THREAD_SIZE_asm, %rcx
+	jb	.Lgood_counter
+	ud2
+
+.Lgood_counter:
+	/*
+	 * So let's write the poison value to the kernel stack. Start from the
+	 * address in %rdi and move up (see CLD).
+	 */
+	cld
+	shr	$3, %ecx
+	rep	stosq
+
+	/* Set the lowest_stack value to the top_of_stack - 256 */
+	mov	PER_CPU_VAR(cpu_current_top_of_stack), %rdi
+	sub	$256, %rdi
+	mov	%rdi, TASK_lowest_stack(%r11)
+
+	popq	%r11
+	popq	%rax
+	popq	%rcx
+	popq	%rdi
+	ret
+ENDPROC(erase_kstack)
+#endif
+
 /*
  * When dynamic function tracer is enabled it will add a breakpoint
  * to all locations that it is about to modify, sync CPUs, update
@@ -323,6 +428,8 @@ syscall_return_via_sysret:
 	 * We are on the trampoline stack.  All regs except RDI are live.
 	 * We can do future final exit work right here.
 	 */
+	ERASE_KSTACK
+
 	SWITCH_TO_USER_CR3_STACK scratch_reg=%rdi
 
 	popq	%rdi
@@ -681,6 +788,7 @@ GLOBAL(swapgs_restore_regs_and_return_to_usermode)
 	 * We are on the trampoline stack.  All regs except RDI are live.
 	 * We can do future final exit work right here.
 	 */
+	ERASE_KSTACK
 
 	SWITCH_TO_USER_CR3_STACK scratch_reg=%rdi
 
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index e811dd9..8516da7 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -19,6 +19,12 @@
 
 	.section .entry.text, "ax"
 
+	.macro ERASE_KSTACK
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+	call erase_kstack
+#endif
+	.endm
+
 /*
  * 32-bit SYSENTER entry.
  *
@@ -258,6 +264,11 @@ GLOBAL(entry_SYSCALL_compat_after_hwframe)
 
 	/* Opportunistic SYSRET */
 sysret32_from_system_call:
+	/*
+	 * We are not going to return to the userspace from the trampoline
+	 * stack. So let's erase the thread stack right now.
+	 */
+	ERASE_KSTACK
 	TRACE_IRQS_ON			/* User mode traces as IRQs on. */
 	movq	RBX(%rsp), %rbx		/* pt_regs->rbx */
 	movq	RBP(%rsp), %rbp		/* pt_regs->rbp */
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index b0ccd48..0c87813 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -494,6 +494,10 @@ struct thread_struct {
 
 	mm_segment_t		addr_limit;
 
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+	unsigned long		lowest_stack;
+#endif
+
 	unsigned int		sig_on_uaccess_err:1;
 	unsigned int		uaccess_err:1;	/* uaccess failed */
 
diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index 76417a9..ef5d260 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -39,6 +39,9 @@ void common(void) {
 	BLANK();
 	OFFSET(TASK_TI_flags, task_struct, thread_info.flags);
 	OFFSET(TASK_addr_limit, task_struct, thread.addr_limit);
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+	OFFSET(TASK_lowest_stack, task_struct, thread.lowest_stack);
+#endif
 
 	BLANK();
 	OFFSET(crypto_tfm_ctx_offset, crypto_tfm, __crt_ctx);
@@ -75,6 +78,11 @@ void common(void) {
 	OFFSET(PV_MMU_read_cr2, pv_mmu_ops, read_cr2);
 #endif
 
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+	BLANK();
+	DEFINE(THREAD_SIZE_asm, THREAD_SIZE);
+#endif
+
 #ifdef CONFIG_XEN
 	BLANK();
 	OFFSET(XEN_vcpu_info_mask, vcpu_info, evtchn_upcall_mask);
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 5224c60..6d256ab 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -136,6 +136,11 @@ int copy_thread_tls(unsigned long clone_flags, unsigned long sp,
 	p->thread.sp0 = (unsigned long) (childregs+1);
 	memset(p->thread.ptrace_bps, 0, sizeof(p->thread.ptrace_bps));
 
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+	p->thread.lowest_stack = (unsigned long)task_stack_page(p) +
+						2 * sizeof(unsigned long);
+#endif
+
 	if (unlikely(p->flags & PF_KTHREAD)) {
 		/* kernel thread */
 		memset(childregs, 0, sizeof(struct pt_regs));
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 9eb448c..6dc55f6 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -281,6 +281,11 @@ int copy_thread_tls(unsigned long clone_flags, unsigned long sp,
 	p->thread.sp = (unsigned long) fork_frame;
 	p->thread.io_bitmap_ptr = NULL;
 
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+	p->thread.lowest_stack = (unsigned long)task_stack_page(p) +
+						2 * sizeof(unsigned long);
+#endif
+
 	savesegment(gs, p->thread.gsindex);
 	p->thread.gsbase = p->thread.gsindex ? 0 : me->thread.gsbase;
 	savesegment(fs, p->thread.fsindex);
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index ab4711c..47ea254 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -342,4 +342,10 @@ unsigned long read_word_at_a_time(const void *addr)
 	compiletime_assert(__native_word(t),				\
 		"Need native word sized stores/loads for atomicity.")
 
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+/* Poison value points to the unused hole in the virtual memory map */
+# define STACKLEAK_POISON -0xBEEF
+# define STACKLEAK_POISON_CHECK_DEPTH 128
+#endif
+
 #endif /* __LINUX_COMPILER_H */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH RFC v9 3/7] gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack
  2018-03-03 20:00 [PATCH RFC v9 0/7] Introduce the STACKLEAK feature and a test for it Alexander Popov
  2018-03-03 20:00 ` [PATCH RFC v9 1/7] gcc-plugins: Clean up the cgraph_create_edge* macros Alexander Popov
  2018-03-03 20:00 ` [PATCH RFC v9 2/7] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls Alexander Popov
@ 2018-03-03 20:00 ` Alexander Popov
  2018-03-03 20:00 ` [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter() Alexander Popov
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 59+ messages in thread
From: Alexander Popov @ 2018-03-03 20:00 UTC (permalink / raw)
  To: kernel-hardening, Kees Cook, PaX Team, Brad Spengler,
	Ingo Molnar, Andy Lutomirski, Tycho Andersen, Laura Abbott,
	Mark Rutland, Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Dave Hansen, Mathias Krause, Vikas Shivappa,
	Kyle Huey, Dmitry Safonov, Will Deacon, Arnd Bergmann, x86,
	linux-kernel, alex.popov

The STACKLEAK feature erases the kernel stack before returning from
syscalls. That reduces the information which kernel stack leak bugs can
reveal and blocks some uninitialized stack variable attacks. Moreover,
STACKLEAK provides runtime checks for kernel stack overflow detection.

This commit introduces the STACKLEAK gcc plugin. It is needed for:
 - tracking the lowest border of the kernel stack, which is important
    for the code erasing the used part of the kernel stack at the end
    of syscalls (comes in a separate commit);
 - checking that alloca calls don't cause stack overflow.

So this plugin instruments the kernel code inserting:
 - the check_alloca() call before alloca and the track_stack() call
    after it;
 - the track_stack() call for the functions with a stack frame size
    greater than or equal to CONFIG_STACKLEAK_TRACK_MIN_SIZE.

The STACKLEAK feature is ported from grsecurity/PaX. More information at:
  https://grsecurity.net/
  https://pax.grsecurity.net/

This code is modified from Brad Spengler/PaX Team's code in the last
public patch of grsecurity/PaX based on our understanding of the code.
Changes or omissions from the original code are ours and don't reflect
the original grsecurity/PaX code.

Signed-off-by: Alexander Popov <alex.popov@linux.com>
---
 arch/Kconfig                           |  14 +
 arch/x86/kernel/dumpstack.c            |  20 ++
 fs/exec.c                              |  33 +++
 scripts/Makefile.gcc-plugins           |   3 +
 scripts/gcc-plugins/stackleak_plugin.c | 470 +++++++++++++++++++++++++++++++++
 5 files changed, 540 insertions(+)
 create mode 100644 scripts/gcc-plugins/stackleak_plugin.c

diff --git a/arch/Kconfig b/arch/Kconfig
index 368e2fb..a4a8fba 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -542,6 +542,8 @@ config GCC_PLUGIN_STACKLEAK
 	bool "Erase the kernel stack before returning from syscalls"
 	depends on GCC_PLUGINS
 	depends on HAVE_ARCH_STACKLEAK
+	imply VMAP_STACK
+	imply SCHED_STACK_END_CHECK
 	help
 	  This option makes the kernel erase the kernel stack before it
 	  returns from a system call. That reduces the information which
@@ -558,6 +560,18 @@ config GCC_PLUGIN_STACKLEAK
 	   * https://grsecurity.net/
 	   * https://pax.grsecurity.net/
 
+config STACKLEAK_TRACK_MIN_SIZE
+	int "Minimum stack frame size of functions tracked by STACKLEAK"
+	default 100
+	range 0 4096
+	depends on GCC_PLUGIN_STACKLEAK
+	help
+	  The STACKLEAK gcc plugin instruments the kernel code for tracking
+	  the lowest border of the kernel stack (and for some other purposes).
+	  It inserts the track_stack() call for the functions with a stack
+	  frame size greater than or equal to this parameter.
+	  If unsure, leave the default value 100.
+
 config HAVE_CC_STACKPROTECTOR
 	bool
 	help
diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index a2d8a39..b53163f 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -375,3 +375,23 @@ static int __init code_bytes_setup(char *s)
 	return 1;
 }
 __setup("code_bytes=", code_bytes_setup);
+
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+
+#define MIN_STACK_LEFT 256
+
+void __used check_alloca(unsigned long size)
+{
+	unsigned long sp = (unsigned long)&sp;
+	struct stack_info stack_info = {0};
+	unsigned long visit_mask = 0;
+	unsigned long stack_left;
+
+	BUG_ON(get_stack_info(&sp, current, &stack_info, &visit_mask));
+
+	stack_left = sp - (unsigned long)stack_info.begin;
+	BUG_ON(stack_left < MIN_STACK_LEFT ||
+				size >= stack_left - MIN_STACK_LEFT);
+}
+EXPORT_SYMBOL(check_alloca);
+#endif
diff --git a/fs/exec.c b/fs/exec.c
index 7eb8d21..9865612 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1962,3 +1962,36 @@ COMPAT_SYSCALL_DEFINE5(execveat, int, fd,
 				  argv, envp, flags);
 }
 #endif
+
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+void __used track_stack(void)
+{
+	/*
+	 * N.B. The arch-specific part of the STACKLEAK feature fills the
+	 * kernel stack with the poison value, which has the register width.
+	 * That code assumes that the value of thread.lowest_stack is aligned
+	 * on the register width boundary.
+	 *
+	 * That is true for x86 and x86_64 because of the kernel stack
+	 * alignment on these platforms (for details, see cc_stack_align in
+	 * arch/x86/Makefile). Take care of that when you port STACKLEAK to
+	 * new platforms.
+	 */
+	unsigned long sp = (unsigned long)&sp;
+
+	/*
+	 * Having CONFIG_STACKLEAK_TRACK_MIN_SIZE larger than
+	 * STACKLEAK_POISON_CHECK_DEPTH makes the poison search in
+	 * erase_kstack() unreliable. Let's prevent that.
+	 */
+	BUILD_BUG_ON(CONFIG_STACKLEAK_TRACK_MIN_SIZE >
+						STACKLEAK_POISON_CHECK_DEPTH);
+
+	if (sp < current->thread.lowest_stack &&
+	    sp >= (unsigned long)task_stack_page(current) +
+					2 * sizeof(unsigned long)) {
+		current->thread.lowest_stack = sp;
+	}
+}
+EXPORT_SYMBOL(track_stack);
+#endif /* CONFIG_GCC_PLUGIN_STACKLEAK */
diff --git a/scripts/Makefile.gcc-plugins b/scripts/Makefile.gcc-plugins
index b2a95af..8d6070f 100644
--- a/scripts/Makefile.gcc-plugins
+++ b/scripts/Makefile.gcc-plugins
@@ -35,6 +35,9 @@ ifdef CONFIG_GCC_PLUGINS
   gcc-plugin-cflags-$(CONFIG_GCC_PLUGIN_RANDSTRUCT)	+= -DRANDSTRUCT_PLUGIN
   gcc-plugin-cflags-$(CONFIG_GCC_PLUGIN_RANDSTRUCT_PERFORMANCE)	+= -fplugin-arg-randomize_layout_plugin-performance-mode
 
+  gcc-plugin-$(CONFIG_GCC_PLUGIN_STACKLEAK)	+= stackleak_plugin.so
+  gcc-plugin-cflags-$(CONFIG_GCC_PLUGIN_STACKLEAK)	+= -DSTACKLEAK_PLUGIN -fplugin-arg-stackleak_plugin-track-min-size=$(CONFIG_STACKLEAK_TRACK_MIN_SIZE)
+
   GCC_PLUGINS_CFLAGS := $(strip $(addprefix -fplugin=$(objtree)/scripts/gcc-plugins/, $(gcc-plugin-y)) $(gcc-plugin-cflags-y))
 
   export PLUGINCC GCC_PLUGINS_CFLAGS GCC_PLUGIN GCC_PLUGIN_SUBDIR
diff --git a/scripts/gcc-plugins/stackleak_plugin.c b/scripts/gcc-plugins/stackleak_plugin.c
new file mode 100644
index 0000000..6ac2a05
--- /dev/null
+++ b/scripts/gcc-plugins/stackleak_plugin.c
@@ -0,0 +1,470 @@
+/*
+ * Copyright 2011-2017 by the PaX Team <pageexec@freemail.hu>
+ * Modified by Alexander Popov <alex.popov@linux.com>
+ * Licensed under the GPL v2
+ *
+ * Note: the choice of the license means that the compilation process is
+ * NOT 'eligible' as defined by gcc's library exception to the GPL v3,
+ * but for the kernel it doesn't matter since it doesn't link against
+ * any of the gcc libraries
+ *
+ * This gcc plugin is needed for tracking the lowest border of the kernel stack
+ * and checking that alloca calls don't cause stack overflow. It instruments
+ * the kernel code inserting:
+ *  - the check_alloca() call before alloca and the track_stack() call after it;
+ *  - the track_stack() call for the functions with a stack frame size greater
+ *     than or equal to the "track-min-size" plugin parameter.
+ *
+ * This plugin is ported from grsecurity/PaX. For more information see:
+ *   https://grsecurity.net/
+ *   https://pax.grsecurity.net/
+ *
+ * Debugging:
+ *  - use fprintf() to stderr, debug_generic_expr(), debug_gimple_stmt(),
+ *     print_rtl() and print_simple_rtl();
+ *  - add "-fdump-tree-all -fdump-rtl-all" to the plugin CFLAGS in
+ *     Makefile.gcc-plugins to see the verbose dumps of the gcc passes;
+ *  - use gcc -E to understand the preprocessing shenanigans;
+ *  - use gcc with enabled CFG/GIMPLE/SSA verification (--enable-checking).
+ */
+
+#include "gcc-common.h"
+
+__visible int plugin_is_GPL_compatible;
+
+static int track_frame_size = -1;
+static const char track_function[] = "track_stack";
+static const char check_function[] = "check_alloca";
+
+/*
+ * Mark these global variables (roots) for gcc garbage collector since
+ * they point to the garbage-collected memory.
+ */
+static GTY(()) tree track_function_decl;
+static GTY(()) tree check_function_decl;
+
+static struct plugin_info stackleak_plugin_info = {
+	.version = "201707101337",
+	.help = "track-min-size=nn\ttrack stack for functions with a stack frame size >= nn bytes\n"
+		"disable\t\tdo not activate the plugin\n"
+};
+
+static void stackleak_add_check_alloca(gimple_stmt_iterator *gsi)
+{
+	gimple stmt;
+	gcall *check_alloca;
+	tree alloca_size;
+	cgraph_node_ptr node;
+	int frequency;
+	basic_block bb;
+
+	/* Insert call to void check_alloca(unsigned long size) */
+	alloca_size = gimple_call_arg(gsi_stmt(*gsi), 0);
+	stmt = gimple_build_call(check_function_decl, 1, alloca_size);
+	check_alloca = as_a_gcall(stmt);
+	gsi_insert_before(gsi, check_alloca, GSI_SAME_STMT);
+
+	/* Update the cgraph */
+	bb = gimple_bb(check_alloca);
+	node = cgraph_get_create_node(check_function_decl);
+	gcc_assert(node);
+	frequency = compute_call_stmt_bb_frequency(current_function_decl, bb);
+	cgraph_create_edge(cgraph_get_node(current_function_decl), node,
+			check_alloca, bb->count, frequency);
+}
+
+static void stackleak_add_track_stack(gimple_stmt_iterator *gsi, bool after)
+{
+	gimple stmt;
+	gcall *track_stack;
+	cgraph_node_ptr node;
+	int frequency;
+	basic_block bb;
+
+	/* Insert call to void track_stack(void) */
+	stmt = gimple_build_call(track_function_decl, 0);
+	track_stack = as_a_gcall(stmt);
+	if (after)
+		gsi_insert_after(gsi, track_stack, GSI_CONTINUE_LINKING);
+	else
+		gsi_insert_before(gsi, track_stack, GSI_SAME_STMT);
+
+	/* Update the cgraph */
+	bb = gimple_bb(track_stack);
+	node = cgraph_get_create_node(track_function_decl);
+	gcc_assert(node);
+	frequency = compute_call_stmt_bb_frequency(current_function_decl, bb);
+	cgraph_create_edge(cgraph_get_node(current_function_decl), node,
+			track_stack, bb->count, frequency);
+}
+
+static bool is_alloca(gimple stmt)
+{
+	if (gimple_call_builtin_p(stmt, BUILT_IN_ALLOCA))
+		return true;
+
+#if BUILDING_GCC_VERSION >= 4007
+	if (gimple_call_builtin_p(stmt, BUILT_IN_ALLOCA_WITH_ALIGN))
+		return true;
+#endif
+
+	return false;
+}
+
+/*
+ * Work with the GIMPLE representation of the code.
+ * Insert the check_alloca() call before alloca and track_stack() call after
+ * it. Also insert track_stack() call into the beginning of the function
+ * if it is not instrumented.
+ */
+static unsigned int stackleak_instrument_execute(void)
+{
+	basic_block bb, entry_bb;
+	bool prologue_instrumented = false, is_leaf = true;
+	gimple_stmt_iterator gsi;
+
+	/*
+	 * ENTRY_BLOCK_PTR is a basic block which represents possible entry
+	 * point of a function. This block does not contain any code and
+	 * has a CFG edge to its successor.
+	 */
+	gcc_assert(single_succ_p(ENTRY_BLOCK_PTR_FOR_FN(cfun)));
+	entry_bb = single_succ(ENTRY_BLOCK_PTR_FOR_FN(cfun));
+
+	/*
+	 * 1. Loop through the GIMPLE statements in each of cfun basic blocks.
+	 * cfun is a global variable which represents the function that is
+	 * currently processed.
+	 */
+	FOR_EACH_BB_FN(bb, cfun) {
+		for (gsi = gsi_start_bb(bb); !gsi_end_p(gsi); gsi_next(&gsi)) {
+			gimple stmt;
+
+			stmt = gsi_stmt(gsi);
+
+			/* Leaf function is a function which makes no calls */
+			if (is_gimple_call(stmt))
+				is_leaf = false;
+
+			if (!is_alloca(stmt))
+				continue;
+
+			/* 2. Insert stack overflow check before alloca call */
+			stackleak_add_check_alloca(&gsi);
+
+			/* 3. Insert track_stack() call after alloca call */
+			stackleak_add_track_stack(&gsi, true);
+			if (bb == entry_bb)
+				prologue_instrumented = true;
+		}
+	}
+
+	if (prologue_instrumented)
+		return 0;
+
+	/*
+	 * Special cases to skip the instrumentation.
+	 *
+	 * Taking the address of static inline functions materializes them,
+	 * but we mustn't instrument some of them as the resulting stack
+	 * alignment required by the function call ABI will break other
+	 * assumptions regarding the expected (but not otherwise enforced)
+	 * register clobbering ABI.
+	 *
+	 * Case in point: native_save_fl on amd64 when optimized for size
+	 * clobbers rdx if it were instrumented here.
+	 *
+	 * TODO: any more special cases?
+	 */
+	if (is_leaf &&
+	    !TREE_PUBLIC(current_function_decl) &&
+	    DECL_DECLARED_INLINE_P(current_function_decl)) {
+		return 0;
+	}
+
+	if (is_leaf &&
+	    !strncmp(IDENTIFIER_POINTER(DECL_NAME(current_function_decl)),
+		     "_paravirt_", 10)) {
+		return 0;
+	}
+
+	/* 4. Insert track_stack() call at the function beginning */
+	bb = entry_bb;
+	if (!single_pred_p(bb)) {
+		/* gcc_assert(bb_loop_depth(bb) ||
+				(bb->flags & BB_IRREDUCIBLE_LOOP)); */
+		split_edge(single_succ_edge(ENTRY_BLOCK_PTR_FOR_FN(cfun)));
+		gcc_assert(single_succ_p(ENTRY_BLOCK_PTR_FOR_FN(cfun)));
+		bb = single_succ(ENTRY_BLOCK_PTR_FOR_FN(cfun));
+	}
+	gsi = gsi_after_labels(bb);
+	stackleak_add_track_stack(&gsi, false);
+
+	return 0;
+}
+
+static bool large_stack_frame(void)
+{
+#if BUILDING_GCC_VERSION >= 8000
+	return maybe_ge(get_frame_size(), track_frame_size);
+#else
+	return (get_frame_size() >= track_frame_size);
+#endif
+}
+
+/*
+ * Work with the RTL representation of the code.
+ * Remove the unneeded track_stack() calls from the functions which don't
+ * call alloca and don't have a large enough stack frame size.
+ */
+static unsigned int stackleak_cleanup_execute(void)
+{
+	rtx_insn *insn, *next;
+
+	if (cfun->calls_alloca)
+		return 0;
+
+	if (large_stack_frame())
+		return 0;
+
+	/*
+	 * 1. Find track_stack() calls. Loop through the chain of insns,
+	 * which is an RTL representation of the code for a function.
+	 *
+	 * The example of a matching insn:
+	 *    (call_insn 8 4 10 2 (call (mem (symbol_ref ("track_stack")
+	 *    [flags 0x41] <function_decl 0x7f7cd3302a80 track_stack>)
+	 *    [0 track_stack S1 A8]) (0)) 675 {*call} (expr_list
+	 *    (symbol_ref ("track_stack") [flags 0x41] <function_decl
+	 *    0x7f7cd3302a80 track_stack>) (expr_list (0) (nil))) (nil))
+	 */
+	for (insn = get_insns(); insn; insn = next) {
+		rtx body;
+
+		next = NEXT_INSN(insn);
+
+		/* Check the expression code of the insn */
+		if (!CALL_P(insn))
+			continue;
+
+		/*
+		 * Check the expression code of the insn body, which is an RTL
+		 * Expression (RTX) describing the side effect performed by
+		 * that insn.
+		 */
+		body = PATTERN(insn);
+		if (GET_CODE(body) != CALL)
+			continue;
+
+		/*
+		 * Check the first operand of the call expression. It should
+		 * be a mem RTX describing the needed subroutine with a
+		 * symbol_ref RTX.
+		 */
+		body = XEXP(body, 0);
+		if (GET_CODE(body) != MEM)
+			continue;
+
+		body = XEXP(body, 0);
+		if (GET_CODE(body) != SYMBOL_REF)
+			continue;
+
+		if (SYMBOL_REF_DECL(body) != track_function_decl)
+			continue;
+
+		/* 2. Delete the track_stack() call */
+		delete_insn_and_edges(insn);
+#if BUILDING_GCC_VERSION >= 4007 && BUILDING_GCC_VERSION < 8000
+		if (GET_CODE(next) == NOTE &&
+		    NOTE_KIND(next) == NOTE_INSN_CALL_ARG_LOCATION) {
+			insn = next;
+			next = NEXT_INSN(insn);
+			delete_insn_and_edges(insn);
+		}
+#endif
+	}
+
+	return 0;
+}
+
+static bool stackleak_gate(void)
+{
+	tree section;
+
+	section = lookup_attribute("section",
+				   DECL_ATTRIBUTES(current_function_decl));
+	if (section && TREE_VALUE(section)) {
+		section = TREE_VALUE(TREE_VALUE(section));
+
+		if (!strncmp(TREE_STRING_POINTER(section), ".init.text", 10))
+			return false;
+		if (!strncmp(TREE_STRING_POINTER(section), ".devinit.text", 13))
+			return false;
+		if (!strncmp(TREE_STRING_POINTER(section), ".cpuinit.text", 13))
+			return false;
+		if (!strncmp(TREE_STRING_POINTER(section), ".meminit.text", 13))
+			return false;
+	}
+
+	return track_frame_size >= 0;
+}
+
+/* Build function declarations for track_stack() and check_alloca() */
+static void stackleak_start_unit(void *gcc_data __unused,
+				 void *user_data __unused)
+{
+	tree fntype;
+
+	/* void track_stack(void) */
+	fntype = build_function_type_list(void_type_node, NULL_TREE);
+	track_function_decl = build_fn_decl(track_function, fntype);
+	DECL_ASSEMBLER_NAME(track_function_decl); /* for LTO */
+	TREE_PUBLIC(track_function_decl) = 1;
+	TREE_USED(track_function_decl) = 1;
+	DECL_EXTERNAL(track_function_decl) = 1;
+	DECL_ARTIFICIAL(track_function_decl) = 1;
+	DECL_PRESERVE_P(track_function_decl) = 1;
+
+	/* void check_alloca(unsigned long) */
+	fntype = build_function_type_list(void_type_node,
+				long_unsigned_type_node, NULL_TREE);
+	check_function_decl = build_fn_decl(check_function, fntype);
+	DECL_ASSEMBLER_NAME(check_function_decl); /* for LTO */
+	TREE_PUBLIC(check_function_decl) = 1;
+	TREE_USED(check_function_decl) = 1;
+	DECL_EXTERNAL(check_function_decl) = 1;
+	DECL_ARTIFICIAL(check_function_decl) = 1;
+	DECL_PRESERVE_P(check_function_decl) = 1;
+}
+
+/*
+ * Pass gate function is a predicate function that gets executed before the
+ * corresponding pass. If the return value is 'true' the pass gets executed,
+ * otherwise, it is skipped.
+ */
+static bool stackleak_instrument_gate(void)
+{
+	return stackleak_gate();
+}
+
+#define PASS_NAME stackleak_instrument
+#define PROPERTIES_REQUIRED PROP_gimple_leh | PROP_cfg
+#define TODO_FLAGS_START TODO_verify_ssa | TODO_verify_flow | TODO_verify_stmts
+#define TODO_FLAGS_FINISH TODO_verify_ssa | TODO_verify_stmts | TODO_dump_func \
+			| TODO_update_ssa | TODO_rebuild_cgraph_edges
+#include "gcc-generate-gimple-pass.h"
+
+static bool stackleak_cleanup_gate(void)
+{
+	return stackleak_gate();
+}
+
+#define PASS_NAME stackleak_cleanup
+#define TODO_FLAGS_FINISH TODO_dump_func
+#include "gcc-generate-rtl-pass.h"
+
+/*
+ * Every gcc plugin exports a plugin_init() function that is called right
+ * after the plugin is loaded. This function is responsible for registering
+ * the plugin callbacks and doing other required initialization.
+ */
+__visible int plugin_init(struct plugin_name_args *plugin_info,
+			  struct plugin_gcc_version *version)
+{
+	const char * const plugin_name = plugin_info->base_name;
+	const int argc = plugin_info->argc;
+	const struct plugin_argument * const argv = plugin_info->argv;
+	int i = 0;
+
+	/* Extra GGC root tables describing our GTY-ed data */
+	static const struct ggc_root_tab gt_ggc_r_gt_stackleak[] = {
+		{
+			.base = &track_function_decl,
+			.nelt = 1,
+			.stride = sizeof(track_function_decl),
+			.cb = &gt_ggc_mx_tree_node,
+			.pchw = &gt_pch_nx_tree_node
+		},
+		{
+			.base = &check_function_decl,
+			.nelt = 1,
+			.stride = sizeof(check_function_decl),
+			.cb = &gt_ggc_mx_tree_node,
+			.pchw = &gt_pch_nx_tree_node
+		},
+		LAST_GGC_ROOT_TAB
+	};
+
+	/*
+	 * The stackleak_instrument pass should be executed before the
+	 * "optimized" pass, which is the control flow graph cleanup that is
+	 * performed just before expanding gcc trees to the RTL. In former
+	 * versions of the plugin this new pass was inserted before the
+	 * "tree_profile" pass, which is currently called "profile".
+	 */
+	PASS_INFO(stackleak_instrument, "optimized", 1,
+						PASS_POS_INSERT_BEFORE);
+
+	/*
+	 * The stackleak_cleanup pass should be executed after the
+	 * "reload" pass, when the stack frame size is final.
+	 */
+	PASS_INFO(stackleak_cleanup, "reload", 1, PASS_POS_INSERT_AFTER);
+
+	if (!plugin_default_version_check(version, &gcc_version)) {
+		error(G_("incompatible gcc/plugin versions"));
+		return 1;
+	}
+
+	/* Parse the plugin arguments */
+	for (i = 0; i < argc; i++) {
+		if (!strcmp(argv[i].key, "disable"))
+			return 0;
+
+		if (!strcmp(argv[i].key, "track-min-size")) {
+			if (!argv[i].value) {
+				error(G_("no value supplied for option '-fplugin-arg-%s-%s'"),
+					plugin_name, argv[i].key);
+				return 1;
+			}
+
+			track_frame_size = atoi(argv[i].value);
+			if (track_frame_size < 0) {
+				error(G_("invalid option argument '-fplugin-arg-%s-%s=%s'"),
+					plugin_name, argv[i].key, argv[i].value);
+				return 1;
+			}
+		} else {
+			error(G_("unknown option '-fplugin-arg-%s-%s'"),
+					plugin_name, argv[i].key);
+			return 1;
+		}
+	}
+
+	/* Give the information about the plugin */
+	register_callback(plugin_name, PLUGIN_INFO, NULL,
+						&stackleak_plugin_info);
+
+	/* Register to be called before processing a translation unit */
+	register_callback(plugin_name, PLUGIN_START_UNIT,
+					&stackleak_start_unit, NULL);
+
+	/* Register an extra GCC garbage collector (GGC) root table */
+	register_callback(plugin_name, PLUGIN_REGISTER_GGC_ROOTS, NULL,
+					(void *)&gt_ggc_r_gt_stackleak);
+
+	/*
+	 * Hook into the Pass Manager to register new gcc passes.
+	 *
+	 * The stack frame size info is available only at the last RTL pass,
+	 * when it's too late to insert complex code like a function call.
+	 * So we register two gcc passes to instrument every function at first
+	 * and remove the unneeded instrumentation later.
+	 */
+	register_callback(plugin_name, PLUGIN_PASS_MANAGER_SETUP, NULL,
+					&stackleak_instrument_pass_info);
+	register_callback(plugin_name, PLUGIN_PASS_MANAGER_SETUP, NULL,
+					&stackleak_cleanup_pass_info);
+
+	return 0;
+}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-03 20:00 [PATCH RFC v9 0/7] Introduce the STACKLEAK feature and a test for it Alexander Popov
                   ` (2 preceding siblings ...)
  2018-03-03 20:00 ` [PATCH RFC v9 3/7] gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack Alexander Popov
@ 2018-03-03 20:00 ` Alexander Popov
  2018-03-05 19:40   ` Dave Hansen
  2018-03-03 20:00 ` [PATCH RFC v9 5/7] lkdtm: Add a test for STACKLEAK Alexander Popov
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 59+ messages in thread
From: Alexander Popov @ 2018-03-03 20:00 UTC (permalink / raw)
  To: kernel-hardening, Kees Cook, PaX Team, Brad Spengler,
	Ingo Molnar, Andy Lutomirski, Tycho Andersen, Laura Abbott,
	Mark Rutland, Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Dave Hansen, Mathias Krause, Vikas Shivappa,
	Kyle Huey, Dmitry Safonov, Will Deacon, Arnd Bergmann, x86,
	linux-kernel, alex.popov

Make STACKLEAK erase kernel stack after ptrace/seccomp/auditing
not to leave any sensitive information on the stack for the syscall code.

This code is modified from Brad Spengler/PaX Team's code in the last
public patch of grsecurity/PaX based on our understanding of the code.
Changes or omissions from the original code are ours and don't reflect
the original grsecurity/PaX code.

Signed-off-by: Alexander Popov <alex.popov@linux.com>
---
 arch/x86/entry/common.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 74f6eee..b4be776 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -46,6 +46,12 @@ __visible inline void enter_from_user_mode(void)
 static inline void enter_from_user_mode(void) {}
 #endif
 
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+asmlinkage void erase_kstack(void);
+#else
+static void erase_kstack(void) {}
+#endif
+
 static void do_audit_syscall_entry(struct pt_regs *regs, u32 arch)
 {
 #ifdef CONFIG_X86_64
@@ -128,6 +134,7 @@ static long syscall_trace_enter(struct pt_regs *regs)
 
 	do_audit_syscall_entry(regs, arch);
 
+	erase_kstack();
 	return ret ?: regs->orig_ax;
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH RFC v9 5/7] lkdtm: Add a test for STACKLEAK
  2018-03-03 20:00 [PATCH RFC v9 0/7] Introduce the STACKLEAK feature and a test for it Alexander Popov
                   ` (3 preceding siblings ...)
  2018-03-03 20:00 ` [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter() Alexander Popov
@ 2018-03-03 20:00 ` Alexander Popov
  2018-03-03 20:00 ` [PATCH RFC v9 6/7] fs/proc: Show STACKLEAK metrics in the /proc file system Alexander Popov
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 59+ messages in thread
From: Alexander Popov @ 2018-03-03 20:00 UTC (permalink / raw)
  To: kernel-hardening, Kees Cook, PaX Team, Brad Spengler,
	Ingo Molnar, Andy Lutomirski, Tycho Andersen, Laura Abbott,
	Mark Rutland, Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Dave Hansen, Mathias Krause, Vikas Shivappa,
	Kyle Huey, Dmitry Safonov, Will Deacon, Arnd Bergmann, x86,
	linux-kernel, alex.popov

Introduce two lkdtm tests for the STACKLEAK feature: STACKLEAK_ALLOCA
and STACKLEAK_DEEP_RECURSION. Both of them check that the current task
stack is properly erased (filled with STACKLEAK_POISON).

STACKLEAK_ALLOCA tests that:
 - check_alloca() allows alloca calls which don't exhaust the kernel stack;
 - alloca calls which exhaust/overflow the kernel stack hit BUG() in
    check_alloca().

STACKLEAK_DEEP_RECURSION tests that exhausting the current task stack
with a deep recursion is detected by CONFIG_VMAP_STACK (which is implied
by CONFIG_GCC_PLUGIN_STACKLEAK).

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
Signed-off-by: Alexander Popov <alex.popov@linux.com>
---
 drivers/misc/Makefile          |   3 +
 drivers/misc/lkdtm.h           |   4 ++
 drivers/misc/lkdtm_core.c      |   2 +
 drivers/misc/lkdtm_stackleak.c | 136 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 145 insertions(+)
 create mode 100644 drivers/misc/lkdtm_stackleak.c

diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index c3c8624..2b11823 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -65,6 +65,9 @@ lkdtm-$(CONFIG_LKDTM)		+= lkdtm_perms.o
 lkdtm-$(CONFIG_LKDTM)		+= lkdtm_refcount.o
 lkdtm-$(CONFIG_LKDTM)		+= lkdtm_rodata_objcopy.o
 lkdtm-$(CONFIG_LKDTM)		+= lkdtm_usercopy.o
+lkdtm-$(CONFIG_LKDTM)		+= lkdtm_stackleak.o
+
+KASAN_SANITIZE_lkdtm_stackleak.o := n
 
 KCOV_INSTRUMENT_lkdtm_rodata.o	:= n
 
diff --git a/drivers/misc/lkdtm.h b/drivers/misc/lkdtm.h
index 9e513dc..4b2b8e3 100644
--- a/drivers/misc/lkdtm.h
+++ b/drivers/misc/lkdtm.h
@@ -83,4 +83,8 @@ void lkdtm_USERCOPY_STACK_FRAME_FROM(void);
 void lkdtm_USERCOPY_STACK_BEYOND(void);
 void lkdtm_USERCOPY_KERNEL(void);
 
+/* lkdtm_stackleak.c */
+void lkdtm_STACKLEAK_ALLOCA(void);
+void lkdtm_STACKLEAK_DEEP_RECURSION(void);
+
 #endif
diff --git a/drivers/misc/lkdtm_core.c b/drivers/misc/lkdtm_core.c
index 2154d1b..c37fd85 100644
--- a/drivers/misc/lkdtm_core.c
+++ b/drivers/misc/lkdtm_core.c
@@ -183,6 +183,8 @@ static const struct crashtype crashtypes[] = {
 	CRASHTYPE(USERCOPY_STACK_FRAME_FROM),
 	CRASHTYPE(USERCOPY_STACK_BEYOND),
 	CRASHTYPE(USERCOPY_KERNEL),
+	CRASHTYPE(STACKLEAK_ALLOCA),
+	CRASHTYPE(STACKLEAK_DEEP_RECURSION),
 };
 
 
diff --git a/drivers/misc/lkdtm_stackleak.c b/drivers/misc/lkdtm_stackleak.c
new file mode 100644
index 0000000..b1d2a9c
--- /dev/null
+++ b/drivers/misc/lkdtm_stackleak.c
@@ -0,0 +1,136 @@
+/*
+ * This code tests several aspects of the STACKLEAK feature:
+ *  - the current task stack is properly erased (filled with STACKLEAK_POISON);
+ *  - check_alloca() allows alloca calls which don't exhaust the kernel stack;
+ *  - alloca calls which exhaust/overflow the kernel stack hit BUG() in
+ *     check_alloca();
+ *  - exhausting the current task stack with a deep recursion is detected by
+ *     CONFIG_VMAP_STACK (which is implied by CONFIG_GCC_PLUGIN_STACKLEAK).
+ *
+ * Authors:
+ *   Tycho Andersen <tycho@tycho.ws>
+ *   Alexander Popov <alex.popov@linux.com>
+ */
+
+#include "lkdtm.h"
+#include <linux/sched.h>
+#include <linux/compiler.h>
+
+#ifndef CONFIG_GCC_PLUGIN_STACKLEAK
+# define STACKLEAK_POISON -0xBEEF
+# define CONFIG_STACKLEAK_TRACK_MIN_SIZE 100
+#endif
+
+static noinline bool stack_is_erased(void)
+{
+	unsigned long *sp, left, found, i;
+
+	/*
+	 * For the details about the alignment of the poison values, see
+	 * the comment in track_stack().
+	 */
+	sp = PTR_ALIGN(&i, sizeof(unsigned long));
+
+	left = ((unsigned long)sp & (THREAD_SIZE - 1)) / sizeof(unsigned long);
+	sp--;
+
+	/*
+	 * Two unsigned long ints at the bottom of the thread stack are
+	 * reserved and not poisoned.
+	 */
+	if (left <= 2)
+		return false;
+
+	left -= 2;
+	pr_info("checking unused part of the thread stack (%lu bytes)...\n",
+					left * sizeof(unsigned long));
+
+	/* Search for 17 poison values in a row (like erase_kstack() does) */
+	for (i = 0, found = 0; i < left && found < 17; i++) {
+		if (*(sp - i) == STACKLEAK_POISON)
+			found++;
+		else
+			found = 0;
+	}
+
+	if (found < 17) {
+		pr_err("FAIL: thread stack is not erased (checked %lu bytes)\n",
+						i * sizeof(unsigned long));
+		return false;
+	}
+
+	pr_info("first %lu bytes are unpoisoned\n",
+				(i - found) * sizeof(unsigned long));
+
+	/* The rest of thread stack should be erased */
+	for (; i < left; i++) {
+		if (*(sp - i) != STACKLEAK_POISON) {
+			pr_err("FAIL: thread stack is NOT properly erased\n");
+			return false;
+		}
+	}
+
+	pr_info("the rest of the thread stack is properly erased\n");
+	return true;
+}
+
+static noinline void do_alloca(unsigned long size)
+{
+	char buf[size];
+
+	/* So this doesn't get inlined or optimized out */
+	snprintf(buf, size, "testing alloca...\n");
+}
+
+void lkdtm_STACKLEAK_ALLOCA(void)
+{
+	unsigned long left = (unsigned long)&left & (THREAD_SIZE - 1);
+
+	if (!stack_is_erased())
+		return;
+
+	/* Try a small alloca to see if it works */
+	pr_info("try a small alloca of 16 bytes...\n");
+	do_alloca(16);
+	pr_info("small alloca is successful\n");
+
+	/* Try to hit the BUG() in check_alloca() */
+	pr_info("try a large alloca of %lu bytes (stack overflow)...\n", left);
+	do_alloca(left);
+	pr_err("FAIL: large alloca overstepped the thread stack boundary\n");
+}
+
+/*
+ * The stack frame size of recursion() is bigger than the
+ * CONFIG_STACKLEAK_TRACK_MIN_SIZE, hence that function is instrumented
+ * by the STACKLEAK gcc plugin and it calls track_stack() at the beginning.
+ */
+static noinline unsigned long recursion(unsigned long prev_sp)
+{
+	char buf[CONFIG_STACKLEAK_TRACK_MIN_SIZE + 42];
+	unsigned long sp = (unsigned long)&sp;
+
+	snprintf(buf, sizeof(buf), "testing deep recursion...\n");
+
+	if (prev_sp < sp + THREAD_SIZE)
+		sp = recursion(prev_sp);
+
+	return sp;
+}
+
+void lkdtm_STACKLEAK_DEEP_RECURSION(void)
+{
+	unsigned long sp = (unsigned long)&sp;
+
+	if (!stack_is_erased())
+		return;
+
+	/*
+	 * Exhaust the thread stack with a deep recursion. It should hit the
+	 * guard page provided by CONFIG_VMAP_STACK (which is implied by
+	 * CONFIG_GCC_PLUGIN_STACKLEAK).
+	 */
+	pr_info("try to exhaust the thread stack with a deep recursion...\n");
+	pr_err("FAIL: thread stack exhaustion (%lu bytes) is not detected\n",
+							sp - recursion(sp));
+}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH RFC v9 6/7] fs/proc: Show STACKLEAK metrics in the /proc file system
  2018-03-03 20:00 [PATCH RFC v9 0/7] Introduce the STACKLEAK feature and a test for it Alexander Popov
                   ` (4 preceding siblings ...)
  2018-03-03 20:00 ` [PATCH RFC v9 5/7] lkdtm: Add a test for STACKLEAK Alexander Popov
@ 2018-03-03 20:00 ` Alexander Popov
  2018-03-03 20:00 ` [PATCH RFC v9 7/7] doc: self-protection: Add information about STACKLEAK feature Alexander Popov
  2018-03-05 19:34 ` [PATCH RFC v9 0/7] Introduce the STACKLEAK feature and a test for it Kees Cook
  7 siblings, 0 replies; 59+ messages in thread
From: Alexander Popov @ 2018-03-03 20:00 UTC (permalink / raw)
  To: kernel-hardening, Kees Cook, PaX Team, Brad Spengler,
	Ingo Molnar, Andy Lutomirski, Tycho Andersen, Laura Abbott,
	Mark Rutland, Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Dave Hansen, Mathias Krause, Vikas Shivappa,
	Kyle Huey, Dmitry Safonov, Will Deacon, Arnd Bergmann, x86,
	linux-kernel, alex.popov

Introduce CONFIG_STACKLEAK_METRICS providing STACKLEAK information about
tasks via the /proc file system. In particular, /proc/<pid>/stack_depth
shows the maximum kernel stack consumption for the current and previous
syscalls. Although this information is not precise, it  can be useful for
estimating the STACKLEAK performance impact for your workloads.

Signed-off-by: Alexander Popov <alex.popov@linux.com>
---
 arch/Kconfig                     | 12 ++++++++++++
 arch/x86/entry/entry_32.S        |  4 ++++
 arch/x86/entry/entry_64.S        |  4 ++++
 arch/x86/include/asm/processor.h |  3 +++
 arch/x86/kernel/asm-offsets.c    |  3 +++
 arch/x86/kernel/process_32.c     |  3 +++
 arch/x86/kernel/process_64.c     |  3 +++
 fs/proc/base.c                   | 18 ++++++++++++++++++
 8 files changed, 50 insertions(+)

diff --git a/arch/Kconfig b/arch/Kconfig
index a4a8fba..42ebfb9 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -572,6 +572,18 @@ config STACKLEAK_TRACK_MIN_SIZE
 	  frame size greater than or equal to this parameter.
 	  If unsure, leave the default value 100.
 
+config STACKLEAK_METRICS
+	bool "Show STACKLEAK metrics in the /proc file system"
+	depends on GCC_PLUGIN_STACKLEAK
+	depends on PROC_FS
+	help
+	  If this is set, STACKLEAK metrics for every task are available in
+	  the /proc file system. In particular, /proc/<pid>/stack_depth
+	  shows the maximum kernel stack consumption for the current and
+	  previous syscalls. Although this information is not precise, it
+	  can be useful for estimating the STACKLEAK performance impact for
+	  your workloads.
+
 config HAVE_CC_STACKPROTECTOR
 	bool
 	help
diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index 068dde6..f5236b3 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -134,6 +134,10 @@ ENTRY(erase_kstack)
 	mov	%esp, %ecx
 	sub	%edi, %ecx
 
+#ifdef CONFIG_STACKLEAK_METRICS
+	mov	%edi, TASK_prev_lowest_stack(%ebp)
+#endif
+
 	cmp	$THREAD_SIZE_asm, %ecx
 	jb	.Lgood_counter
 	ud2
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 9b360f8..c82cdbd 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -119,6 +119,10 @@ ENTRY(erase_kstack)
 	 */
 	or	$2 * 8, %rdi
 
+#ifdef CONFIG_STACKLEAK_METRICS
+	mov	%rdi, TASK_prev_lowest_stack(%r11)
+#endif
+
 	/*
 	 * Check whether we are on the thread stack to prepare the counter
 	 * for stack poisoning.
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 0c87813..bca1074 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -496,6 +496,9 @@ struct thread_struct {
 
 #ifdef CONFIG_GCC_PLUGIN_STACKLEAK
 	unsigned long		lowest_stack;
+# ifdef CONFIG_STACKLEAK_METRICS
+	unsigned long		prev_lowest_stack;
+# endif
 #endif
 
 	unsigned int		sig_on_uaccess_err:1;
diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index ef5d260..f48197a 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -41,6 +41,9 @@ void common(void) {
 	OFFSET(TASK_addr_limit, task_struct, thread.addr_limit);
 #ifdef CONFIG_GCC_PLUGIN_STACKLEAK
 	OFFSET(TASK_lowest_stack, task_struct, thread.lowest_stack);
+# ifdef CONFIG_STACKLEAK_METRICS
+	OFFSET(TASK_prev_lowest_stack, task_struct, thread.prev_lowest_stack);
+# endif
 #endif
 
 	BLANK();
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 6d256ab..48993fe 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -139,6 +139,9 @@ int copy_thread_tls(unsigned long clone_flags, unsigned long sp,
 #ifdef CONFIG_GCC_PLUGIN_STACKLEAK
 	p->thread.lowest_stack = (unsigned long)task_stack_page(p) +
 						2 * sizeof(unsigned long);
+# ifdef CONFIG_STACKLEAK_METRICS
+	p->thread.prev_lowest_stack = p->thread.lowest_stack;
+# endif
 #endif
 
 	if (unlikely(p->flags & PF_KTHREAD)) {
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 6dc55f6..0355fba 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -284,6 +284,9 @@ int copy_thread_tls(unsigned long clone_flags, unsigned long sp,
 #ifdef CONFIG_GCC_PLUGIN_STACKLEAK
 	p->thread.lowest_stack = (unsigned long)task_stack_page(p) +
 						2 * sizeof(unsigned long);
+# ifdef CONFIG_STACKLEAK_METRICS
+	p->thread.prev_lowest_stack = p->thread.lowest_stack;
+# endif
 #endif
 
 	savesegment(gs, p->thread.gsindex);
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 9298324..6a7f9bd 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2914,6 +2914,21 @@ static int proc_pid_patch_state(struct seq_file *m, struct pid_namespace *ns,
 }
 #endif /* CONFIG_LIVEPATCH */
 
+#ifdef CONFIG_STACKLEAK_METRICS
+static int proc_stack_depth(struct seq_file *m, struct pid_namespace *ns,
+				struct pid *pid, struct task_struct *task)
+{
+	unsigned long prev_depth = THREAD_SIZE -
+			(task->thread.prev_lowest_stack & (THREAD_SIZE - 1));
+	unsigned long depth = THREAD_SIZE -
+			(task->thread.lowest_stack & (THREAD_SIZE - 1));
+
+	seq_printf(m, "previous stack depth: %lu\nstack depth: %lu\n",
+							prev_depth, depth);
+	return 0;
+}
+#endif /* CONFIG_STACKLEAK_METRICS */
+
 /*
  * Thread groups
  */
@@ -3018,6 +3033,9 @@ static const struct pid_entry tgid_base_stuff[] = {
 #ifdef CONFIG_LIVEPATCH
 	ONE("patch_state",  S_IRUSR, proc_pid_patch_state),
 #endif
+#ifdef CONFIG_STACKLEAK_METRICS
+	ONE("stack_depth", S_IRUGO, proc_stack_depth),
+#endif
 };
 
 static int proc_tgid_base_readdir(struct file *file, struct dir_context *ctx)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH RFC v9 7/7] doc: self-protection: Add information about STACKLEAK feature
  2018-03-03 20:00 [PATCH RFC v9 0/7] Introduce the STACKLEAK feature and a test for it Alexander Popov
                   ` (5 preceding siblings ...)
  2018-03-03 20:00 ` [PATCH RFC v9 6/7] fs/proc: Show STACKLEAK metrics in the /proc file system Alexander Popov
@ 2018-03-03 20:00 ` Alexander Popov
  2018-03-05 19:34 ` [PATCH RFC v9 0/7] Introduce the STACKLEAK feature and a test for it Kees Cook
  7 siblings, 0 replies; 59+ messages in thread
From: Alexander Popov @ 2018-03-03 20:00 UTC (permalink / raw)
  To: kernel-hardening, Kees Cook, PaX Team, Brad Spengler,
	Ingo Molnar, Andy Lutomirski, Tycho Andersen, Laura Abbott,
	Mark Rutland, Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Dave Hansen, Mathias Krause, Vikas Shivappa,
	Kyle Huey, Dmitry Safonov, Will Deacon, Arnd Bergmann, x86,
	linux-kernel, alex.popov

Add information about STACKLEAK feature to "Stack depth overflow" and
"Memory poisoning" sections of self-protection.rst.

Signed-off-by: Alexander Popov <alex.popov@linux.com>
---
 Documentation/security/self-protection.rst | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/Documentation/security/self-protection.rst b/Documentation/security/self-protection.rst
index 0f53826..b685f18 100644
--- a/Documentation/security/self-protection.rst
+++ b/Documentation/security/self-protection.rst
@@ -165,10 +165,15 @@ Stack depth overflow
 A less well understood attack is using a bug that triggers the
 kernel to consume stack memory with deep function calls or large stack
 allocations. With this attack it is possible to write beyond the end of
-the kernel's preallocated stack space and into sensitive structures. Two
-important changes need to be made for better protections: moving the
-sensitive thread_info structure elsewhere, and adding a faulting memory
-hole at the bottom of the stack to catch these overflows.
+the kernel's preallocated stack space and into sensitive structures.
+The combination of the following measures gives better protection:
+
+* moving the sensitive thread_info structure off the stack
+  (``CONFIG_THREAD_INFO_IN_TASK``);
+* adding a faulting memory hole at the bottom of the stack to catch
+  these overflows (``CONFIG_VMAP_STACK``);
+* runtime checking that alloca() calls don't overstep the stack boundary
+  (``CONFIG_GCC_PLUGIN_STACKLEAK``).
 
 Heap memory integrity
 ---------------------
@@ -302,11 +307,11 @@ sure structure holes are cleared.
 Memory poisoning
 ----------------
 
-When releasing memory, it is best to poison the contents (clear stack on
-syscall return, wipe heap memory on a free), to avoid reuse attacks that
-rely on the old contents of memory. This frustrates many uninitialized
-variable attacks, stack content exposures, heap content exposures, and
-use-after-free attacks.
+When releasing memory, it is best to poison the contents, to avoid reuse
+attacks that rely on the old contents of memory. E.g., clear stack on a
+syscall return (``CONFIG_GCC_PLUGIN_STACKLEAK``), wipe heap memory on a
+free. This frustrates many uninitialized variable attacks, stack content
+exposures, heap content exposures, and use-after-free attacks.
 
 Destination tracking
 --------------------
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 2/7] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
  2018-03-03 20:00 ` [PATCH RFC v9 2/7] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls Alexander Popov
@ 2018-03-05 16:41   ` Dave Hansen
  2018-03-05 19:43     ` Laura Abbott
  0 siblings, 1 reply; 59+ messages in thread
From: Dave Hansen @ 2018-03-05 16:41 UTC (permalink / raw)
  To: Alexander Popov, kernel-hardening, Kees Cook, PaX Team,
	Brad Spengler, Ingo Molnar, Andy Lutomirski, Tycho Andersen,
	Laura Abbott, Mark Rutland, Ard Biesheuvel, Borislav Petkov,
	Richard Sandiford, Thomas Gleixner, H . Peter Anvin,
	Peter Zijlstra, Dmitry V . Levin, Emese Revfy, Jonathan Corbet,
	Andrey Ryabinin, Kirill A . Shutemov, Thomas Garnier,
	Andrew Morton, Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, x86, linux-kernel

On 03/03/2018 12:00 PM, Alexander Popov wrote:
>  Documentation/x86/x86_64/mm.txt  |   2 +
>  arch/Kconfig                     |  27 ++++++++++
>  arch/x86/Kconfig                 |   1 +
>  arch/x86/entry/entry_32.S        |  88 +++++++++++++++++++++++++++++++
>  arch/x86/entry/entry_64.S        | 108 +++++++++++++++++++++++++++++++++++++++
>  arch/x86/entry/entry_64_compat.S |  11 ++++

This is a *lot* of assembly.  I wonder if you tried at all to get more
of this into C or whether you just inherited the assembly from the
original code?

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 0/7] Introduce the STACKLEAK feature and a test for it
  2018-03-03 20:00 [PATCH RFC v9 0/7] Introduce the STACKLEAK feature and a test for it Alexander Popov
                   ` (6 preceding siblings ...)
  2018-03-03 20:00 ` [PATCH RFC v9 7/7] doc: self-protection: Add information about STACKLEAK feature Alexander Popov
@ 2018-03-05 19:34 ` Kees Cook
  2018-03-05 19:42   ` Dave Hansen
  7 siblings, 1 reply; 59+ messages in thread
From: Kees Cook @ 2018-03-05 19:34 UTC (permalink / raw)
  To: Alexander Popov, Dave Hansen, Borislav Petkov, Andy Lutomirski
  Cc: Kernel Hardening, PaX Team, Brad Spengler, Ingo Molnar,
	Tycho Andersen, Laura Abbott, Mark Rutland, Ard Biesheuvel,
	Richard Sandiford, Thomas Gleixner, H . Peter Anvin,
	Peter Zijlstra, Dmitry V . Levin, Emese Revfy, Jonathan Corbet,
	Andrey Ryabinin, Kirill A . Shutemov, Thomas Garnier,
	Andrew Morton, Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, X86 ML, LKML

On Sat, Mar 3, 2018 at 12:00 PM, Alexander Popov <alex.popov@linux.com> wrote:
> This is the 9th version of the patch series introducing STACKLEAK to the
> mainline kernel. STACKLEAK is a security feature developed by Grsecurity/PaX
> (kudos to them), which:
>  - reduces the information that can be revealed through kernel stack leak bugs;
>  - blocks some uninitialized stack variable attacks (e.g. CVE-2017-17712,
>     CVE-2010-2963);
>  - introduces some runtime checks for kernel stack overflow detection.

Thanks for continuing to chip away at this! I wonder if it's time to
drop the "RFC" part of this? It seems like this should be ready to
land pretty soon. I can start carrying this in the kspp -next tree,
for example. I'd like to get some sign-off from x86, though.

Boris, Andy, and Dave (Hansen), you've all looked at this; would you
be willing to give an Ack on the x86 parts? (Though I do now see a new
comment from Dave was just sent.) And if not, what changes would you
like to see?

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-03 20:00 ` [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter() Alexander Popov
@ 2018-03-05 19:40   ` Dave Hansen
  2018-03-05 20:06     ` Kees Cook
  0 siblings, 1 reply; 59+ messages in thread
From: Dave Hansen @ 2018-03-05 19:40 UTC (permalink / raw)
  To: Alexander Popov, kernel-hardening, Kees Cook, PaX Team,
	Brad Spengler, Ingo Molnar, Andy Lutomirski, Tycho Andersen,
	Laura Abbott, Mark Rutland, Ard Biesheuvel, Borislav Petkov,
	Richard Sandiford, Thomas Gleixner, H . Peter Anvin,
	Peter Zijlstra, Dmitry V . Levin, Emese Revfy, Jonathan Corbet,
	Andrey Ryabinin, Kirill A . Shutemov, Thomas Garnier,
	Andrew Morton, Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, x86, linux-kernel

On 03/03/2018 12:00 PM, Alexander Popov wrote:
> @@ -128,6 +134,7 @@ static long syscall_trace_enter(struct pt_regs *regs)
>  
>  	do_audit_syscall_entry(regs, arch);
>  
> +	erase_kstack();
>  	return ret ?: regs->orig_ax;
>  }

This seems like an odd place to be clearing the stack.  Why was it done her?

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 0/7] Introduce the STACKLEAK feature and a test for it
  2018-03-05 19:34 ` [PATCH RFC v9 0/7] Introduce the STACKLEAK feature and a test for it Kees Cook
@ 2018-03-05 19:42   ` Dave Hansen
  2018-03-05 20:02     ` Kees Cook
  0 siblings, 1 reply; 59+ messages in thread
From: Dave Hansen @ 2018-03-05 19:42 UTC (permalink / raw)
  To: Kees Cook, Alexander Popov, Borislav Petkov, Andy Lutomirski
  Cc: Kernel Hardening, PaX Team, Brad Spengler, Ingo Molnar,
	Tycho Andersen, Laura Abbott, Mark Rutland, Ard Biesheuvel,
	Richard Sandiford, Thomas Gleixner, H . Peter Anvin,
	Peter Zijlstra, Dmitry V . Levin, Emese Revfy, Jonathan Corbet,
	Andrey Ryabinin, Kirill A . Shutemov, Thomas Garnier,
	Andrew Morton, Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, X86 ML, LKML

On 03/05/2018 11:34 AM, Kees Cook wrote:
> Boris, Andy, and Dave (Hansen), you've all looked at this; would you
> be willing to give an Ack on the x86 parts? (Though I do now see a new
> comment from Dave was just sent.) And if not, what changes would you
> like to see?

I think it could definitely use another cleanup and de-#ifdef'ing pass.
It seems to have inherited the style from the original code and it's a
bit more than we're used to in mainline.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 2/7] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
  2018-03-05 16:41   ` Dave Hansen
@ 2018-03-05 19:43     ` Laura Abbott
  2018-03-05 19:50       ` Dave Hansen
  2018-03-05 20:25       ` Peter Zijlstra
  0 siblings, 2 replies; 59+ messages in thread
From: Laura Abbott @ 2018-03-05 19:43 UTC (permalink / raw)
  To: Dave Hansen, Alexander Popov, kernel-hardening, Kees Cook,
	PaX Team, Brad Spengler, Ingo Molnar, Andy Lutomirski,
	Tycho Andersen, Mark Rutland, Ard Biesheuvel, Borislav Petkov,
	Richard Sandiford, Thomas Gleixner, H . Peter Anvin,
	Peter Zijlstra, Dmitry V . Levin, Emese Revfy, Jonathan Corbet,
	Andrey Ryabinin, Kirill A . Shutemov, Thomas Garnier,
	Andrew Morton, Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, x86, linux-kernel

On 03/05/2018 08:41 AM, Dave Hansen wrote:
> On 03/03/2018 12:00 PM, Alexander Popov wrote:
>>   Documentation/x86/x86_64/mm.txt  |   2 +
>>   arch/Kconfig                     |  27 ++++++++++
>>   arch/x86/Kconfig                 |   1 +
>>   arch/x86/entry/entry_32.S        |  88 +++++++++++++++++++++++++++++++
>>   arch/x86/entry/entry_64.S        | 108 +++++++++++++++++++++++++++++++++++++++
>>   arch/x86/entry/entry_64_compat.S |  11 ++++
> 
> This is a *lot* of assembly.  I wonder if you tried at all to get more
> of this into C or whether you just inherited the assembly from the
> original code?
> 

This came up previously http://www.openwall.com/lists/kernel-hardening/2017/10/23/5
there were concerns about trusting C to do the right thing as well as
speed.

Thanks,
Laura

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 2/7] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
  2018-03-05 19:43     ` Laura Abbott
@ 2018-03-05 19:50       ` Dave Hansen
  2018-03-05 20:25       ` Peter Zijlstra
  1 sibling, 0 replies; 59+ messages in thread
From: Dave Hansen @ 2018-03-05 19:50 UTC (permalink / raw)
  To: Laura Abbott, Alexander Popov, kernel-hardening, Kees Cook,
	PaX Team, Brad Spengler, Ingo Molnar, Andy Lutomirski,
	Tycho Andersen, Mark Rutland, Ard Biesheuvel, Borislav Petkov,
	Richard Sandiford, Thomas Gleixner, H . Peter Anvin,
	Peter Zijlstra, Dmitry V . Levin, Emese Revfy, Jonathan Corbet,
	Andrey Ryabinin, Kirill A . Shutemov, Thomas Garnier,
	Andrew Morton, Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, x86, linux-kernel

On 03/05/2018 11:43 AM, Laura Abbott wrote:
> On 03/05/2018 08:41 AM, Dave Hansen wrote:
>> On 03/03/2018 12:00 PM, Alexander Popov wrote:
>>>   Documentation/x86/x86_64/mm.txt  |   2 +
>>>   arch/Kconfig                     |  27 ++++++++++
>>>   arch/x86/Kconfig                 |   1 +
>>>   arch/x86/entry/entry_32.S        |  88 +++++++++++++++++++++++++++++++
>>>   arch/x86/entry/entry_64.S        | 108
>>> +++++++++++++++++++++++++++++++++++++++
>>>   arch/x86/entry/entry_64_compat.S |  11 ++++
>>
>> This is a *lot* of assembly.  I wonder if you tried at all to get more
>> of this into C or whether you just inherited the assembly from the
>> original code?
> 
> This came up previously
> http://www.openwall.com/lists/kernel-hardening/2017/10/23/5
> there were concerns about trusting C to do the right thing as well as
> speed.

I'm really just curious if anyone tried it and what tradeoffs were made.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 0/7] Introduce the STACKLEAK feature and a test for it
  2018-03-05 19:42   ` Dave Hansen
@ 2018-03-05 20:02     ` Kees Cook
  0 siblings, 0 replies; 59+ messages in thread
From: Kees Cook @ 2018-03-05 20:02 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Alexander Popov, Borislav Petkov, Andy Lutomirski,
	Kernel Hardening, PaX Team, Brad Spengler, Ingo Molnar,
	Tycho Andersen, Laura Abbott, Mark Rutland, Ard Biesheuvel,
	Richard Sandiford, Thomas Gleixner, H . Peter Anvin,
	Peter Zijlstra, Dmitry V . Levin, Emese Revfy, Jonathan Corbet,
	Andrey Ryabinin, Kirill A . Shutemov, Thomas Garnier,
	Andrew Morton, Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, X86 ML, LKML

On Mon, Mar 5, 2018 at 11:42 AM, Dave Hansen
<dave.hansen@linux.intel.com> wrote:
> On 03/05/2018 11:34 AM, Kees Cook wrote:
>> Boris, Andy, and Dave (Hansen), you've all looked at this; would you
>> be willing to give an Ack on the x86 parts? (Though I do now see a new
>> comment from Dave was just sent.) And if not, what changes would you
>> like to see?
>
> I think it could definitely use another cleanup and de-#ifdef'ing pass.
> It seems to have inherited the style from the original code and it's a
> bit more than we're used to in mainline.

There are a few places it could be minimized, that's true. It looked
like it might not be worth it, but the places I see are:

include/linux/compiler.h:
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+/* Poison value points to the unused hole in the virtual memory map */
+# define STACKLEAK_POISON -0xBEEF
+# define STACKLEAK_POISON_CHECK_DEPTH 128
+#endif

This doesn't need an #ifdef wrapper...


arch/x86/kernel/process_64.c and arch/x86/kernel/process_32.c:
+#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
+       p->thread.lowest_stack = (unsigned long)task_stack_page(p) +
+                                               2 * sizeof(unsigned long);
+#endif

This could be made into a helper function, maybe, in processor.h? Like:

#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
# define record_lowest_stack(p) do { \
        p->thread.lowest_stack = (unsigned long)task_stack_page(p) +
                                                  2 * sizeof(unsigned long);
    } while (0)
#else
# define save_lowest_stack(p) do { } while (0)
#endif

And the uses in process_*.c would be:

    save_lowest_stack(p);

?


And "fs/proc: Show STACKLEAK metrics in the /proc file system" could
maybe be adjusted too?

It doesn't seem like a lot of savings, but what do you think?

One new thing did pop out at me in this review, track_stack() likely
shouldn't live in fs/exec.c. It has nothing to do with exec(). There
aren't a lot of good places, but maybe a better place would be
mm/util.c. (A whole new source file seems like overkill.)

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-05 19:40   ` Dave Hansen
@ 2018-03-05 20:06     ` Kees Cook
  2018-03-05 20:15       ` Linus Torvalds
  2018-03-05 20:26       ` Peter Zijlstra
  0 siblings, 2 replies; 59+ messages in thread
From: Kees Cook @ 2018-03-05 20:06 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Alexander Popov, Kernel Hardening, PaX Team, Brad Spengler,
	Ingo Molnar, Andy Lutomirski, Tycho Andersen, Laura Abbott,
	Mark Rutland, Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, X86 ML, LKML

On Mon, Mar 5, 2018 at 11:40 AM, Dave Hansen
<dave.hansen@linux.intel.com> wrote:
> On 03/03/2018 12:00 PM, Alexander Popov wrote:
>> @@ -128,6 +134,7 @@ static long syscall_trace_enter(struct pt_regs *regs)
>>
>>       do_audit_syscall_entry(regs, arch);
>>
>> +     erase_kstack();
>>       return ret ?: regs->orig_ax;
>>  }
>
> This seems like an odd place to be clearing the stack.  Why was it done her?

Perhaps the commit log could be improved, but the idea is that the
audit work (ptrace, seccomp, etc), is happening before the syscall
code starts running, and it has therefore written to the stack (that
used to be cleared on last exit). This retains the clear stack state
even in the face of ptrace-ish work happening before the syscall
proper starts.

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-05 20:06     ` Kees Cook
@ 2018-03-05 20:15       ` Linus Torvalds
  2018-03-05 21:02         ` Alexander Popov
  2018-03-05 21:02         ` Kees Cook
  2018-03-05 20:26       ` Peter Zijlstra
  1 sibling, 2 replies; 59+ messages in thread
From: Linus Torvalds @ 2018-03-05 20:15 UTC (permalink / raw)
  To: Kees Cook
  Cc: Dave Hansen, Alexander Popov, Kernel Hardening, PaX Team,
	Brad Spengler, Ingo Molnar, Andy Lutomirski, Tycho Andersen,
	Laura Abbott, Mark Rutland, Ard Biesheuvel, Borislav Petkov,
	Richard Sandiford, Thomas Gleixner, H . Peter Anvin,
	Peter Zijlstra, Dmitry V . Levin, Emese Revfy, Jonathan Corbet,
	Andrey Ryabinin, Kirill A . Shutemov, Thomas Garnier,
	Andrew Morton, Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, X86 ML, LKML

This is the first I see of any of this, it was apparently not actually
posted to lkml or anything like that.

Honestly, what I see just makes me go "this is security-masturbation".

It doesn't actually seem to help *find* bugs at all. As such, it's
another "paper over and forget" thing that just adds fairly high
overhead when it's enabled.

I'm NAK'ing it sight-unseen (see above) just because I'm tired of
these kinds of pointless things that don't actually strive to improve
on the kernel, just add more and more overhead for nebulous "things
may happen", and that just make the code uglier.

Why wasn't it even posted to lkml?

And why isn't the focus of security people on tools to _analyse_ and
find problems?

                  Linus

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 2/7] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
  2018-03-05 19:43     ` Laura Abbott
  2018-03-05 19:50       ` Dave Hansen
@ 2018-03-05 20:25       ` Peter Zijlstra
  2018-03-05 21:21         ` Alexander Popov
  2018-03-21 11:04         ` Alexander Popov
  1 sibling, 2 replies; 59+ messages in thread
From: Peter Zijlstra @ 2018-03-05 20:25 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Dave Hansen, Alexander Popov, kernel-hardening, Kees Cook,
	PaX Team, Brad Spengler, Ingo Molnar, Andy Lutomirski,
	Tycho Andersen, Mark Rutland, Ard Biesheuvel, Borislav Petkov,
	Richard Sandiford, Thomas Gleixner, H . Peter Anvin,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, x86, linux-kernel

On Mon, Mar 05, 2018 at 11:43:19AM -0800, Laura Abbott wrote:
> On 03/05/2018 08:41 AM, Dave Hansen wrote:
> > On 03/03/2018 12:00 PM, Alexander Popov wrote:
> > >   Documentation/x86/x86_64/mm.txt  |   2 +
> > >   arch/Kconfig                     |  27 ++++++++++
> > >   arch/x86/Kconfig                 |   1 +
> > >   arch/x86/entry/entry_32.S        |  88 +++++++++++++++++++++++++++++++
> > >   arch/x86/entry/entry_64.S        | 108 +++++++++++++++++++++++++++++++++++++++
> > >   arch/x86/entry/entry_64_compat.S |  11 ++++
> > 
> > This is a *lot* of assembly.  I wonder if you tried at all to get more
> > of this into C or whether you just inherited the assembly from the
> > original code?
> > 
> 
> This came up previously http://www.openwall.com/lists/kernel-hardening/2017/10/23/5
> there were concerns about trusting C to do the right thing as well as
> speed.

And therefore the answer to this obvious question should've been part of
the Changelog :-)

Dave is last in a long line of people asking this same question.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-05 20:06     ` Kees Cook
  2018-03-05 20:15       ` Linus Torvalds
@ 2018-03-05 20:26       ` Peter Zijlstra
  1 sibling, 0 replies; 59+ messages in thread
From: Peter Zijlstra @ 2018-03-05 20:26 UTC (permalink / raw)
  To: Kees Cook
  Cc: Dave Hansen, Alexander Popov, Kernel Hardening, PaX Team,
	Brad Spengler, Ingo Molnar, Andy Lutomirski, Tycho Andersen,
	Laura Abbott, Mark Rutland, Ard Biesheuvel, Borislav Petkov,
	Richard Sandiford, Thomas Gleixner, H . Peter Anvin,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, X86 ML, LKML

On Mon, Mar 05, 2018 at 12:06:18PM -0800, Kees Cook wrote:
> On Mon, Mar 5, 2018 at 11:40 AM, Dave Hansen
> <dave.hansen@linux.intel.com> wrote:
> > On 03/03/2018 12:00 PM, Alexander Popov wrote:
> >> @@ -128,6 +134,7 @@ static long syscall_trace_enter(struct pt_regs *regs)
> >>
> >>       do_audit_syscall_entry(regs, arch);
> >>
> >> +     erase_kstack();
> >>       return ret ?: regs->orig_ax;
> >>  }
> >
> > This seems like an odd place to be clearing the stack.  Why was it done her?
> 
> Perhaps the commit log could be improved, but the idea is that the
> audit work (ptrace, seccomp, etc), is happening before the syscall
> code starts running, and it has therefore written to the stack (that
> used to be cleared on last exit). This retains the clear stack state
> even in the face of ptrace-ish work happening before the syscall
> proper starts.

I'd suggest a code-comment over a Changelog twiddle. The changelog bit
only helps now, that code comments helps us again in 6 motnhs time when
we've forgotten everything again.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-05 20:15       ` Linus Torvalds
@ 2018-03-05 21:02         ` Alexander Popov
  2018-03-05 21:02         ` Kees Cook
  1 sibling, 0 replies; 59+ messages in thread
From: Alexander Popov @ 2018-03-05 21:02 UTC (permalink / raw)
  To: Linus Torvalds, Kees Cook
  Cc: Dave Hansen, Kernel Hardening, PaX Team, Brad Spengler,
	Ingo Molnar, Andy Lutomirski, Tycho Andersen, Laura Abbott,
	Mark Rutland, Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, X86 ML, LKML

Hello Linus,

Thanks for your reply (despite some strong words).

On 05.03.2018 23:15, Linus Torvalds wrote:
> This is the first I see of any of this, it was apparently not actually
> posted to lkml or anything like that.

I described that below.

> Honestly, what I see just makes me go "this is security-masturbation".

Let me quote the cover letter of this patch series.

STACKLEAK (initially developed by PaX Team):
 - reduces the information that can be revealed through kernel stack leak bugs;
 - blocks some uninitialized stack variable attacks (e.g. CVE-2017-17712,
CVE-2010-2963);
 - introduces some runtime checks for kernel stack overflow detection. It blocks
the Stack Clash attack against the kernel.

So it seems to be a useful feature.

> It doesn't actually seem to help *find* bugs at all. As such, it's
> another "paper over and forget" thing that just adds fairly high
> overhead when it's enabled.

The cover letter also contains the information about the performance impact.
It's 0.6% on the compiling the kernel (with Ubuntu config) and approx 4% on a
very intensive hackbench test.

> I'm NAK'ing it sight-unseen (see above) just because I'm tired of
> these kinds of pointless things that don't actually strive to improve
> on the kernel, just add more and more overhead for nebulous "things
> may happen", and that just make the code uglier.
> 
> Why wasn't it even posted to lkml?

That's my mistake. I started to learn that feature 9 month ago, just before
Qualys published the Stack Clash attack (which is blocked by STACKLEAK). I sent
first WIP versions to a short list of people (and had a lot of feedback to work
with). But later unfortunately I didn't adjust the list of recipients.

That was not done intentionally.

> And why isn't the focus of security people on tools to _analyse_ and
> find problems?

You know, I like KASAN and kernel fuzzing as well:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=82f2341c94d270421f383641b7cd670e474db56b

Best regards,
Alexander

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-05 20:15       ` Linus Torvalds
  2018-03-05 21:02         ` Alexander Popov
@ 2018-03-05 21:02         ` Kees Cook
  2018-03-05 21:40           ` Linus Torvalds
  2018-03-06  8:08           ` Ingo Molnar
  1 sibling, 2 replies; 59+ messages in thread
From: Kees Cook @ 2018-03-05 21:02 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dave Hansen, Alexander Popov, Kernel Hardening, PaX Team,
	Brad Spengler, Ingo Molnar, Andy Lutomirski, Tycho Andersen,
	Laura Abbott, Mark Rutland, Ard Biesheuvel, Borislav Petkov,
	Richard Sandiford, Thomas Gleixner, H . Peter Anvin,
	Peter Zijlstra, Dmitry V . Levin, Emese Revfy, Jonathan Corbet,
	Andrey Ryabinin, Kirill A . Shutemov, Thomas Garnier,
	Andrew Morton, Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, X86 ML, LKML

On Mon, Mar 5, 2018 at 12:15 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> This is the first I see of any of this, it was apparently not actually
> posted to lkml or anything like that.

This series was still RFC and narrowly focused (e.g. we recently had
compiler folks reviewing it). I'd say we're now to the point where
it's getting mature enough for larger review (I commented on the 0/n
patch to this end earlier today).

> It doesn't actually seem to help *find* bugs at all. As such, it's
> another "paper over and forget" thing that just adds fairly high
> overhead when it's enabled.

This specific class of flaw ("uninitialized" stack contents being used
or leaked) is already being looked for by tons of tools like KASan.
There are teams of people working on it, and they still haven't found
all the cases where these flaws appear.

Luckily we don't have to pick one or the other: we can continue to
look for bugs while defending against them ever happening in the first
place. We've already seen multiple cases of this just with the by-ref
initialization plugin, where a stack content leak goes away because we
asked the compiler to please initialize the memory for us when we
forgot to do it ourselves. Getting the compiler to help us seems like
the obviously correct thing to do, since we're using such a
memory-safety-unfriendly language. :)

> I'm NAK'ing it sight-unseen (see above) just because I'm tired of
> these kinds of pointless things that don't actually strive to improve
> on the kernel, just add more and more overhead for nebulous "things
> may happen", and that just make the code uglier.

I hope you'll reconsider. In this case, I think it does improve the
kernel, especially if we can gain more complete coverage through
native compiler options (instead of just a plugin). Right now, for
example, the kernel is littered with memset()s because the compiler
can't be trusted to correctly zero-init padding, etc. This is an
endless source of bugs, and this patch series provides a comprehensive
and fast way to keep the stack cleared. (See the benchmark I asked for
that compared this 1% "clear only what was used" against the 30% to
wipe the entire stack every time. The latter is obviously insane, and
the former is quite clever and kills an entire bug class.)

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 2/7] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
  2018-03-05 20:25       ` Peter Zijlstra
@ 2018-03-05 21:21         ` Alexander Popov
  2018-03-05 21:36           ` Kees Cook
  2018-03-21 11:04         ` Alexander Popov
  1 sibling, 1 reply; 59+ messages in thread
From: Alexander Popov @ 2018-03-05 21:21 UTC (permalink / raw)
  To: Peter Zijlstra, Laura Abbott
  Cc: Dave Hansen, kernel-hardening, Kees Cook, PaX Team,
	Brad Spengler, Ingo Molnar, Andy Lutomirski, Tycho Andersen,
	Mark Rutland, Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Dmitry V . Levin, Emese Revfy,
	Jonathan Corbet, Andrey Ryabinin, Kirill A . Shutemov,
	Thomas Garnier, Andrew Morton, Alexei Starovoitov, Josef Bacik,
	Masami Hiramatsu, Nicholas Piggin, Al Viro, David S . Miller,
	Ding Tianhong, David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, x86, linux-kernel

On 05.03.2018 23:25, Peter Zijlstra wrote:
> On Mon, Mar 05, 2018 at 11:43:19AM -0800, Laura Abbott wrote:
>> On 03/05/2018 08:41 AM, Dave Hansen wrote:
>>> On 03/03/2018 12:00 PM, Alexander Popov wrote:
>>>>   Documentation/x86/x86_64/mm.txt  |   2 +
>>>>   arch/Kconfig                     |  27 ++++++++++
>>>>   arch/x86/Kconfig                 |   1 +
>>>>   arch/x86/entry/entry_32.S        |  88 +++++++++++++++++++++++++++++++
>>>>   arch/x86/entry/entry_64.S        | 108 +++++++++++++++++++++++++++++++++++++++
>>>>   arch/x86/entry/entry_64_compat.S |  11 ++++
>>>
>>> This is a *lot* of assembly.  I wonder if you tried at all to get more
>>> of this into C or whether you just inherited the assembly from the
>>> original code?
>>>
>>
>> This came up previously http://www.openwall.com/lists/kernel-hardening/2017/10/23/5
>> there were concerns about trusting C to do the right thing as well as
>> speed.
> 
> And therefore the answer to this obvious question should've been part of
> the Changelog :-)
> 
> Dave is last in a long line of people asking this same question.

Yes, actually the changelog in the cover letter contains that:

  After some experiments, kept the asm implementation of erase_kstack(),
  because it gives a full control over the stack for clearing it neatly
  and doesn't offend KASAN.

Moreover, later erase_kstack() on x86_64 became different from one on x86_32.

Best regards,
Alexander

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 2/7] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
  2018-03-05 21:21         ` Alexander Popov
@ 2018-03-05 21:36           ` Kees Cook
  0 siblings, 0 replies; 59+ messages in thread
From: Kees Cook @ 2018-03-05 21:36 UTC (permalink / raw)
  To: Alexander Popov
  Cc: Peter Zijlstra, Laura Abbott, Dave Hansen, Kernel Hardening,
	PaX Team, Brad Spengler, Ingo Molnar, Andy Lutomirski,
	Tycho Andersen, Mark Rutland, Ard Biesheuvel, Borislav Petkov,
	Richard Sandiford, Thomas Gleixner, H . Peter Anvin,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, X86 ML, LKML

On Mon, Mar 5, 2018 at 1:21 PM, Alexander Popov <alex.popov@linux.com> wrote:
> On 05.03.2018 23:25, Peter Zijlstra wrote:
>> On Mon, Mar 05, 2018 at 11:43:19AM -0800, Laura Abbott wrote:
>>> On 03/05/2018 08:41 AM, Dave Hansen wrote:
>>>> On 03/03/2018 12:00 PM, Alexander Popov wrote:
>>>>>   Documentation/x86/x86_64/mm.txt  |   2 +
>>>>>   arch/Kconfig                     |  27 ++++++++++
>>>>>   arch/x86/Kconfig                 |   1 +
>>>>>   arch/x86/entry/entry_32.S        |  88 +++++++++++++++++++++++++++++++
>>>>>   arch/x86/entry/entry_64.S        | 108 +++++++++++++++++++++++++++++++++++++++
>>>>>   arch/x86/entry/entry_64_compat.S |  11 ++++
>>>>
>>>> This is a *lot* of assembly.  I wonder if you tried at all to get more
>>>> of this into C or whether you just inherited the assembly from the
>>>> original code?
>>>>
>>>
>>> This came up previously http://www.openwall.com/lists/kernel-hardening/2017/10/23/5
>>> there were concerns about trusting C to do the right thing as well as
>>> speed.
>>
>> And therefore the answer to this obvious question should've been part of
>> the Changelog :-)
>>
>> Dave is last in a long line of people asking this same question.
>
> Yes, actually the changelog in the cover letter contains that:
>
>   After some experiments, kept the asm implementation of erase_kstack(),
>   because it gives a full control over the stack for clearing it neatly
>   and doesn't offend KASAN.
>
> Moreover, later erase_kstack() on x86_64 became different from one on x86_32.

Maybe explicitly mention the C experiments in future change log?

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-05 21:02         ` Kees Cook
@ 2018-03-05 21:40           ` Linus Torvalds
  2018-03-05 22:07             ` Linus Torvalds
  2018-03-06  0:56             ` Kees Cook
  2018-03-06  8:08           ` Ingo Molnar
  1 sibling, 2 replies; 59+ messages in thread
From: Linus Torvalds @ 2018-03-05 21:40 UTC (permalink / raw)
  To: Kees Cook
  Cc: Dave Hansen, Alexander Popov, Kernel Hardening, PaX Team,
	Brad Spengler, Ingo Molnar, Andy Lutomirski, Tycho Andersen,
	Laura Abbott, Mark Rutland, Ard Biesheuvel, Borislav Petkov,
	Richard Sandiford, Thomas Gleixner, H . Peter Anvin,
	Peter Zijlstra, Dmitry V . Levin, Emese Revfy, Jonathan Corbet,
	Andrey Ryabinin, Kirill A . Shutemov, Thomas Garnier,
	Andrew Morton, Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, X86 ML, LKML

On Mon, Mar 5, 2018 at 1:02 PM, Kees Cook <keescook@chromium.org> wrote:
> This specific class of flaw ("uninitialized" stack contents being used
> or leaked) is already being looked for by tons of tools like KASan.
> There are teams of people working on it, and they still haven't found
> all the cases where these flaws appear.

Right.

So let's do the *smart* thing, not the *stupid* thing.

This "mindlessly clear stack after use" is stupid.

There are smart things we can do, and it's not just about "find the
problems" like KASAN, but also "avoid undefined behavior".

I absolutely detest undefined compiler behavior. We should fix it. One
of the biggest mistakes C ever did was to have "undefined" in the
standard, and we already obviously limit that kind of broken behavior
with -fwrapv and -fno-strict-alias.

Both of those disable what I consider to be just bugs in the C
standard that should not be allowed for system software - it's a class
of "stupid compiler tricks to generate bad code", which by definition
a compiler shouldn't do. A compiler should generate *good* code, not
random bad code.

And yes:

> We've already seen multiple cases of this just with the by-ref
> initialization plugin, where a stack content leak goes away because we
> asked the compiler to please initialize the memory for us when we
> forgot to do it ourselves. Getting the compiler to help us seems like
> the obviously correct thing to do, since we're using such a
> memory-safety-unfriendly language. :)

This is more of the *smart* kind of behavior - I'm also perfectly
willing to say that automatic variables should just always initialize
to zero, exactly the same way static variables do.

And it doesn't necessarily generate any worse code.

Why? Because it's the _intelligent_ thing to do - unlike some
completely mindless idiotic "clear the stack" patch.

Now, I haven't actually seen the patches, I've only seen signs of
them, but the signs I have seen very much seem to say "this is the
mindless and *stupid* kind of crap that we should not do".

Things like adding hundreds of lines of new asm code, just because.

So by all means, push the zero initializations of automatic variables.
That's just good sense.

But don't push this crap.

              Linus

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-05 21:40           ` Linus Torvalds
@ 2018-03-05 22:07             ` Linus Torvalds
  2018-03-06  0:56             ` Kees Cook
  1 sibling, 0 replies; 59+ messages in thread
From: Linus Torvalds @ 2018-03-05 22:07 UTC (permalink / raw)
  To: Kees Cook
  Cc: Dave Hansen, Alexander Popov, Kernel Hardening, PaX Team,
	Brad Spengler, Ingo Molnar, Andy Lutomirski, Tycho Andersen,
	Laura Abbott, Mark Rutland, Ard Biesheuvel, Borislav Petkov,
	Richard Sandiford, Thomas Gleixner, H . Peter Anvin,
	Peter Zijlstra, Dmitry V . Levin, Emese Revfy, Jonathan Corbet,
	Andrey Ryabinin, Kirill A . Shutemov, Thomas Garnier,
	Andrew Morton, Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, X86 ML, LKML

On Mon, Mar 5, 2018 at 1:40 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> This is more of the *smart* kind of behavior - I'm also perfectly
> willing to say that automatic variables should just always initialize
> to zero, exactly the same way static variables do.

.. and perhaps even a build flag or plugin to make sure that the
compiler doesn't remove "dead" writes to those automatic variables
that got allocated on the stack?

Honestly, with clearing of automatic variables, what stack leaks
really exists in practice that this all would help against?

                          Linus

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-05 21:40           ` Linus Torvalds
  2018-03-05 22:07             ` Linus Torvalds
@ 2018-03-06  0:56             ` Kees Cook
  2018-03-06  4:30               ` Linus Torvalds
  2018-03-06  7:56               ` [OLD PATCH] net: recvmsg: Unconditionally zero struct sockaddr_storage " Ingo Molnar
  1 sibling, 2 replies; 59+ messages in thread
From: Kees Cook @ 2018-03-06  0:56 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Dave Hansen, Alexander Popov, Kernel Hardening, PaX Team,
	Brad Spengler, Ingo Molnar, Andy Lutomirski, Tycho Andersen,
	Laura Abbott, Mark Rutland, Ard Biesheuvel, Borislav Petkov,
	Richard Sandiford, Thomas Gleixner, H . Peter Anvin,
	Peter Zijlstra, Dmitry V . Levin, Emese Revfy, Jonathan Corbet,
	Andrey Ryabinin, Kirill A . Shutemov, Thomas Garnier,
	Andrew Morton, Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, X86 ML, LKML

On Mon, Mar 5, 2018 at 1:40 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> This "mindlessly clear stack after use" is stupid.

In defense of the series, it's hardly "mindless". :) The primary
feature is that it has run-time tracking of stack depth to clear only
the minimum needed portion of the stack.

> There are smart things we can do, and it's not just about "find the
> problems" like KASAN, but also "avoid undefined behavior".
>
> I absolutely detest undefined compiler behavior. We should fix it. One
> of the biggest mistakes C ever did was to have "undefined" in the
> standard, and we already obviously limit that kind of broken behavior
> with -fwrapv and -fno-strict-alias.

And -fno-delete-null-pointer-checks, and and and.... :P

What we've done, traditionally, has always been two pronged: fix the
kernel source and fix the compiler. Our kernel fixes have been short
term (fixing specific instances where we notice a problem), with the
compiler fix becoming the global solution down the road once everyone
has that version of the compiler. This leaves a defense gap for bugs
we haven't found yet (which are actually present, whether _we_ know
about them or not :P).

The recent discussions on minimum compiler version underscore the fact
that people move forward on compilers _very_ slowly. I've been trying
to add a third prong (with many of these kinds of defenses), where we
can address the gap. The first two prongs remain: fix the specific
cases as they're uncovered (e.g. by KASan), and fix the global problem
with the compiler (I recently detailed[1] four specific features I
wanted to see from compilers on this front last week). Then the added
third prong is: provide wide coverage _now_ for those that don't have
a fixed compiler (especially when no such fix even exists right now)
to catch all the cases we haven't found yet.

> This is more of the *smart* kind of behavior - I'm also perfectly
> willing to say that automatic variables should just always initialize
> to zero, exactly the same way static variables do.
>
> And it doesn't necessarily generate any worse code.

I agree, though some performance-sensitive subsystem (e.g. networking)
get very defensive about an always-on stack initialization[2]. No
matter what happens with this kind of automatic initialization, I
suspect it's going to have to stay a build-time option to let some
people opt-out of it.

> Honestly, with clearing of automatic variables, what stack leaks
> really exists in practice that this all would help against?

As we both know, we have very different ideas about what "in practice"
means for security flaws. :) So, yes, while auto-init gets us coverage
for a large portion of stack content leak bugs, it's still temporally
different from clearing the stack on exit. For example, a stack read
flaw with a negative index can read out the prior syscall's deeper
stack contents. Stack-clearing also reduces the lifetime of stack
contents (e.g. in the case of cross-thread reads from another process,
the time for the race to read the stack is longer). While these are
certainly more rare cases, they do exist, and I've seen much weirder
attacks.

Another case is that this series provides actual stack probing to
detect VLA abuse. This is less of an issue now with VMAP_STACK, and
I've had VLA removal on the long-term goal list for the kernel for a
while now, but the probing does work...

I would love to see (and am already pursuing) auto-init (see [1]
again), but this series does provide additional coverage, and it does
it today.

-Kees

[1] http://www.openwall.com/lists/kernel-hardening/2018/02/27/41
[2] Both these cases, and so many more, are solved with the byref
initialization plugin, but have been NAKed by -net:
     https://lkml.org/lkml/2013/4/9/641
     https://lkml.org/lkml/2017/10/31/699

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-06  0:56             ` Kees Cook
@ 2018-03-06  4:30               ` Linus Torvalds
  2018-03-06 17:58                 ` Andy Lutomirski
  2018-03-06  7:56               ` [OLD PATCH] net: recvmsg: Unconditionally zero struct sockaddr_storage " Ingo Molnar
  1 sibling, 1 reply; 59+ messages in thread
From: Linus Torvalds @ 2018-03-06  4:30 UTC (permalink / raw)
  To: Kees Cook
  Cc: Dave Hansen, Alexander Popov, Kernel Hardening, PaX Team,
	Brad Spengler, Ingo Molnar, Andy Lutomirski, Tycho Andersen,
	Laura Abbott, Mark Rutland, Ard Biesheuvel, Borislav Petkov,
	Richard Sandiford, Thomas Gleixner, H . Peter Anvin,
	Peter Zijlstra, Dmitry V . Levin, Emese Revfy, Jonathan Corbet,
	Andrey Ryabinin, Kirill A . Shutemov, Thomas Garnier,
	Andrew Morton, Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, X86 ML, LKML

On Mon, Mar 5, 2018 at 4:56 PM, Kees Cook <keescook@chromium.org> wrote:
> On Mon, Mar 5, 2018 at 1:40 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>  and we already obviously limit that kind of broken behavior
>> with -fwrapv and -fno-strict-alias.
>
> And -fno-delete-null-pointer-checks, and and and.... :P

Indeed.

> The recent discussions on minimum compiler version underscore the fact
> that people move forward on compilers _very_ slowly.

Yes.

At the same time, things like that are kind of like a config option
for hardening - it's not like everybody absolutely needs to have the
compiler support, but people who want it can.

Sure, it limits us in some ways (ie we can't just say "we _depend_ on
automatic variables being zero"), but as long as it's not a huge
maintenance burden, it probably doesn't much matter.


>> And it doesn't necessarily generate any worse code.
>
> I agree, though some performance-sensitive subsystem (e.g. networking)
> get very defensive about an always-on stack initialization[2]. No
> matter what happens with this kind of automatic initialization, I
> suspect it's going to have to stay a build-time option to let some
> people opt-out of it.

I do think that having some way of marking "this variable really
doesn't need zeroing" would be fine.

Then peope'd *think* about the fact that you're passing an actual
uninitialized piece of memory around, and people could be careful with
them.

And the places that would actually want that should show up like a
sore thumb in a profile. And if they don't, then clearly the zeroing
can't be much of an issue, can it?

> Another case is that this series provides actual stack probing to
> detect VLA abuse. This is less of an issue now with VMAP_STACK, and
> I've had VLA removal on the long-term goal list for the kernel for a
> while now, but the probing does work...

I was actually hoping that the clang people would get rid of those,
but they only seemed to care about VLA's in structures, not about them
in general ;(

I detest VLA's, we really shouldn't use them. I'm sorry we have any.

              Linus

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [OLD PATCH] net: recvmsg: Unconditionally zero struct sockaddr_storage Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-06  0:56             ` Kees Cook
  2018-03-06  4:30               ` Linus Torvalds
@ 2018-03-06  7:56               ` Ingo Molnar
  1 sibling, 0 replies; 59+ messages in thread
From: Ingo Molnar @ 2018-03-06  7:56 UTC (permalink / raw)
  To: Kees Cook, David S. Miller, Eric Dumazet
  Cc: Linus Torvalds, Dave Hansen, Alexander Popov, Kernel Hardening,
	PaX Team, Brad Spengler, Andy Lutomirski, Tycho Andersen,
	Laura Abbott, Mark Rutland, Ard Biesheuvel, Borislav Petkov,
	Richard Sandiford, Thomas Gleixner, H . Peter Anvin,
	Peter Zijlstra, Dmitry V . Levin, Emese Revfy, Jonathan Corbet,
	Andrey Ryabinin, Kirill A . Shutemov, Thomas Garnier,
	Andrew Morton, Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, X86 ML, LKML


* Kees Cook <keescook@chromium.org> wrote:

> In defense of the series, it's hardly "mindless". :) The primary
> feature is that it has run-time tracking of stack depth to clear only
> the minimum needed portion of the stack.

The stack-tracer has been able to do that for years, right?

> > And it doesn't necessarily generate any worse code.
> 
> I agree, though some performance-sensitive subsystem (e.g. networking)
> get very defensive about an always-on stack initialization[2].

> [2] Both these cases, and so many more, are solved with the byref
> initialization plugin, but have been NAKed by -net:
>      https://lkml.org/lkml/2013/4/9/641
>      https://lkml.org/lkml/2017/10/31/699

> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -2188,6 +2188,7 @@ static int ___sys_recvmsg(struct socket *sock, struct user_msghdr __user *msg,
> 	struct sockaddr __user *uaddr;
> 	int __user *uaddr_len = COMPAT_NAMELEN(msg);
> 
> +	memset(&addr, 0, sizeof(addr));
> 	msg_sys->msg_name = &addr;
> 
> 	if (MSG_CMSG_COMPAT & flags)

In defense of DaveM and Eric, that networking patch is just pure, utter garbage 
and I'll NACK it just as much: it adds an unconditional 128-byte memset() to a hot 
path!!

  NACKed-by: Ingo Molnar <mingo@kernel.org>

The changelog is also infuriatingly misleading:

> Some protocols do not correctly wipe the contents of the on-stack
> struct sockaddr_storage sent down into recvmsg() (e.g. SCTP), and leak
> kernel stack contents to userspace. This wipes it unconditionally before
> per-protocol handlers run.
>
> Note that leaks like this are mitigated by building with
> CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL=y

It's just a scary, passive-aggressive lie about "some protocols" and ignores the 
desired case where all protocol handlers are correctly implemented.

Did you *really* expect this patch to be applied? The on-stack struct clearing GCC 
plugins (CONFIG_GCC_PLUGIN_STRUCTLEAK*=y) are a nice feature which fix a bad 
oversight in the C standard, but this particular unconditional memset() patch is 
just garbage.

Also, this characterization of the patch review process of the networking 
subsystem:

> though some performance-sensitive subsystem (e.g. networking)
> get very defensive about an always-on stack initialization[2].

... is thus very unfair as well: they didn't NAK or resist the 
CONFIG_GCC_PLUGIN_STRUCTLEAK* compiler feature at all, they NAK-ed a poorly 
thought out, unconditional memset() which adds unconditional overhead, which is 
their job to NAK!

Please refrain from using such unfair pressure tactics to get harebrained security 
patches upstream...

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-05 21:02         ` Kees Cook
  2018-03-05 21:40           ` Linus Torvalds
@ 2018-03-06  8:08           ` Ingo Molnar
  2018-03-06 15:16             ` Daniel Micay
  1 sibling, 1 reply; 59+ messages in thread
From: Ingo Molnar @ 2018-03-06  8:08 UTC (permalink / raw)
  To: Kees Cook
  Cc: Linus Torvalds, Dave Hansen, Alexander Popov, Kernel Hardening,
	PaX Team, Brad Spengler, Andy Lutomirski, Tycho Andersen,
	Laura Abbott, Mark Rutland, Ard Biesheuvel, Borislav Petkov,
	Richard Sandiford, Thomas Gleixner, H . Peter Anvin,
	Peter Zijlstra, Dmitry V . Levin, Emese Revfy, Jonathan Corbet,
	Andrey Ryabinin, Kirill A . Shutemov, Thomas Garnier,
	Andrew Morton, Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, X86 ML, LKML


* Kees Cook <keescook@chromium.org> wrote:

> [...] We've already seen multiple cases of this just with the by-ref 
> initialization plugin, where a stack content leak goes away because we asked the 
> compiler to please initialize the memory for us when we forgot to do it 
> ourselves. Getting the compiler to help us seems like the obviously correct 
> thing to do, since we're using such a memory-safety-unfriendly language. :)

So the question is, are there any known classes of stack content leak that the 
following .config options:

  CONFIG_GCC_PLUGIN_STRUCTLEAK=y
  CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL=y

... do not handle, and for which leaks the best solution is such runtime stack 
clearing?

Because the GCC plugins clearing automatic local variables are _way_ superior:

- it's probably a heck of a lot cheaper to clear structs than it is to clear the 
  stack in every system call...

- it's probably also safer statistically, because if there _is_ a reference to an 
  uninitialized piece of memory in the kernel it will be to zero, not to a user 
  controlled value...

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-06  8:08           ` Ingo Molnar
@ 2018-03-06 15:16             ` Daniel Micay
  2018-03-06 15:28               ` Daniel Micay
  2018-03-06 18:56               ` Linus Torvalds
  0 siblings, 2 replies; 59+ messages in thread
From: Daniel Micay @ 2018-03-06 15:16 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Kees Cook, Linus Torvalds, Dave Hansen, Alexander Popov,
	Kernel Hardening, PaX Team, Brad Spengler, Andy Lutomirski,
	Tycho Andersen, Laura Abbott, Mark Rutland, Ard Biesheuvel,
	Borislav Petkov, Richard Sandiford, Thomas Gleixner,
	H . Peter Anvin, Peter Zijlstra, Dmitry V . Levin, Emese Revfy,
	Jonathan Corbet, Andrey Ryabinin, Kirill A . Shutemov,
	Thomas Garnier, Andrew Morton, Alexei Starovoitov, Josef Bacik,
	Masami Hiramatsu, Nicholas Piggin, Al Viro, David S . Miller,
	Ding Tianhong, David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, X86 ML, LKML

There are cases that clearing the stack can cover but I don't think it
has high importance if all locals were being default initialized to
zero. The new stack will end up with gaps from alignment, variables
that weren't fully used, etc. where data from an old stack is present.
Unless there's something really important left over like a private
key, it shouldn't make issues like a before read overflow on the stack
much worse.

There's an important missing feature which does get covered by this
plugin: variable-length arrays can skip past the VMAP_STACK guard
pages (as could alloca / large frames if they were actually used). The
ideal would probably be both GCC and Clang providing a working, safe
implementation of stack probes to enable with VMAP_STACK but they
don't do that. I'm not sure it makes sense for the kernel to be
working around that downstream though. It's strange because Clang has
had a working implementation on Windows for ages where the ABI
requires it but for some reason they never provided the option
elsewhere.


Zeroing locals isn't very expensive because the compiler has a chance
to optimize it out when it can figure out it isn't needed and they're
almost always used. If there are cases it hurts, it might be possible
to fix that by inlining an initialization function so it can see the
initialization. From my tests doing it in the entirety of the kernel
and Android userspace via a Clang patch (which we use in production),
SSP has a much more significant cost. Filling them with a non-zero
value is also a nice way to find bugs and I think that will still be
true with KMSan available via Clang. There might be cases where
there's a large array where the compiler can't remove the zeroing and
the call is pretty hot... but we didn't run into any substantial
performance costs, although for our niche we only really care about UI
latency, battery life, etc. and losing 1-2% throughput for some things
is irrelevant.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-06 15:16             ` Daniel Micay
@ 2018-03-06 15:28               ` Daniel Micay
  2018-03-06 18:56               ` Linus Torvalds
  1 sibling, 0 replies; 59+ messages in thread
From: Daniel Micay @ 2018-03-06 15:28 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Kees Cook, Linus Torvalds, Dave Hansen, Alexander Popov,
	Kernel Hardening, PaX Team, Brad Spengler, Andy Lutomirski,
	Tycho Andersen, Laura Abbott, Mark Rutland, Ard Biesheuvel,
	Borislav Petkov, Richard Sandiford, Thomas Gleixner,
	H . Peter Anvin, Peter Zijlstra, Dmitry V . Levin, Emese Revfy,
	Jonathan Corbet, Andrey Ryabinin, Kirill A . Shutemov,
	Thomas Garnier, Andrew Morton, Alexei Starovoitov, Josef Bacik,
	Masami Hiramatsu, Nicholas Piggin, Al Viro, David S . Miller,
	Ding Tianhong, David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, X86 ML, LKML

The kernel could use -Werror=vla-larger-than=2048 as a stop-gap and
work around all the errors by making sure it can't be larger and
reporting an error or adding `BUG_ON(size > limit)` to prove it to the
compiler if necessary. It warns if the VLA *can be* larger than the
limit, unlike the frame size warning the kernel uses only warning when
the frame is *guaranteed* to be larger than the limit. It wouldn't be
a guaranteed to work solution since a function can have other VLAs,
etc. but in practice it could probably be good enough. I've tried it
and I don't think it would be that much work. It might save someone
from a vulnerability that isn't using VMAP_STACK even if the issue
gets fixed for that.

Android kernels are likely mostly going to be compiled with Clang now
for 4.4+ and it doesn't have a similar warning though. I mentioned
last year that there were some real vulnerabilities caused by
unbounded VLAs:
http://openwall.com/lists/kernel-hardening/2017/05/12/33. It's fairly
easy to flip up and accidentally use one especially for people that
work with other languages. Maybe the kernel should already be using
-Werror=vla and overriding it for the few cases that exist to stop
letting people introduce more.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-06  4:30               ` Linus Torvalds
@ 2018-03-06 17:58                 ` Andy Lutomirski
  0 siblings, 0 replies; 59+ messages in thread
From: Andy Lutomirski @ 2018-03-06 17:58 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kees Cook, Dave Hansen, Alexander Popov, Kernel Hardening,
	PaX Team, Brad Spengler, Ingo Molnar, Andy Lutomirski,
	Tycho Andersen, Laura Abbott, Mark Rutland, Ard Biesheuvel,
	Borislav Petkov, Richard Sandiford, Thomas Gleixner,
	H . Peter Anvin, Peter Zijlstra, Dmitry V . Levin, Emese Revfy,
	Jonathan Corbet, Andrey Ryabinin, Kirill A . Shutemov,
	Thomas Garnier, Andrew Morton, Alexei Starovoitov, Josef Bacik,
	Masami Hiramatsu, Nicholas Piggin, Al Viro, David S . Miller,
	Ding Tianhong, David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, X86 ML, LKML

On Tue, Mar 6, 2018 at 4:30 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Mon, Mar 5, 2018 at 4:56 PM, Kees Cook <keescook@chromium.org> wrote:
>> On Mon, Mar 5, 2018 at 1:40 PM, Linus Torvalds
>> <torvalds@linux-foundation.org> wrote:
>
> I was actually hoping that the clang people would get rid of those,
> but they only seemed to care about VLA's in structures, not about them
> in general ;(
>
> I detest VLA's, we really shouldn't use them. I'm sorry we have any.

Then we need to fix our abysmal synchronous crypto API, which, AFAICT,
is the primary user of on-stack VLAs in the kernel.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-06 15:16             ` Daniel Micay
  2018-03-06 15:28               ` Daniel Micay
@ 2018-03-06 18:56               ` Linus Torvalds
  2018-03-06 19:07                 ` Peter Zijlstra
  2018-03-06 19:07                 ` Ard Biesheuvel
  1 sibling, 2 replies; 59+ messages in thread
From: Linus Torvalds @ 2018-03-06 18:56 UTC (permalink / raw)
  To: Daniel Micay
  Cc: Ingo Molnar, Kees Cook, Dave Hansen, Alexander Popov,
	Kernel Hardening, PaX Team, Brad Spengler, Andy Lutomirski,
	Tycho Andersen, Laura Abbott, Mark Rutland, Ard Biesheuvel,
	Borislav Petkov, Richard Sandiford, Thomas Gleixner,
	H . Peter Anvin, Peter Zijlstra, Dmitry V . Levin, Emese Revfy,
	Jonathan Corbet, Andrey Ryabinin, Kirill A . Shutemov,
	Thomas Garnier, Andrew Morton, Alexei Starovoitov, Josef Bacik,
	Masami Hiramatsu, Nicholas Piggin, Al Viro, David S . Miller,
	Ding Tianhong, David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, X86 ML, LKML

On Tue, Mar 6, 2018 at 7:16 AM, Daniel Micay <danielmicay@gmail.com> wrote:
>
> Zeroing locals isn't very expensive because the compiler has a chance
> to optimize it out when it can figure out it isn't needed and they're
> almost always used.

I thought I saw Kees with some stats, and it wasn't even noticeable.
But maybe I mis-remember.

I definitely like the "zero local variables", and would actually
prefer something even stronger than GCC_PLUGIN_STRUCTLEAK_BYREF_ALL
(although I can't actually read gcc plugins, so I'm just going by
name).

It's not necessarily just struct, it can very much be arrays and
regular automatic variables too.

And yes, we might want to have some override switch for individual
variables ("I really know what I'm doing and I promise I fill it
myself, but the compiler doesn't understand me, look here:").

                Linus

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-06 18:56               ` Linus Torvalds
@ 2018-03-06 19:07                 ` Peter Zijlstra
  2018-03-06 19:07                 ` Ard Biesheuvel
  1 sibling, 0 replies; 59+ messages in thread
From: Peter Zijlstra @ 2018-03-06 19:07 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Daniel Micay, Ingo Molnar, Kees Cook, Dave Hansen,
	Alexander Popov, Kernel Hardening, PaX Team, Brad Spengler,
	Andy Lutomirski, Tycho Andersen, Laura Abbott, Mark Rutland,
	Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Dmitry V . Levin, Emese Revfy,
	Jonathan Corbet, Andrey Ryabinin, Kirill A . Shutemov,
	Thomas Garnier, Andrew Morton, Alexei Starovoitov, Josef Bacik,
	Masami Hiramatsu, Nicholas Piggin, Al Viro, David S . Miller,
	Ding Tianhong, David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, X86 ML, LKML

On Tue, Mar 06, 2018 at 10:56:58AM -0800, Linus Torvalds wrote:
> On Tue, Mar 6, 2018 at 7:16 AM, Daniel Micay <danielmicay@gmail.com> wrote:
> >
> > Zeroing locals isn't very expensive because the compiler has a chance
> > to optimize it out when it can figure out it isn't needed and they're
> > almost always used.
> 
> I thought I saw Kees with some stats, and it wasn't even noticeable.
> But maybe I mis-remember.
> 
> I definitely like the "zero local variables", and would actually
> prefer something even stronger than GCC_PLUGIN_STRUCTLEAK_BYREF_ALL
> (although I can't actually read gcc plugins, so I'm just going by
> name).
> 
> It's not necessarily just struct, it can very much be arrays and
> regular automatic variables too.
> 
> And yes, we might want to have some override switch for individual
> variables ("I really know what I'm doing and I promise I fill it
> myself, but the compiler doesn't understand me, look here:").

Right, I have one case in perf where we very carefully pass a partially
initialized structure around for performance raisins. So I would
definitely like to have that attribute that allows us to over-ride that
BYREF_ALL thing.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-06 18:56               ` Linus Torvalds
  2018-03-06 19:07                 ` Peter Zijlstra
@ 2018-03-06 19:07                 ` Ard Biesheuvel
  2018-03-06 19:16                   ` Linus Torvalds
  1 sibling, 1 reply; 59+ messages in thread
From: Ard Biesheuvel @ 2018-03-06 19:07 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Daniel Micay, Ingo Molnar, Kees Cook, Dave Hansen,
	Alexander Popov, Kernel Hardening, PaX Team, Brad Spengler,
	Andy Lutomirski, Tycho Andersen, Laura Abbott, Mark Rutland,
	Borislav Petkov, Richard Sandiford, Thomas Gleixner,
	H . Peter Anvin, Peter Zijlstra, Dmitry V . Levin, Emese Revfy,
	Jonathan Corbet, Andrey Ryabinin, Kirill A . Shutemov,
	Thomas Garnier, Andrew Morton, Alexei Starovoitov, Josef Bacik,
	Masami Hiramatsu, Nicholas Piggin, Al Viro, David S . Miller,
	Ding Tianhong, David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, X86 ML, LKML

On 6 March 2018 at 18:56, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Tue, Mar 6, 2018 at 7:16 AM, Daniel Micay <danielmicay@gmail.com> wrote:
>>
>> Zeroing locals isn't very expensive because the compiler has a chance
>> to optimize it out when it can figure out it isn't needed and they're
>> almost always used.
>
> I thought I saw Kees with some stats, and it wasn't even noticeable.
> But maybe I mis-remember.
>
> I definitely like the "zero local variables", and would actually
> prefer something even stronger than GCC_PLUGIN_STRUCTLEAK_BYREF_ALL
> (although I can't actually read gcc plugins, so I'm just going by
> name).
>
> It's not necessarily just struct, it can very much be arrays and
> regular automatic variables too.
>

The compiler usually does a pretty good job of detecting which scalar
variables are never initialized by regular assignment. The reason that
this is different for struct type variables is that we often
initialize them by passing a reference to a function, and if this
function is in another compilation unit, the compiler doesn't know
whether such a call amounts to an initialization or not.

We could easily extend this to scalar and array types, but we'd first
need to see what the performance impact is, because I don't think it
will be negligible.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-06 19:07                 ` Ard Biesheuvel
@ 2018-03-06 19:16                   ` Linus Torvalds
  2018-03-06 20:42                     ` Arnd Bergmann
  0 siblings, 1 reply; 59+ messages in thread
From: Linus Torvalds @ 2018-03-06 19:16 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Daniel Micay, Ingo Molnar, Kees Cook, Dave Hansen,
	Alexander Popov, Kernel Hardening, PaX Team, Brad Spengler,
	Andy Lutomirski, Tycho Andersen, Laura Abbott, Mark Rutland,
	Borislav Petkov, Richard Sandiford, Thomas Gleixner,
	H . Peter Anvin, Peter Zijlstra, Dmitry V . Levin, Emese Revfy,
	Jonathan Corbet, Andrey Ryabinin, Kirill A . Shutemov,
	Thomas Garnier, Andrew Morton, Alexei Starovoitov, Josef Bacik,
	Masami Hiramatsu, Nicholas Piggin, Al Viro, David S . Miller,
	Ding Tianhong, David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, X86 ML, LKML

On Tue, Mar 6, 2018 at 11:07 AM, Ard Biesheuvel
<ard.biesheuvel@linaro.org> wrote:
>
> The compiler usually does a pretty good job of detecting which scalar
> variables are never initialized by regular assignment.

Sure, but "usually" is not really the same as always. Sometimes scalar
types are initialized by passing a reference to them too.

> We could easily extend this to scalar and array types, but we'd first
> need to see what the performance impact is, because I don't think it
> will be negligible.

For scalar types, I suspect it will be entirely unnoticeable, because
they are not only small, but it's rare that this kind of "initialize
by passing a reference" happens in the first place.

For arrays, I agree. We very well may have arrays that we really want
to do magic things about. But even then I'd rather have a "don't
initialize this" flag for critical stuff that really *does* get
initialized some other way. Then we can grep for those things and be
more careful.

If somebody has big arrays on the stack, that's often a problem
anyway. It may be common in non-kernel code, but kernel code is very
special.

                    Linus

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-06 19:16                   ` Linus Torvalds
@ 2018-03-06 20:42                     ` Arnd Bergmann
  2018-03-06 21:01                       ` Linus Torvalds
  0 siblings, 1 reply; 59+ messages in thread
From: Arnd Bergmann @ 2018-03-06 20:42 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ard Biesheuvel, Daniel Micay, Ingo Molnar, Kees Cook,
	Dave Hansen, Alexander Popov, Kernel Hardening, PaX Team,
	Brad Spengler, Andy Lutomirski, Tycho Andersen, Laura Abbott,
	Mark Rutland, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, X86 ML, LKML

On Tue, Mar 6, 2018 at 8:16 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Tue, Mar 6, 2018 at 11:07 AM, Ard Biesheuvel
> <ard.biesheuvel@linaro.org> wrote:
>>
>> The compiler usually does a pretty good job of detecting which scalar
>> variables are never initialized by regular assignment.
>
> Sure, but "usually" is not really the same as always. Sometimes scalar
> types are initialized by passing a reference to them too.
>
>> We could easily extend this to scalar and array types, but we'd first
>> need to see what the performance impact is, because I don't think it
>> will be negligible.
>
> For scalar types, I suspect it will be entirely unnoticeable, because
> they are not only small, but it's rare that this kind of "initialize
> by passing a reference" happens in the first place.

A lot of the scalar variables with actual bugs are missed by the gcc
warnings, because it never allocates a stack slot for examples
like

int f(int c)
{
        int i;
        if (c)
                return i; /* uninitialized return */
        asm volatile("" : "=r" (i)); /* gcc sees that 'i' escapes here */
        return 0;
}

int g(int c)
{
        int i;
        if (c)  /* gcc optimizes out the condition as nothing else sets i */
                i = 1;
        return i;
}

At -O2 optimization level, these fail to produce a warning, and
they won't ever leak stack data, but they are still undefined behavior
and don't do what the author intended.

Forcing gcc to allocate a stack slot and zero-initialize it should
find many bugs by adding valid warnings, but also add lots of
false positives as well as prevent important optimizations in other
places that are actually well-defined.

> For arrays, I agree. We very well may have arrays that we really want
> to do magic things about. But even then I'd rather have a "don't
> initialize this" flag for critical stuff that really *does* get
> initialized some other way. Then we can grep for those things and be
> more careful.
>
> If somebody has big arrays on the stack, that's often a problem
> anyway. It may be common in non-kernel code, but kernel code is very
> special.

I can think of a few cases that are important, this one is well-known:

int core_sys_select(int n, fd_set __user *inp, fd_set __user *outp,
                           fd_set __user *exp, struct timespec64 *end_time)
{
        ....
        /* Allocate small arguments on the stack to save memory and be faster */
        long stack_fds[SELECT_STACK_ALLOC/sizeof(long)];

Another case I came across very recently with a similar optimization is:

 int ib_process_cq_direct(struct ib_cq *cq, int budget)
 {
       struct ib_wc wcs[IB_POLL_BATCH];

In both cases, the stack variables are chosen to be just under
the CONFIG_FRAME_WARN limit to avoid a memory allocation
in the fast path. If we add an explicit zero initialization,
that optimization may turn out counterproductive, but a
"don't initialize" flag would be sufficient to deal with them
one at a time.

There is also the really scary code like:

#define SKCIPHER_REQUEST_ON_STACK(name, tfm) \
        char __##name##_desc[sizeof(struct skcipher_request) + \
                crypto_skcipher_reqsize(tfm)] CRYPTO_MINALIGN_ATTR; \
        struct skcipher_request *name = (void *)__##name##_desc

that implements an alloca() through a dynamic array for storing
a variable-sized structure on the stack. These are usually small,
but the size is driver specific and some can be surprisingly
big, e.g. struct ccp_aes_req_ctx, struct hifn_request_context, or
struct iproc_reqctx_s. If we can come up with a way to avoid those,
we could actually enable -Wstack-usage=${CONFIG_FRAME_WARN}
to warn for any functions with dynamic stack allocation.

      Arnd

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-06 20:42                     ` Arnd Bergmann
@ 2018-03-06 21:01                       ` Linus Torvalds
  2018-03-06 21:21                         ` Arnd Bergmann
  2018-03-06 21:36                         ` Steven Rostedt
  0 siblings, 2 replies; 59+ messages in thread
From: Linus Torvalds @ 2018-03-06 21:01 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Ard Biesheuvel, Daniel Micay, Ingo Molnar, Kees Cook,
	Dave Hansen, Alexander Popov, Kernel Hardening, PaX Team,
	Brad Spengler, Andy Lutomirski, Tycho Andersen, Laura Abbott,
	Mark Rutland, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, X86 ML, LKML

On Tue, Mar 6, 2018 at 12:42 PM, Arnd Bergmann <arnd@arndb.de> wrote:
>
> Forcing gcc to allocate a stack slot and zero-initialize it should
> find many bugs by adding valid warnings, but also add lots of
> false positives as well as prevent important optimizations in other
> places that are actually well-defined.

Oh, no, the "force gcc to allocate a stack slot" would be absolutely insane.

You should never do that. Anybody who does that should be shot.

But you don't have to force any stack allocation. You should just
initialize it to zero (*without* the stack allocation).

Then the optimization passes will just remove the initialization in
99.9% of all cases. Only very occasionally - when gcc cannot see it
being overwritten - would it remain. And those are exactly the cases
where you *want* it to remain.

Of course, it is possible that you can't just do that from a plugin.
But I can almost guarantee that it would be trivial to do as a gcc
switch if any gcc people wanted to do it. Just look at each scalar
auto variable, and if it doesn't have an initializer, just add one to
zero completely mindlessly.

As the crusaders said: "Kill them all and let the optimizer sort them out".

(Note: the "trivial to do" is about scalar values only. Non-scalar
values have more complexities, like struct padding issues etc).

>> If somebody has big arrays on the stack, that's often a problem
>> anyway. It may be common in non-kernel code, but kernel code is very
>> special.
>
> I can think of a few cases that are important, this one is well-known:
>
> int core_sys_select(int n, fd_set __user *inp, fd_set __user *outp,
>                            fd_set __user *exp, struct timespec64 *end_time)
> {
>         ....
>         /* Allocate small arguments on the stack to save memory and be faster */
>         long stack_fds[SELECT_STACK_ALLOC/sizeof(long)];

Oh, I can well imagine things like this.

But they will be a handful, they will be seen in profiles, and they'd
be easy to mark. And then you can validate the ones you mark, instead
of worrying about all the ones you don't even know about.

> There is also the really scary code like:
>
> #define SKCIPHER_REQUEST_ON_STACK(name, tfm) \
>         char __##name##_desc[sizeof(struct skcipher_request) + \
>                 crypto_skcipher_reqsize(tfm)] CRYPTO_MINALIGN_ATTR; \
>         struct skcipher_request *name = (void *)__##name##_desc

The crypto stuff does disgusting things. It used to do VLA's in
structures too, that has morphed into using these crazy XYZ_ON_STACK()
defines.

They eventually need to fix their own crap at some point. It shouldn't
be something we care about from a design standpoint.

                 Linus

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-06 21:01                       ` Linus Torvalds
@ 2018-03-06 21:21                         ` Arnd Bergmann
  2018-03-06 21:29                           ` Linus Torvalds
  2018-03-06 21:36                         ` Steven Rostedt
  1 sibling, 1 reply; 59+ messages in thread
From: Arnd Bergmann @ 2018-03-06 21:21 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ard Biesheuvel, Daniel Micay, Ingo Molnar, Kees Cook,
	Dave Hansen, Alexander Popov, Kernel Hardening, PaX Team,
	Brad Spengler, Andy Lutomirski, Tycho Andersen, Laura Abbott,
	Mark Rutland, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, X86 ML, LKML

On Tue, Mar 6, 2018 at 10:01 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Tue, Mar 6, 2018 at 12:42 PM, Arnd Bergmann <arnd@arndb.de> wrote:
>>
>> Forcing gcc to allocate a stack slot and zero-initialize it should
>> find many bugs by adding valid warnings, but also add lots of
>> false positives as well as prevent important optimizations in other
>> places that are actually well-defined.
>
> Oh, no, the "force gcc to allocate a stack slot" would be absolutely insane.
>
> You should never do that. Anybody who does that should be shot.
>
> But you don't have to force any stack allocation. You should just
> initialize it to zero (*without* the stack allocation).
>
> Then the optimization passes will just remove the initialization in
> 99.9% of all cases. Only very occasionally - when gcc cannot see it
> being overwritten - would it remain. And those are exactly the cases
> where you *want* it to remain.

Right, there are two separate problems: the missing warnings
and the actual uninitialized use. Allocating the stack slots would
address both but only at an enormous cost. Assigning a value
would still have a cost, as it would prevent certain other optimizations,
and it wouldn't help find the missing initializations,
but the cost would obviously be much lower.

      Arnd

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-06 21:21                         ` Arnd Bergmann
@ 2018-03-06 21:29                           ` Linus Torvalds
  2018-03-06 22:09                             ` Arnd Bergmann
  0 siblings, 1 reply; 59+ messages in thread
From: Linus Torvalds @ 2018-03-06 21:29 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Ard Biesheuvel, Daniel Micay, Ingo Molnar, Kees Cook,
	Dave Hansen, Alexander Popov, Kernel Hardening, PaX Team,
	Brad Spengler, Andy Lutomirski, Tycho Andersen, Laura Abbott,
	Mark Rutland, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, X86 ML, LKML

On Tue, Mar 6, 2018 at 1:21 PM, Arnd Bergmann <arnd@arndb.de> wrote:
>
> Right, there are two separate problems: the missing warnings
> and the actual uninitialized use. Allocating the stack slots would
> address both but only at an enormous cost. Assigning a value
> would still have a cost, as it would prevent certain other optimizations,
> and it wouldn't help find the missing initializations,
> but the cost would obviously be much lower.

What possible optimization would it prevent?

I cannot think of a single interesting optimization that would be
invalidated by a trivial initialization of 0 to a scalar.

Yes, it can obviously generate some extra code for the "pass pointer
to uninitialized scalar around to be initialized", where it can now
make that stack slot be initialized to zero - so it can generate extra
code. But disabling an optimization?

What did you have in mind?

                   Linus

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-06 21:01                       ` Linus Torvalds
  2018-03-06 21:21                         ` Arnd Bergmann
@ 2018-03-06 21:36                         ` Steven Rostedt
  2018-03-06 21:41                           ` Linus Torvalds
  2018-03-06 21:47                           ` Arnd Bergmann
  1 sibling, 2 replies; 59+ messages in thread
From: Steven Rostedt @ 2018-03-06 21:36 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Arnd Bergmann, Ard Biesheuvel, Daniel Micay, Ingo Molnar,
	Kees Cook, Dave Hansen, Alexander Popov, Kernel Hardening,
	PaX Team, Brad Spengler, Andy Lutomirski, Tycho Andersen,
	Laura Abbott, Mark Rutland, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Dominik Brodowski,
	Juergen Gross, Greg Kroah-Hartman, Dan Williams, Mathias Krause,
	Vikas Shivappa, Kyle Huey, Dmitry Safonov, Will Deacon, X86 ML,
	LKML

On Tue, 6 Mar 2018 13:01:20 -0800
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> Then the optimization passes will just remove the initialization in
> 99.9% of all cases. Only very occasionally - when gcc cannot see it
> being overwritten - would it remain. And those are exactly the cases
> where you *want* it to remain.

You mean have gcc fill in the variables that it thinks is used
uninitialized with zeros? As long as it still warns about it, because
that usually catches some real bugs where zeroing the variable doesn't
actually fix the bug.

I also tried the example Arnd posted with:


int g(int c)
{
        int i;
        if (c)  /* gcc optimizes out the condition as nothing else sets i */
                i = 1;
        return i;
}

And he's right. -O2 doesn't warn :-(  I think that it should.

-- Steve

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-06 21:36                         ` Steven Rostedt
@ 2018-03-06 21:41                           ` Linus Torvalds
  2018-03-06 21:47                             ` Linus Torvalds
  2018-03-06 21:47                           ` Arnd Bergmann
  1 sibling, 1 reply; 59+ messages in thread
From: Linus Torvalds @ 2018-03-06 21:41 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Arnd Bergmann, Ard Biesheuvel, Daniel Micay, Ingo Molnar,
	Kees Cook, Dave Hansen, Alexander Popov, Kernel Hardening,
	PaX Team, Brad Spengler, Andy Lutomirski, Tycho Andersen,
	Laura Abbott, Mark Rutland, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Dominik Brodowski,
	Juergen Gross, Greg Kroah-Hartman, Dan Williams, Mathias Krause,
	Vikas Shivappa, Kyle Huey, Dmitry Safonov, Will Deacon, X86 ML,
	LKML

On Tue, Mar 6, 2018 at 1:36 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> You mean have gcc fill in the variables that it thinks is used
> uninitialized with zeros?

No.

It should do it unconditionally.

Because "gcc thinks is used uninitialized" is simply not good enough.

The warning would remain for the case where you don't enable this
hardening feature, so it wouldn't go away.

The warning also wouldn't be *improved*. The warning is simply a
completely a separate thing, and as your example shows is *entirely*
independent of the "make all locals be initialized to zero".

                  Linus

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-06 21:41                           ` Linus Torvalds
@ 2018-03-06 21:47                             ` Linus Torvalds
  2018-03-06 22:29                               ` Steven Rostedt
  2018-03-12  8:22                               ` Ingo Molnar
  0 siblings, 2 replies; 59+ messages in thread
From: Linus Torvalds @ 2018-03-06 21:47 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Arnd Bergmann, Ard Biesheuvel, Daniel Micay, Ingo Molnar,
	Kees Cook, Dave Hansen, Alexander Popov, Kernel Hardening,
	PaX Team, Brad Spengler, Andy Lutomirski, Tycho Andersen,
	Laura Abbott, Mark Rutland, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Dominik Brodowski,
	Juergen Gross, Greg Kroah-Hartman, Dan Williams, Mathias Krause,
	Vikas Shivappa, Kyle Huey, Dmitry Safonov, Will Deacon, X86 ML,
	LKML

On Tue, Mar 6, 2018 at 1:41 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> The warning would remain for the case where you don't enable this
> hardening feature, so it wouldn't go away.

Side note: if in ten years we'd have a minimum gcc version that we
could  just unconditionally say "auto (scalars) initialize to zero",
then we'd just make that be the *semantics*, and the warning would
obviously simply not ever be an issue.

But that's no different from "static variables initialize to zero" and
nobody sane expects a warning from that. It's just what it is.

But that's at least ten years away. We tend to support disgustingly
old compiler versions.

               Linus

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-06 21:36                         ` Steven Rostedt
  2018-03-06 21:41                           ` Linus Torvalds
@ 2018-03-06 21:47                           ` Arnd Bergmann
  2018-03-06 22:19                             ` Linus Torvalds
  1 sibling, 1 reply; 59+ messages in thread
From: Arnd Bergmann @ 2018-03-06 21:47 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Linus Torvalds, Ard Biesheuvel, Daniel Micay, Ingo Molnar,
	Kees Cook, Dave Hansen, Alexander Popov, Kernel Hardening,
	PaX Team, Brad Spengler, Andy Lutomirski, Tycho Andersen,
	Laura Abbott, Mark Rutland, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Dominik Brodowski,
	Juergen Gross, Greg Kroah-Hartman, Dan Williams, Mathias Krause,
	Vikas Shivappa, Kyle Huey, Dmitry Safonov, Will Deacon, X86 ML,
	LKML

On Tue, Mar 6, 2018 at 10:36 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Tue, 6 Mar 2018 13:01:20 -0800
> Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
>> Then the optimization passes will just remove the initialization in
>> 99.9% of all cases. Only very occasionally - when gcc cannot see it
>> being overwritten - would it remain. And those are exactly the cases
>> where you *want* it to remain.
>
> You mean have gcc fill in the variables that it thinks is used
> uninitialized with zeros? As long as it still warns about it, because
> that usually catches some real bugs where zeroing the variable doesn't
> actually fix the bug.
>
> I also tried the example Arnd posted with:
>
>
> int g(int c)
> {
>         int i;
>         if (c)  /* gcc optimizes out the condition as nothing else sets i */
>                 i = 1;
>         return i;
> }
>
> And he's right. -O2 doesn't warn :-(  I think that it should.

See this bug that has been open since 2004 and the 31 bugs
that have been marked as duplicates since then:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=18501

I don't really understand it myself, but I do understand that
the gcc developers think this is a hard problem to solve.

      Arnd

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-06 21:29                           ` Linus Torvalds
@ 2018-03-06 22:09                             ` Arnd Bergmann
  2018-03-06 22:24                               ` Linus Torvalds
  0 siblings, 1 reply; 59+ messages in thread
From: Arnd Bergmann @ 2018-03-06 22:09 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ard Biesheuvel, Daniel Micay, Ingo Molnar, Kees Cook,
	Dave Hansen, Alexander Popov, Kernel Hardening, PaX Team,
	Brad Spengler, Andy Lutomirski, Tycho Andersen, Laura Abbott,
	Mark Rutland, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, X86 ML, LKML

On Tue, Mar 6, 2018 at 10:29 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Tue, Mar 6, 2018 at 1:21 PM, Arnd Bergmann <arnd@arndb.de> wrote:
>>
>> Right, there are two separate problems: the missing warnings
>> and the actual uninitialized use. Allocating the stack slots would
>> address both but only at an enormous cost. Assigning a value
>> would still have a cost, as it would prevent certain other optimizations,
>> and it wouldn't help find the missing initializations,
>> but the cost would obviously be much lower.
>
> What possible optimization would it prevent?
>
> I cannot think of a single interesting optimization that would be
> invalidated by a trivial initialization of 0 to a scalar.
>
> Yes, it can obviously generate some extra code for the "pass pointer
> to uninitialized scalar around to be initialized", where it can now
> make that stack slot be initialized to zero - so it can generate extra
> code. But disabling an optimization?
>
> What did you have in mind?

The type of optimization that causes my earlier example

int g(int c)
{
        int i;
        if (c)  /* gcc optimizes out the condition as nothing else sets i */
                i = 1;
        return i;
}

to be simplified to

int g(int c)
{
        return 1;
}

Obviously the specific example is a bug and we'd be better off
with the zero-initialization, but it was clearly an intentional
optimization. The bug report in [1]
suggests that the optimization step that causes the loss of
information here is "sparse conditional constant propagation"
as described in [2].

I have to admit that I understand nearly nothing of that
paper, but my memory from earlier discussions with
gcc folks is that it will merge code paths for uninitialized
variables with the normal case if there is exactly one
initialization, or more generally try to eliminate any code
path that leads to undefined behavior.
By forcing an initialization, I would imagine that this early
dead code elimination fails even for the case that a
later optimization step can prove that an uninitialized
variable use never occurs.

      Arnd

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=18501
[2] http://www.cs.wustl.edu/~cytron/531Pages/f11/Resources/Papers/cprop.pdf

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-06 21:47                           ` Arnd Bergmann
@ 2018-03-06 22:19                             ` Linus Torvalds
  0 siblings, 0 replies; 59+ messages in thread
From: Linus Torvalds @ 2018-03-06 22:19 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Steven Rostedt, Ard Biesheuvel, Daniel Micay, Ingo Molnar,
	Kees Cook, Dave Hansen, Alexander Popov, Kernel Hardening,
	PaX Team, Brad Spengler, Andy Lutomirski, Tycho Andersen,
	Laura Abbott, Mark Rutland, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Dominik Brodowski,
	Juergen Gross, Greg Kroah-Hartman, Dan Williams, Mathias Krause,
	Vikas Shivappa, Kyle Huey, Dmitry Safonov, Will Deacon, X86 ML,
	LKML

On Tue, Mar 6, 2018 at 1:47 PM, Arnd Bergmann <arnd@arndb.de> wrote:
>
> I don't really understand it myself, but I do understand that
> the gcc developers think this is a hard problem to solve.

Depending on the internal representation, things that look absolutely
trivially obvious to humans may not actually be all that trivial at
all.

In particular, the only sane modern IR for a compiler is based on SSA
(single static assignment), and in that form, the example code is
actually very obvious: the variable 'i' is assigned to exactly once,
and has the value 1, so *OBVIOUSLY* the value of 'i' is 1 at each use.

It's really so obvious that sparse does exactly the same. If you use
the sparse "test-linearize" program, you will see that test program
result in

  g:
  .L0:
      <entry-point>
      ret.32      $1

because it is very obvious to the optimizer that 'i' is 1.

In fact, to get anything else, you almost _have_ to initialize 'i' to
"UNDEFINED" in the compiler. At that point the SSA representation says
'i' has two sources, one undefined and one '1', and all the SSA
optimizations and simplifications just DTRT automatically.

But it literally involves adding that initialization.

This is actually one reason I think the C "automatic variables are
undefined" behavior is wrong. It made sense historically, but in an
SSA world, it actually ends up not even helping.

You might as well just initialize the damn things to zero, avoiding an
undefined case and just be done with it, and make that part of the
language definition. It would just be consistent with static variables
anyway, so it would actually clean up the language conceptually too.

Of course, within the confines of C _history_, that didn't make sense.
SSA became a thing two decades after C had been invented, and it took
even longer for it to become the de-facto standard IR for any sane
optimization.

Honestly, that whole "local variable might be used uninitialized" is
just a historical accident, and is not some fundamentally good
warning. It's _only_ a valid warning due to a language mis-feature.

                     Linus

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-06 22:09                             ` Arnd Bergmann
@ 2018-03-06 22:24                               ` Linus Torvalds
  0 siblings, 0 replies; 59+ messages in thread
From: Linus Torvalds @ 2018-03-06 22:24 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Ard Biesheuvel, Daniel Micay, Ingo Molnar, Kees Cook,
	Dave Hansen, Alexander Popov, Kernel Hardening, PaX Team,
	Brad Spengler, Andy Lutomirski, Tycho Andersen, Laura Abbott,
	Mark Rutland, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, X86 ML, LKML

On Tue, Mar 6, 2018 at 2:09 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Tue, Mar 6, 2018 at 10:29 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> Yes, it can obviously generate some extra code for the "pass pointer
>> to uninitialized scalar around to be initialized", where it can now
>> make that stack slot be initialized to zero - so it can generate extra
>> code. But disabling an optimization?
>>
>> What did you have in mind?
>
> The type of optimization that causes my earlier example

That's not a  valid "optimization". It doesn't make correct code run
faster. It just makes incorrect code result in incorrect results.

See my previous email about _why_ it does it, but the point remains:
the "optimization" is not actually an optimization. It's just "garbage
in, garbage out". Speed doesn't matter for garbage.

In fact, see my explanation for _why_ that optimization happens to
understand why adding an iniitializer doesn't make it go slower, but
simply makes ie well-defined and thus makes the optimization go away.

Really, only stupid compiler people think it's an "optimization" to
take advantage of undefined behavior. Sure, garbage code generation
can run faster, but that's not "optimization".

We want to very much avoid those kinds of people, and those kinds of
optimizations. It's very much why we disable strict aliasing etc -
because the whole optimization is purely based on taking advantage of
undefined behavior.

It's shit. Don't call it anything else. Getting rid of that
optimization is a *good* thing, not a bad thing.

                 Linus

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-06 21:47                             ` Linus Torvalds
@ 2018-03-06 22:29                               ` Steven Rostedt
  2018-03-06 22:41                                 ` Linus Torvalds
  2018-03-12  8:22                               ` Ingo Molnar
  1 sibling, 1 reply; 59+ messages in thread
From: Steven Rostedt @ 2018-03-06 22:29 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Arnd Bergmann, Ard Biesheuvel, Daniel Micay, Ingo Molnar,
	Kees Cook, Dave Hansen, Alexander Popov, Kernel Hardening,
	PaX Team, Brad Spengler, Andy Lutomirski, Tycho Andersen,
	Laura Abbott, Mark Rutland, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Dominik Brodowski,
	Juergen Gross, Greg Kroah-Hartman, Dan Williams, Mathias Krause,
	Vikas Shivappa, Kyle Huey, Dmitry Safonov, Will Deacon, X86 ML,
	LKML

On Tue, 6 Mar 2018 13:47:11 -0800
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> But that's no different from "static variables initialize to zero" and
> nobody sane expects a warning from that. It's just what it is.

Yeah but I would argue that static variables are different. They are
either used in multiple functions or are used for the same function
that maintains its value for multiple calls to that function. Either
way the semantics of a static variable is that it has to be initialized
to something, because you don't know when the first one sets it.

Local variables on the other hand are only in scope of one logical
function algorithm. I've done lots of stupid errors where I may
initialize a local variable in a loop and forget that the loop may
never execute. Sometimes the original value shouldn't be zero, although
most times it is. But that warning has saved me, especially when the
value isn't supposed to be zero.

-- Steve

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-06 22:29                               ` Steven Rostedt
@ 2018-03-06 22:41                                 ` Linus Torvalds
  2018-03-06 22:52                                   ` Steven Rostedt
  0 siblings, 1 reply; 59+ messages in thread
From: Linus Torvalds @ 2018-03-06 22:41 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Arnd Bergmann, Ard Biesheuvel, Daniel Micay, Ingo Molnar,
	Kees Cook, Dave Hansen, Alexander Popov, Kernel Hardening,
	PaX Team, Brad Spengler, Andy Lutomirski, Tycho Andersen,
	Laura Abbott, Mark Rutland, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Dominik Brodowski,
	Juergen Gross, Greg Kroah-Hartman, Dan Williams, Mathias Krause,
	Vikas Shivappa, Kyle Huey, Dmitry Safonov, Will Deacon, X86 ML,
	LKML

On Tue, Mar 6, 2018 at 2:29 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
>
> Yeah but I would argue that static variables are different.

Not really.

You're basically saying "I'd like my language to have intrinsic
undefined behavior, so that when I hit a bug that might trigger that
undefined behavior I get a warning":.

Do such bugs happen? Sure. But _other_ bugs happen all the time, and
the ONLY real reason you are so adamant about _that_ particular
warning, is that "undefined behavior" is such a serious problem.

Think about that for a second.

Make the serious problem go away, and the warning MAKES NO SENSE.

Do you want a compiler to warn you when you write code like this:

    double area(double radius)
    {
         return radius*2*pi;
    }

just because it's obviously buggy shit?

And do you *really* expect a compiler warning for that kind of
obviously buggy shit?

I bet you don't.

And if you don't, why do you expect a compiler warning for

  int g(int c)
  {
         int i = 0;
         if (c)
                 i = 1;
         return i;
  }

which is *literally* what your example would have been had just C been
specified to avoid unnecessary undefined behavior?

Seriously. Look at that example, and tell me that gcc should warn
about it. I can imagine the warning ("warning: function 'g()' is
stupidly written, just use !!c").

But nobody sane really would expect a compiler to warn about it. Once
a compiler is that smart, you wouldn't write code for it, you'd just
ask it to generate code from a description.

Guys, everybody agrees that C isn't a safe language.

Do you think that lack of safety is a _good_ thing?

Do you realize that most of the lack of safety is almost directly
about flexibility, simplicity, and good code generation?

But what if I told you that some of the lack of safety doesn't
actually add to flexibility, simplicity, _or_ good code generation?
Wouldn't you say "we don't want it to be unsafe" then?

I'm literally telling you that lack of variable initialization is
almost purely a bad thing. C would be a safer language, with less
undefined behavior, if it just made the initialization of automatic
variables be something you cannot avoid.

For scalars in particular, there is basically no downside. Aggregate
types really are much less black-and-white.

                 Linus

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-06 22:41                                 ` Linus Torvalds
@ 2018-03-06 22:52                                   ` Steven Rostedt
  2018-03-06 23:09                                     ` Linus Torvalds
  0 siblings, 1 reply; 59+ messages in thread
From: Steven Rostedt @ 2018-03-06 22:52 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Arnd Bergmann, Ard Biesheuvel, Daniel Micay, Ingo Molnar,
	Kees Cook, Dave Hansen, Alexander Popov, Kernel Hardening,
	PaX Team, Brad Spengler, Andy Lutomirski, Tycho Andersen,
	Laura Abbott, Mark Rutland, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Dominik Brodowski,
	Juergen Gross, Greg Kroah-Hartman, Dan Williams, Mathias Krause,
	Vikas Shivappa, Kyle Huey, Dmitry Safonov, Will Deacon, X86 ML,
	LKML

On Tue, 6 Mar 2018 14:41:55 -0800
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> I'm literally telling you that lack of variable initialization is
> almost purely a bad thing. C would be a safer language, with less
> undefined behavior, if it just made the initialization of automatic
> variables be something you cannot avoid.

I get your point. Basically you are saying that the language should
have forced all local variables to be initialized to zero in the spec,
and the compiler is free to optimize that initialization out if the
variable is always initialized first.

Just like it would optimize the below.

int g(int c)
{
	int i = 0;

	if (c < 10)
		i = 1;
	else
		i = 2;
	return i;
};

There's no reason to initialize i to zero because it will never return
zero. But what you are saying is to make that implicit by just
declaring 'i'.

Other languages do this, and you are right, I don't expect warnings to
appear in  those. I guess I've grown so accustomed to this "undefined
behavior" it's part of what I just expect.

-- Steve

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-06 22:52                                   ` Steven Rostedt
@ 2018-03-06 23:09                                     ` Linus Torvalds
  0 siblings, 0 replies; 59+ messages in thread
From: Linus Torvalds @ 2018-03-06 23:09 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Arnd Bergmann, Ard Biesheuvel, Daniel Micay, Ingo Molnar,
	Kees Cook, Dave Hansen, Alexander Popov, Kernel Hardening,
	PaX Team, Brad Spengler, Andy Lutomirski, Tycho Andersen,
	Laura Abbott, Mark Rutland, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Dominik Brodowski,
	Juergen Gross, Greg Kroah-Hartman, Dan Williams, Mathias Krause,
	Vikas Shivappa, Kyle Huey, Dmitry Safonov, Will Deacon, X86 ML,
	LKML

On Tue, Mar 6, 2018 at 2:52 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
>
> I get your point. Basically you are saying that the language should
> have forced all local variables to be initialized to zero in the spec,
> and the compiler is free to optimize that initialization out if the
> variable is always initialized first.

Exactly.

> Just like it would optimize the below.
>
> int g(int c)
> {
>         int i = 0;
>
>         if (c < 10)
>                 i = 1;
>         else
>                 i = 2;
>         return i;
> };
>
> There's no reason to initialize i to zero because it will never return
> zero. But what you are saying is to make that implicit by just
> declaring 'i'.

Yes.

And at the same time, it is certainly true that a compiler cannot
*always* remove the unnecessary initialization.

But with any half-way modern compiler, all the _normal_ cases would be
no-brainers, and the case where the compiler couldn't do so really is
code that almost certainly wants that initialization anyway, because a
_human_ generally wouldn't see that it's initialized either.

The obvious counter-example is literally

> int g(int c)
> {
>         int i = 0;
>
>         initialize_variable(&i);
>
>         if (c < 10)

where a *human* can tell that "hey, that initialization to zero is
pointless, because 'i' is clearly being initialized by that
'initialize_variable()' function call".

But the compiler often cannot see that "obvious" initialization, and
would have to spend the extra instruction to actually do that zero
initialization.

So I'm not saying that it would always be entirely free to just have
the safe default.

But in the context of the kernel, I think that we can probably agree
that "oops, we've had exactly the kind of bug where a function
_didn't_ end up initializing the variable every time even though to a
human obviously thought it did" actually has happened.

Now, as mentioned, aggregate types are different.

They are _less_ different for the kernel than they are for user
programs (because we have to be careful about aggregate types on the
stack anyway just for stack size reasons), but they do have more
issues with the implicit initializers.

For aggregate types, initializers are obviously more expensive to
begin with, and the "initialize by passing a reference to an external
function" is much more common too. So there's a double hit.

At the same time, aggregate types are obviously where we tend to have
most problems, and I really think we should strive to avoid even
medium-sized aggregate types in the kernel anyway.

Which is why I'm much more willing to take a hit on those kinds of
things (that I think should be rare), than add things like "let's just
clear the stack after every system call" kinds of things.

But I do agree that for aggregate types, we almost certainly do want
to have some way to flag "this is explicitly not initialized".

Instead, right now, we're in the reverse situation, where we have to
add explicit initializers.

I think we'd be in a better situation if the *default* semantics was
the safe and well-defined behavior, and we'd have to do extra things
to get the undefined uninitialized behavior for when we know better,
and we really really care.

              Linus

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-06 21:47                             ` Linus Torvalds
  2018-03-06 22:29                               ` Steven Rostedt
@ 2018-03-12  8:22                               ` Ingo Molnar
  2018-03-12  9:00                                 ` Ard Biesheuvel
  1 sibling, 1 reply; 59+ messages in thread
From: Ingo Molnar @ 2018-03-12  8:22 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Steven Rostedt, Arnd Bergmann, Ard Biesheuvel, Daniel Micay,
	Kees Cook, Dave Hansen, Alexander Popov, Kernel Hardening,
	PaX Team, Brad Spengler, Andy Lutomirski, Tycho Andersen,
	Laura Abbott, Mark Rutland, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Dominik Brodowski,
	Juergen Gross, Greg Kroah-Hartman, Dan Williams, Mathias Krause,
	Vikas Shivappa, Kyle Huey, Dmitry Safonov, Will Deacon, X86 ML,
	LKML


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Tue, Mar 6, 2018 at 1:41 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > The warning would remain for the case where you don't enable this
> > hardening feature, so it wouldn't go away.
> 
> Side note: if in ten years we'd have a minimum gcc version that we
> could  just unconditionally say "auto (scalars) initialize to zero",
> then we'd just make that be the *semantics*, and the warning would
> obviously simply not ever be an issue.

Btw., I'd suggest we initialize aggregate types to zero as well, and then work 
from there by marking exceptions via attributes.

>From what I've seen over 90% of 'tricky' initialization sequences either don't 
matter to performance, or are unnecessarily complicated.

I.e. let's eliminate VLAs and let's also make the object initialization aspect of 
the C language reliably and broadly safe by default (via a GCC plugin) with no 
exceptions, and allow an opt-in mechanism for more fragile (but faster if coded 
correctly) constructs.

Is it possible to implement this "safe automatic variable initialization" language 
feature via a GCC plugin robustly, while still keeping code generation sane? (i.e. 
no forced allocation of stack slots, etc.) It should be a superset of 
CONFIG_GCC_PLUGIN_STRUCTLEAK=y.

Plugin support is present in GCC version 4.5 and higher, correct? So if such a 
plugin is possible we could raise the minimum GCC version to support it 
unconditionally.

I suspect a fair chunk of all kernel CVEs would go away if we fixed the C language 
this way.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-12  8:22                               ` Ingo Molnar
@ 2018-03-12  9:00                                 ` Ard Biesheuvel
  2018-03-12  9:21                                   ` Ingo Molnar
  0 siblings, 1 reply; 59+ messages in thread
From: Ard Biesheuvel @ 2018-03-12  9:00 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Steven Rostedt, Arnd Bergmann, Daniel Micay,
	Kees Cook, Dave Hansen, Alexander Popov, Kernel Hardening,
	PaX Team, Brad Spengler, Andy Lutomirski, Tycho Andersen,
	Laura Abbott, Mark Rutland, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Dominik Brodowski,
	Juergen Gross, Greg Kroah-Hartman, Dan Williams, Mathias Krause,
	Vikas Shivappa, Kyle Huey, Dmitry Safonov, Will Deacon, X86 ML,
	LKML

On 12 March 2018 at 08:22, Ingo Molnar <mingo@kernel.org> wrote:
>
> * Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
>> On Tue, Mar 6, 2018 at 1:41 PM, Linus Torvalds
>> <torvalds@linux-foundation.org> wrote:
>> >
>> > The warning would remain for the case where you don't enable this
>> > hardening feature, so it wouldn't go away.
>>
>> Side note: if in ten years we'd have a minimum gcc version that we
>> could  just unconditionally say "auto (scalars) initialize to zero",
>> then we'd just make that be the *semantics*, and the warning would
>> obviously simply not ever be an issue.
>
> Btw., I'd suggest we initialize aggregate types to zero as well, and then work
> from there by marking exceptions via attributes.
>

I'd argue that we need to move to struct assignment for constructors,
similar to how checkpatch recommends it over memcpy()/memset().

That way, the compiler can tell that such a variable is being
assigned, and so it can warn in the usual way when it notices a code
path that does not involve an initialization. At the same time, it
will warn about not returning a value if there is a code path in the
constructor that results in no initialization being performed.

It should also help the compiler optimize by keeping such variables
entirely in registers if the address is never taken otherwise.

> From what I've seen over 90% of 'tricky' initialization sequences either don't
> matter to performance, or are unnecessarily complicated.
>
> I.e. let's eliminate VLAs and let's also make the object initialization aspect of
> the C language reliably and broadly safe by default (via a GCC plugin) with no
> exceptions, and allow an opt-in mechanism for more fragile (but faster if coded
> correctly) constructs.
>
> Is it possible to implement this "safe automatic variable initialization" language
> feature via a GCC plugin robustly, while still keeping code generation sane? (i.e.
> no forced allocation of stack slots, etc.) It should be a superset of
> CONFIG_GCC_PLUGIN_STRUCTLEAK=y.
>

I think that should be feasible, yes.

It would be worth trying whether the current code can be simplified,
though: it currently takes care not to add such an initialization if
it can already spot one, but I wonder whether just blindly adding one
at the start and letting the compiler optimize it away again is safer.

> Plugin support is present in GCC version 4.5 and higher, correct? So if such a
> plugin is possible we could raise the minimum GCC version to support it
> unconditionally.
>

I think that would be reasonable, yes, but we should check across
architectures as well.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter()
  2018-03-12  9:00                                 ` Ard Biesheuvel
@ 2018-03-12  9:21                                   ` Ingo Molnar
  0 siblings, 0 replies; 59+ messages in thread
From: Ingo Molnar @ 2018-03-12  9:21 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Linus Torvalds, Steven Rostedt, Arnd Bergmann, Daniel Micay,
	Kees Cook, Dave Hansen, Alexander Popov, Kernel Hardening,
	PaX Team, Brad Spengler, Andy Lutomirski, Tycho Andersen,
	Laura Abbott, Mark Rutland, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Peter Zijlstra,
	Dmitry V . Levin, Emese Revfy, Jonathan Corbet, Andrey Ryabinin,
	Kirill A . Shutemov, Thomas Garnier, Andrew Morton,
	Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Dominik Brodowski,
	Juergen Gross, Greg Kroah-Hartman, Dan Williams, Mathias Krause,
	Vikas Shivappa, Kyle Huey, Dmitry Safonov, Will Deacon, X86 ML,
	LKML


* Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:

> > Is it possible to implement this "safe automatic variable initialization" language
> > feature via a GCC plugin robustly, while still keeping code generation sane? (i.e.
> > no forced allocation of stack slots, etc.) It should be a superset of
> > CONFIG_GCC_PLUGIN_STRUCTLEAK=y.
> 
> I think that should be feasible, yes.
> 
> It would be worth trying whether the current code can be simplified, though: it 
> currently takes care not to add such an initialization if it can already spot 
> one, but I wonder whether just blindly adding one at the start and letting the 
> compiler optimize it away again is safer.

Absolutely - followed by:

 - a good look at the resulting code generation with a reasonably uptodate GCC

 - a look at the resulting code with older GCCs, to make sure there's no 
   pathological code generation

... because IMHO in the end it is the practical effects that will make (or break) 
any such attempt.

( BTW., initializing all automatic variables might even speed up certain
  optimization passes, FWIIW. )

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 2/7] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
  2018-03-05 20:25       ` Peter Zijlstra
  2018-03-05 21:21         ` Alexander Popov
@ 2018-03-21 11:04         ` Alexander Popov
  2018-03-21 15:33           ` Dave Hansen
  1 sibling, 1 reply; 59+ messages in thread
From: Alexander Popov @ 2018-03-21 11:04 UTC (permalink / raw)
  To: Peter Zijlstra, Laura Abbott, Dave Hansen, Linus Torvalds,
	Kees Cook, Andy Lutomirski
  Cc: PaX Team, Brad Spengler, Ingo Molnar, Tycho Andersen,
	Mark Rutland, Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Dmitry V . Levin, Emese Revfy,
	Jonathan Corbet, Andrey Ryabinin, Kirill A . Shutemov,
	Thomas Garnier, Andrew Morton, Alexei Starovoitov, Josef Bacik,
	Masami Hiramatsu, Nicholas Piggin, Al Viro, David S . Miller,
	Ding Tianhong, David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, x86, linux-kernel,
	kernel-hardening

On 05.03.2018 23:25, Peter Zijlstra wrote:
> On Mon, Mar 05, 2018 at 11:43:19AM -0800, Laura Abbott wrote:
>> On 03/05/2018 08:41 AM, Dave Hansen wrote:
>>> On 03/03/2018 12:00 PM, Alexander Popov wrote:
>>>>   Documentation/x86/x86_64/mm.txt  |   2 +
>>>>   arch/Kconfig                     |  27 ++++++++++
>>>>   arch/x86/Kconfig                 |   1 +
>>>>   arch/x86/entry/entry_32.S        |  88 +++++++++++++++++++++++++++++++
>>>>   arch/x86/entry/entry_64.S        | 108 +++++++++++++++++++++++++++++++++++++++
>>>>   arch/x86/entry/entry_64_compat.S |  11 ++++
>>>
>>> This is a *lot* of assembly.  I wonder if you tried at all to get more
>>> of this into C or whether you just inherited the assembly from the
>>> original code?
>>>
>>
>> This came up previously http://www.openwall.com/lists/kernel-hardening/2017/10/23/5
>> there were concerns about trusting C to do the right thing as well as
>> speed.
> 
> And therefore the answer to this obvious question should've been part of
> the Changelog :-)
> 
> Dave is last in a long line of people asking this same question.

Hello! I've decided to share the details (and ask for advice) regardless of the
destiny of this patch series.

I've rewritten the assembly part in C, please see the code below. That is
erase_kstack() function, which is called at the end of syscall just before
returning to the userspace.

The generated asm doesn't look nice (and might be somewhat slower), but I don't
care now.

The main obstacle:
erase_kstack() must save and restore any modified registers, because it is
called from the trampoline stack (introduced by Andy Lutomirski), when all
registers except RDI are live.

Laura had a similar issue with C code on ARM:
http://www.openwall.com/lists/kernel-hardening/2017/10/10/3

I've solved that with no_caller_saved_registers attribute, which makes all
registers callee-saved. But that attribute was introduced only in gcc-7.

Does kernel have a solution for similar issues?
Thanks!

-------- >8 --------

#include <linux/bug.h>
#include <linux/sched.h>
#include <asm/current.h>
#include <asm/linkage.h>
#include <asm/processor.h>

/* This function must save and restore any modified registers */
__attribute__ ((no_caller_saved_registers)) asmlinkage void erase_kstack(void)
{
	register unsigned long p = current->thread.lowest_stack;
	register unsigned long boundary = p & ~(THREAD_SIZE - 1);
	unsigned long poison = 0;
	unsigned long check_depth = STACKLEAK_POISON_CHECK_DEPTH /
						sizeof(unsigned long);

	/*
	 * Two qwords at the bottom of the thread stack are reserved and
	 * should not be poisoned (see CONFIG_SCHED_STACK_END_CHECK).
	 */
	boundary += 2 * sizeof(unsigned long);

	/*
	 * Let's search for the poison value in the stack.
	 * Start from the lowest_stack and go to the bottom.
	 */
	while (p >= boundary && poison <= check_depth) {
		if (*(unsigned long *)p == STACKLEAK_POISON)
			poison++;
		else
			poison = 0;

		p -= sizeof(unsigned long);
	}

#ifdef CONFIG_STACKLEAK_METRICS
	current->thread.prev_lowest_stack = p;
#endif

	/*
	 * So let's write the poison value to the kernel stack. Start from
	 * the address in p and move up till the new boundary.
	 */
	if (on_thread_stack())
		boundary = current_stack_pointer;
	else
		boundary = current_top_of_stack();

	BUG_ON(boundary - p >= THREAD_SIZE);

	while (p < boundary) {
		*(unsigned long *)p = STACKLEAK_POISON;
		p += sizeof(unsigned long);
	}

	/* Reset the lowest_stack value for the next syscall */
	current->thread.lowest_stack = current_top_of_stack() - 256;
}

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 2/7] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
  2018-03-21 11:04         ` Alexander Popov
@ 2018-03-21 15:33           ` Dave Hansen
  2018-03-22 20:56             ` Alexander Popov
  0 siblings, 1 reply; 59+ messages in thread
From: Dave Hansen @ 2018-03-21 15:33 UTC (permalink / raw)
  To: alex.popov, Peter Zijlstra, Laura Abbott, Linus Torvalds,
	Kees Cook, Andy Lutomirski
  Cc: PaX Team, Brad Spengler, Ingo Molnar, Tycho Andersen,
	Mark Rutland, Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Dmitry V . Levin, Emese Revfy,
	Jonathan Corbet, Andrey Ryabinin, Kirill A . Shutemov,
	Thomas Garnier, Andrew Morton, Alexei Starovoitov, Josef Bacik,
	Masami Hiramatsu, Nicholas Piggin, Al Viro, David S . Miller,
	Ding Tianhong, David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, x86, linux-kernel,
	kernel-hardening

On 03/21/2018 04:04 AM, Alexander Popov wrote:
> The main obstacle:
> erase_kstack() must save and restore any modified registers, because it is
> called from the trampoline stack (introduced by Andy Lutomirski), when all
> registers except RDI are live.

Wow, cool, thanks for doing this!

PTI might also cause you some problems here because it probably won't
map your function.  Did you have to put it in one of the sections that
gets mapped by the user page tables?

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 2/7] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
  2018-03-21 15:33           ` Dave Hansen
@ 2018-03-22 20:56             ` Alexander Popov
  2018-03-26 17:32               ` Kees Cook
  0 siblings, 1 reply; 59+ messages in thread
From: Alexander Popov @ 2018-03-22 20:56 UTC (permalink / raw)
  To: Dave Hansen, Peter Zijlstra, Laura Abbott, Linus Torvalds,
	Kees Cook, Andy Lutomirski
  Cc: PaX Team, Brad Spengler, Ingo Molnar, Tycho Andersen,
	Mark Rutland, Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Dmitry V . Levin, Emese Revfy,
	Jonathan Corbet, Andrey Ryabinin, Kirill A . Shutemov,
	Thomas Garnier, Andrew Morton, Alexei Starovoitov, Josef Bacik,
	Masami Hiramatsu, Nicholas Piggin, Al Viro, David S . Miller,
	Ding Tianhong, David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, x86, linux-kernel,
	kernel-hardening

On 21.03.2018 18:33, Dave Hansen wrote:
> On 03/21/2018 04:04 AM, Alexander Popov wrote:
>> The main obstacle:
>> erase_kstack() must save and restore any modified registers, because it is
>> called from the trampoline stack (introduced by Andy Lutomirski), when all
>> registers except RDI are live.
> 
> Wow, cool, thanks for doing this!
> 
> PTI might also cause you some problems here because it probably won't
> map your function.  Did you have to put it in one of the sections that
> gets mapped by the user page tables?

No, I didn't have to do that: erase_kstack() works fine, it is called just
before SWITCH_TO_USER_CR3_STACK.

There is also a way not to offend KASAN. erase_kstack() C code can be put in a
separate source file and compiled with "KASAN_SANITIZE_erase.o := n".

So, as I wrote, the only critical drawback of the C implementation is that it
needs no_caller_saved_registers attribute, which is provided by gcc since version 7.

Can you recommend any solution?


By the way, during my work on STACKLEAK, I've found one case when we get to the
userspace directly from the thread stack. Please see sysret32_from_system_call
in entry_64_compat.S. I checked that.

IMO it seems odd, can the adversary use that to bypass PTI?

Best regards,
Alexander

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 2/7] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
  2018-03-22 20:56             ` Alexander Popov
@ 2018-03-26 17:32               ` Kees Cook
  2018-03-26 17:43                 ` Andy Lutomirski
  0 siblings, 1 reply; 59+ messages in thread
From: Kees Cook @ 2018-03-26 17:32 UTC (permalink / raw)
  To: alex.popov, Dave Hansen
  Cc: Peter Zijlstra, Laura Abbott, Linus Torvalds, Andy Lutomirski,
	PaX Team, Brad Spengler, Ingo Molnar, Tycho Andersen,
	Mark Rutland, Ard Biesheuvel, Borislav Petkov, Richard Sandiford,
	Thomas Gleixner, H . Peter Anvin, Dmitry V . Levin, Emese Revfy,
	Jonathan Corbet, Andrey Ryabinin, Kirill A . Shutemov,
	Thomas Garnier, Andrew Morton, Alexei Starovoitov, Josef Bacik,
	Masami Hiramatsu, Nicholas Piggin, Al Viro, David S . Miller,
	Ding Tianhong, David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, x86, linux-kernel,
	kernel-hardening

On Thu, Mar 22, 2018 at 1:56 PM, Alexander Popov <alex.popov@linux.com> wrote:
> By the way, during my work on STACKLEAK, I've found one case when we get to the
> userspace directly from the thread stack. Please see sysret32_from_system_call
> in entry_64_compat.S. I checked that.
>
> IMO it seems odd, can the adversary use that to bypass PTI?

If it was missing the page table swap, shouldn't this mean that the
missing NX bit would immediately crash userspace?

-Kees

-- 
Kees Cook
Pixel Security

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH RFC v9 2/7] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls
  2018-03-26 17:32               ` Kees Cook
@ 2018-03-26 17:43                 ` Andy Lutomirski
  0 siblings, 0 replies; 59+ messages in thread
From: Andy Lutomirski @ 2018-03-26 17:43 UTC (permalink / raw)
  To: Kees Cook
  Cc: Alexander Popov, Dave Hansen, Peter Zijlstra, Laura Abbott,
	Linus Torvalds, Andy Lutomirski, PaX Team, Brad Spengler,
	Ingo Molnar, Tycho Andersen, Mark Rutland, Ard Biesheuvel,
	Borislav Petkov, Richard Sandiford, Thomas Gleixner,
	H . Peter Anvin, Dmitry V . Levin, Emese Revfy, Jonathan Corbet,
	Andrey Ryabinin, Kirill A . Shutemov, Thomas Garnier,
	Andrew Morton, Alexei Starovoitov, Josef Bacik, Masami Hiramatsu,
	Nicholas Piggin, Al Viro, David S . Miller, Ding Tianhong,
	David Woodhouse, Josh Poimboeuf, Steven Rostedt,
	Dominik Brodowski, Juergen Gross, Greg Kroah-Hartman,
	Dan Williams, Mathias Krause, Vikas Shivappa, Kyle Huey,
	Dmitry Safonov, Will Deacon, Arnd Bergmann, X86 ML, LKML,
	kernel-hardening

On Mon, Mar 26, 2018 at 5:32 PM, Kees Cook <keescook@chromium.org> wrote:
> On Thu, Mar 22, 2018 at 1:56 PM, Alexander Popov <alex.popov@linux.com> wrote:
>> By the way, during my work on STACKLEAK, I've found one case when we get to the
>> userspace directly from the thread stack. Please see sysret32_from_system_call
>> in entry_64_compat.S. I checked that.
>>
>> IMO it seems odd, can the adversary use that to bypass PTI?
>
> If it was missing the page table swap, shouldn't this mean that the
> missing NX bit would immediately crash userspace?

sysret32_from_system_call does;

    SWITCH_TO_USER_CR3_NOSTACK scratch_reg=%r8 scratch_reg2=%r9

^ permalink raw reply	[flat|nested] 59+ messages in thread

end of thread, other threads:[~2018-03-26 17:43 UTC | newest]

Thread overview: 59+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-03 20:00 [PATCH RFC v9 0/7] Introduce the STACKLEAK feature and a test for it Alexander Popov
2018-03-03 20:00 ` [PATCH RFC v9 1/7] gcc-plugins: Clean up the cgraph_create_edge* macros Alexander Popov
2018-03-03 20:00 ` [PATCH RFC v9 2/7] x86/entry: Add STACKLEAK erasing the kernel stack at the end of syscalls Alexander Popov
2018-03-05 16:41   ` Dave Hansen
2018-03-05 19:43     ` Laura Abbott
2018-03-05 19:50       ` Dave Hansen
2018-03-05 20:25       ` Peter Zijlstra
2018-03-05 21:21         ` Alexander Popov
2018-03-05 21:36           ` Kees Cook
2018-03-21 11:04         ` Alexander Popov
2018-03-21 15:33           ` Dave Hansen
2018-03-22 20:56             ` Alexander Popov
2018-03-26 17:32               ` Kees Cook
2018-03-26 17:43                 ` Andy Lutomirski
2018-03-03 20:00 ` [PATCH RFC v9 3/7] gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack Alexander Popov
2018-03-03 20:00 ` [PATCH RFC v9 4/7] x86/entry: Erase kernel stack in syscall_trace_enter() Alexander Popov
2018-03-05 19:40   ` Dave Hansen
2018-03-05 20:06     ` Kees Cook
2018-03-05 20:15       ` Linus Torvalds
2018-03-05 21:02         ` Alexander Popov
2018-03-05 21:02         ` Kees Cook
2018-03-05 21:40           ` Linus Torvalds
2018-03-05 22:07             ` Linus Torvalds
2018-03-06  0:56             ` Kees Cook
2018-03-06  4:30               ` Linus Torvalds
2018-03-06 17:58                 ` Andy Lutomirski
2018-03-06  7:56               ` [OLD PATCH] net: recvmsg: Unconditionally zero struct sockaddr_storage " Ingo Molnar
2018-03-06  8:08           ` Ingo Molnar
2018-03-06 15:16             ` Daniel Micay
2018-03-06 15:28               ` Daniel Micay
2018-03-06 18:56               ` Linus Torvalds
2018-03-06 19:07                 ` Peter Zijlstra
2018-03-06 19:07                 ` Ard Biesheuvel
2018-03-06 19:16                   ` Linus Torvalds
2018-03-06 20:42                     ` Arnd Bergmann
2018-03-06 21:01                       ` Linus Torvalds
2018-03-06 21:21                         ` Arnd Bergmann
2018-03-06 21:29                           ` Linus Torvalds
2018-03-06 22:09                             ` Arnd Bergmann
2018-03-06 22:24                               ` Linus Torvalds
2018-03-06 21:36                         ` Steven Rostedt
2018-03-06 21:41                           ` Linus Torvalds
2018-03-06 21:47                             ` Linus Torvalds
2018-03-06 22:29                               ` Steven Rostedt
2018-03-06 22:41                                 ` Linus Torvalds
2018-03-06 22:52                                   ` Steven Rostedt
2018-03-06 23:09                                     ` Linus Torvalds
2018-03-12  8:22                               ` Ingo Molnar
2018-03-12  9:00                                 ` Ard Biesheuvel
2018-03-12  9:21                                   ` Ingo Molnar
2018-03-06 21:47                           ` Arnd Bergmann
2018-03-06 22:19                             ` Linus Torvalds
2018-03-05 20:26       ` Peter Zijlstra
2018-03-03 20:00 ` [PATCH RFC v9 5/7] lkdtm: Add a test for STACKLEAK Alexander Popov
2018-03-03 20:00 ` [PATCH RFC v9 6/7] fs/proc: Show STACKLEAK metrics in the /proc file system Alexander Popov
2018-03-03 20:00 ` [PATCH RFC v9 7/7] doc: self-protection: Add information about STACKLEAK feature Alexander Popov
2018-03-05 19:34 ` [PATCH RFC v9 0/7] Introduce the STACKLEAK feature and a test for it Kees Cook
2018-03-05 19:42   ` Dave Hansen
2018-03-05 20:02     ` Kees Cook

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).