linux-kernel.vger.kernel.org archive mirror
* [PATCH 1/2] kernel/fork: Add support for stack-end guard page
@ 2019-07-19 13:28 Marco Elver
  2019-07-19 13:28 ` [PATCH 2/2] lib/test_kasan: Add stack overflow test Marco Elver
  2019-07-23 16:41 ` [PATCH 1/2] kernel/fork: Add support for stack-end guard page Mark Rutland
  0 siblings, 2 replies; 12+ messages in thread
From: Marco Elver @ 2019-07-19 13:28 UTC (permalink / raw)
  To: elver
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H. Peter Anvin, Andrey Ryabinin, Alexander Potapenko,
	Dmitry Vyukov, Andrey Konovalov, Mark Rutland, Peter Zijlstra,
	x86, kasan-dev

Enabling STACK_GUARD_PAGE helps catch kernel stack overflows immediately
rather than letting them cause difficult-to-diagnose corruption. Note that,
unlike virtually-mapped kernel stacks, this effectively wastes an entire page
of memory; however, it may provide extra protection in configurations that
cannot use virtually-mapped kernel stacks.

The motivation for this patch is that KASAN cannot use virtually-mapped kernel
stacks to detect stack overflows. An alternative would be implementing support
for vmapped stacks in KASAN, but this would add significant extra complexity.
While the stack-end guard page approach here wastes a page, it is significantly
simpler than the alternative. We assume that the extra cost of a page can be
justified in the cases where STACK_GUARD_PAGE would be enabled.

Note that in an earlier prototype of this patch, we used
'set_memory_{ro,rw}' functions, which flush the TLBs. This, however,
turned out to be unacceptably expensive, especially when run with
fuzzers such as Syzkaller, as the kernel would encounter frequent RCU
timeouts. The current approach of not flushing the TLB is therefore
best-effort, but works in the test cases considered -- any comments on
better alternatives or improvements are welcome.
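
For reference, a minimal sketch of the earlier approach (hypothetical code,
not the actual prototype; protect_stack_guard_page() is an invented helper,
and set_memory_ro() write-protects the guard page in the direct map but
flushes TLBs on all CPUs on every stack allocation):

static void protect_stack_guard_page(struct page *page)
{
	if (IS_ENABLED(CONFIG_STACK_GUARD_PAGE))
		set_memory_ro((unsigned long)page_address(page), 1);
}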

Signed-off-by: Marco Elver <elver@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: kasan-dev@googlegroups.com
---
 arch/Kconfig                         | 15 +++++++++++++++
 arch/x86/include/asm/page_64_types.h |  8 +++++++-
 include/linux/sched/task_stack.h     | 11 +++++++++--
 kernel/fork.c                        | 22 +++++++++++++++++++++-
 4 files changed, 52 insertions(+), 4 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index e8d19c3cb91f..cca3258fff1f 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -935,6 +935,21 @@ config LOCK_EVENT_COUNTS
 	  the chance of application behavior change because of timing
 	  differences. The counts are reported via debugfs.
 
+config STACK_GUARD_PAGE
+	default n
+	bool "Use stack-end page as guard page"
+	depends on !VMAP_STACK && ARCH_HAS_SET_DIRECT_MAP && THREAD_INFO_IN_TASK && !STACK_GROWSUP
+	help
+	  Enable this if you want to use the stack-end page as a guard page.
+	  This causes kernel stack overflows to be caught immediately rather
+	  than causing difficult-to-diagnose corruption. Note that, unlike
+	  virtually-mapped kernel stacks, this will effectively waste an entire
+	  page of memory; however, this feature may provide extra protection in
+	  cases that cannot use virtually-mapped kernel stacks, at the cost of
+	  a page. Note that, this option does not implicitly increase the
+	  default stack size. The main use-case is for KASAN to avoid reporting
+	  misleading bugs due to stack overflow.
+
 source "kernel/gcov/Kconfig"
 
 source "scripts/gcc-plugins/Kconfig"
diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index 288b065955b7..b218b5713c02 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -12,8 +12,14 @@
 #define KASAN_STACK_ORDER 0
 #endif
 
+#ifdef CONFIG_STACK_GUARD_PAGE
+#define STACK_GUARD_SIZE PAGE_SIZE
+#else
+#define STACK_GUARD_SIZE 0
+#endif
+
 #define THREAD_SIZE_ORDER	(2 + KASAN_STACK_ORDER)
-#define THREAD_SIZE  (PAGE_SIZE << THREAD_SIZE_ORDER)
+#define THREAD_SIZE  ((PAGE_SIZE << THREAD_SIZE_ORDER) - STACK_GUARD_SIZE)
 
 #define EXCEPTION_STACK_ORDER (0 + KASAN_STACK_ORDER)
 #define EXCEPTION_STKSZ (PAGE_SIZE << EXCEPTION_STACK_ORDER)
diff --git a/include/linux/sched/task_stack.h b/include/linux/sched/task_stack.h
index 2413427e439c..7ee86ad0a282 100644
--- a/include/linux/sched/task_stack.h
+++ b/include/linux/sched/task_stack.h
@@ -11,6 +11,13 @@
 
 #ifdef CONFIG_THREAD_INFO_IN_TASK
 
+#ifndef STACK_GUARD_SIZE
+#ifdef CONFIG_STACK_GUARD_PAGE
+#error "Architecture not compatible with STACK_GUARD_PAGE"
+#endif
+#define STACK_GUARD_SIZE 0
+#endif
+
 /*
  * When accessing the stack of a non-current task that might exit, use
  * try_get_task_stack() instead.  task_stack_page will return a pointer
@@ -18,14 +25,14 @@
  */
 static inline void *task_stack_page(const struct task_struct *task)
 {
-	return task->stack;
+	return task->stack + STACK_GUARD_SIZE;
 }
 
 #define setup_thread_stack(new,old)	do { } while(0)
 
 static inline unsigned long *end_of_stack(const struct task_struct *task)
 {
-	return task->stack;
+	return task->stack + STACK_GUARD_SIZE;
 }
 
 #elif !defined(__HAVE_THREAD_FUNCTIONS)
diff --git a/kernel/fork.c b/kernel/fork.c
index d8ae0f1b4148..22033b03f7da 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -94,6 +94,7 @@
 #include <linux/livepatch.h>
 #include <linux/thread_info.h>
 #include <linux/stackleak.h>
+#include <linux/set_memory.h>
 
 #include <asm/pgtable.h>
 #include <asm/pgalloc.h>
@@ -249,6 +250,14 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node)
 					     THREAD_SIZE_ORDER);
 
 	if (likely(page)) {
+		if (IS_ENABLED(CONFIG_STACK_GUARD_PAGE)) {
+			/*
+			 * Best effort: do not flush TLB to avoid the overhead
+			 * of flushing all TLBs.
+			 */
+			set_direct_map_invalid_noflush(page);
+		}
+
 		tsk->stack = page_address(page);
 		return tsk->stack;
 	}
@@ -258,6 +267,7 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node)
 
 static inline void free_thread_stack(struct task_struct *tsk)
 {
+	struct page *stack_page;
 #ifdef CONFIG_VMAP_STACK
 	struct vm_struct *vm = task_stack_vm_area(tsk);
 
@@ -285,7 +295,17 @@ static inline void free_thread_stack(struct task_struct *tsk)
 	}
 #endif
 
-	__free_pages(virt_to_page(tsk->stack), THREAD_SIZE_ORDER);
+	stack_page = virt_to_page(tsk->stack);
+
+	if (IS_ENABLED(CONFIG_STACK_GUARD_PAGE)) {
+		/*
+		 * Avoid flushing TLBs, and instead rely on spurious fault
+		 * detection of stale TLBs.
+		 */
+		set_direct_map_default_noflush(stack_page);
+	}
+
+	__free_pages(stack_page, THREAD_SIZE_ORDER);
 }
 # else
 static struct kmem_cache *thread_stack_cache;
-- 
2.22.0.657.g960e92d24f-goog



* [PATCH 2/2] lib/test_kasan: Add stack overflow test
  2019-07-19 13:28 [PATCH 1/2] kernel/fork: Add support for stack-end guard page Marco Elver
@ 2019-07-19 13:28 ` Marco Elver
  2019-07-23 16:24   ` Mark Rutland
  2019-07-23 16:41 ` [PATCH 1/2] kernel/fork: Add support for stack-end guard page Mark Rutland
  1 sibling, 1 reply; 12+ messages in thread
From: Marco Elver @ 2019-07-19 13:28 UTC (permalink / raw)
  To: elver
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H. Peter Anvin, Andrey Ryabinin, Alexander Potapenko,
	Dmitry Vyukov, Andrey Konovalov, Mark Rutland, Peter Zijlstra,
	x86, kasan-dev

Add a simple stack overflow test to check the error reported on an
overflow. Without CONFIG_STACK_GUARD_PAGE, the result is typically a
seemingly unrelated KASAN error message caused by accessing other,
unrelated memory.

Signed-off-by: Marco Elver <elver@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: kasan-dev@googlegroups.com
---
 lib/test_kasan.c | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/lib/test_kasan.c b/lib/test_kasan.c
index b63b367a94e8..3092ec01189d 100644
--- a/lib/test_kasan.c
+++ b/lib/test_kasan.c
@@ -15,6 +15,7 @@
 #include <linux/mman.h>
 #include <linux/module.h>
 #include <linux/printk.h>
+#include <linux/sched/task_stack.h>
 #include <linux/slab.h>
 #include <linux/string.h>
 #include <linux/uaccess.h>
@@ -709,6 +710,32 @@ static noinline void __init kmalloc_double_kzfree(void)
 	kzfree(ptr);
 }
 
+#ifdef CONFIG_STACK_GUARD_PAGE
+static noinline void __init stack_overflow_via_recursion(void)
+{
+	volatile int n = 512;
+
+	BUILD_BUG_ON(IS_ENABLED(CONFIG_STACK_GROWSUP));
+
+	/* About to overflow: overflow via alloca'd array and try to write. */
+	if (!object_is_on_stack((void *)&n - n)) {
+		volatile char overflow[n];
+
+		overflow[0] = overflow[0];
+		return;
+	}
+
+	stack_overflow_via_recursion();
+}
+
+static noinline void __init kasan_stack_overflow(void)
+{
+	pr_info("stack overflow begin\n");
+	stack_overflow_via_recursion();
+	pr_info("stack overflow end\n");
+}
+#endif
+
 static int __init kmalloc_tests_init(void)
 {
 	/*
@@ -753,6 +780,15 @@ static int __init kmalloc_tests_init(void)
 	kasan_bitops();
 	kmalloc_double_kzfree();
 
+#ifdef CONFIG_STACK_GUARD_PAGE
+	/*
+	 * Only test with CONFIG_STACK_GUARD_PAGE, as without we get other
+	 * random KASAN violations, due to accessing other random memory (we
+	 * want to avoid actually corrupting memory in these tests).
+	 */
+	kasan_stack_overflow();
+#endif
+
 	kasan_restore_multi_shot(multishot);
 
 	return -EAGAIN;
-- 
2.22.0.657.g960e92d24f-goog



* Re: [PATCH 2/2] lib/test_kasan: Add stack overflow test
  2019-07-19 13:28 ` [PATCH 2/2] lib/test_kasan: Add stack overflow test Marco Elver
@ 2019-07-23 16:24   ` Mark Rutland
  2019-07-23 16:49     ` Marco Elver
  0 siblings, 1 reply; 12+ messages in thread
From: Mark Rutland @ 2019-07-23 16:24 UTC (permalink / raw)
  To: Marco Elver
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H. Peter Anvin, Andrey Ryabinin, Alexander Potapenko,
	Dmitry Vyukov, Andrey Konovalov, Peter Zijlstra, x86, kasan-dev

On Fri, Jul 19, 2019 at 03:28:18PM +0200, Marco Elver wrote:
> Adds a simple stack overflow test, to check the error being reported on
> an overflow. Without CONFIG_STACK_GUARD_PAGE, the result is typically
> some seemingly unrelated KASAN error message due to accessing random
> other memory.

Can't we use the LKDTM_EXHAUST_STACK case to check this?

I was also under the impression that the other KASAN self-tests weren't
fatal, and IIUC this will kill the kernel.

Given that, and given this is testing non-KASAN functionality, I'm not
sure it makes sense to bundle this with the KASAN tests.

Thanks,
Mark.

> 
> Signed-off-by: Marco Elver <elver@google.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
> Cc: Alexander Potapenko <glider@google.com>
> Cc: Dmitry Vyukov <dvyukov@google.com>
> Cc: Andrey Konovalov <andreyknvl@google.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: x86@kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: kasan-dev@googlegroups.com
> ---
>  lib/test_kasan.c | 36 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 36 insertions(+)
> 
> diff --git a/lib/test_kasan.c b/lib/test_kasan.c
> index b63b367a94e8..3092ec01189d 100644
> --- a/lib/test_kasan.c
> +++ b/lib/test_kasan.c
> @@ -15,6 +15,7 @@
>  #include <linux/mman.h>
>  #include <linux/module.h>
>  #include <linux/printk.h>
> +#include <linux/sched/task_stack.h>
>  #include <linux/slab.h>
>  #include <linux/string.h>
>  #include <linux/uaccess.h>
> @@ -709,6 +710,32 @@ static noinline void __init kmalloc_double_kzfree(void)
>  	kzfree(ptr);
>  }
>  
> +#ifdef CONFIG_STACK_GUARD_PAGE
> +static noinline void __init stack_overflow_via_recursion(void)
> +{
> +	volatile int n = 512;
> +
> +	BUILD_BUG_ON(IS_ENABLED(CONFIG_STACK_GROWSUP));
> +
> +	/* About to overflow: overflow via alloca'd array and try to write. */
> +	if (!object_is_on_stack((void *)&n - n)) {
> +		volatile char overflow[n];
> +
> +		overflow[0] = overflow[0];
> +		return;
> +	}
> +
> +	stack_overflow_via_recursion();
> +}
> +
> +static noinline void __init kasan_stack_overflow(void)
> +{
> +	pr_info("stack overflow begin\n");
> +	stack_overflow_via_recursion();
> +	pr_info("stack overflow end\n");
> +}
> +#endif
> +
>  static int __init kmalloc_tests_init(void)
>  {
>  	/*
> @@ -753,6 +780,15 @@ static int __init kmalloc_tests_init(void)
>  	kasan_bitops();
>  	kmalloc_double_kzfree();
>  
> +#ifdef CONFIG_STACK_GUARD_PAGE
> +	/*
> +	 * Only test with CONFIG_STACK_GUARD_PAGE, as without we get other
> +	 * random KASAN violations, due to accessing other random memory (we
> +	 * want to avoid actually corrupting memory in these tests).
> +	 */
> +	kasan_stack_overflow();
> +#endif
> +
>  	kasan_restore_multi_shot(multishot);
>  
>  	return -EAGAIN;
> -- 
> 2.22.0.657.g960e92d24f-goog
> 


* Re: [PATCH 1/2] kernel/fork: Add support for stack-end guard page
  2019-07-19 13:28 [PATCH 1/2] kernel/fork: Add support for stack-end guard page Marco Elver
  2019-07-19 13:28 ` [PATCH 2/2] lib/test_kasan: Add stack overflow test Marco Elver
@ 2019-07-23 16:41 ` Mark Rutland
  2019-07-24  9:11   ` Dmitry Vyukov
  1 sibling, 1 reply; 12+ messages in thread
From: Mark Rutland @ 2019-07-23 16:41 UTC (permalink / raw)
  To: Marco Elver
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H. Peter Anvin, Andrey Ryabinin, Alexander Potapenko,
	Dmitry Vyukov, Andrey Konovalov, Peter Zijlstra, x86, kasan-dev

On Fri, Jul 19, 2019 at 03:28:17PM +0200, Marco Elver wrote:
> Enabling STACK_GUARD_PAGE helps catching kernel stack overflows immediately
> rather than causing difficult-to-diagnose corruption. Note that, unlike
> virtually-mapped kernel stacks, this will effectively waste an entire page of
> memory; however, this feature may provide extra protection in cases that cannot
> use virtually-mapped kernel stacks, at the cost of a page.
> 
> The motivation for this patch is that KASAN cannot use virtually-mapped kernel
> stacks to detect stack overflows. An alternative would be implementing support
> for vmapped stacks in KASAN, but would add significant extra complexity.

Do we have an idea as to how much additional complexity?

> While the stack-end guard page approach here wastes a page, it is
> significantly simpler than the alternative.  We assume that the extra
> cost of a page can be justified in the cases where STACK_GUARD_PAGE
> would be enabled.
> 
> Note that in an earlier prototype of this patch, we used
> 'set_memory_{ro,rw}' functions, which flush the TLBs. This, however,
> turned out to be unacceptably expensive, especially when run with
> fuzzers such as Syzkaller, as the kernel would encounter frequent RCU
> timeouts. The current approach of not flushing the TLB is therefore
> best-effort, but works in the test cases considered -- any comments on
> better alternatives or improvements are welcome.

Ouch. I don't think that necessarily applies to other architectures, and
from my PoV it would be nicer if we could rely on regular vmap'd stacks.
That way we have one code path, and we can rely on the fault.

> 
> Signed-off-by: Marco Elver <elver@google.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
> Cc: Alexander Potapenko <glider@google.com>
> Cc: Dmitry Vyukov <dvyukov@google.com>
> Cc: Andrey Konovalov <andreyknvl@google.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: x86@kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: kasan-dev@googlegroups.com
> ---
>  arch/Kconfig                         | 15 +++++++++++++++
>  arch/x86/include/asm/page_64_types.h |  8 +++++++-
>  include/linux/sched/task_stack.h     | 11 +++++++++--
>  kernel/fork.c                        | 22 +++++++++++++++++++++-
>  4 files changed, 52 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/Kconfig b/arch/Kconfig
> index e8d19c3cb91f..cca3258fff1f 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -935,6 +935,21 @@ config LOCK_EVENT_COUNTS
>  	  the chance of application behavior change because of timing
>  	  differences. The counts are reported via debugfs.
>  
> +config STACK_GUARD_PAGE
> +	default n
> +	bool "Use stack-end page as guard page"
> +	depends on !VMAP_STACK && ARCH_HAS_SET_DIRECT_MAP && THREAD_INFO_IN_TASK && !STACK_GROWSUP
> +	help
> +	  Enable this if you want to use the stack-end page as a guard page.
> +	  This causes kernel stack overflows to be caught immediately rather
> +	  than causing difficult-to-diagnose corruption. Note that, unlike
> +	  virtually-mapped kernel stacks, this will effectively waste an entire
> +	  page of memory; however, this feature may provide extra protection in
> +	  cases that cannot use virtually-mapped kernel stacks, at the cost of
> +	  a page. Note that, this option does not implicitly increase the
> +	  default stack size. The main use-case is for KASAN to avoid reporting
> +	  misleading bugs due to stack overflow.

These dependencies can also be satisfied on arm64, but I don't believe
this will work correctly there, and we'll need something like an
ARCH_HAS_STACK_GUARD_PAGE symbol so that x86 can opt in.

On arm64 our exception vectors don't specify an alternative stack, so we
don't have a direct equivalent to the x86 double-fault handler. Our kernel
stack overflow handling requires explicit tests in the entry assembly
that are only built (or valid) when VMAP_STACK is selected.

> +
>  source "kernel/gcov/Kconfig"
>  
>  source "scripts/gcc-plugins/Kconfig"
> diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
> index 288b065955b7..b218b5713c02 100644
> --- a/arch/x86/include/asm/page_64_types.h
> +++ b/arch/x86/include/asm/page_64_types.h
> @@ -12,8 +12,14 @@
>  #define KASAN_STACK_ORDER 0
>  #endif
>  
> +#ifdef CONFIG_STACK_GUARD_PAGE
> +#define STACK_GUARD_SIZE PAGE_SIZE
> +#else
> +#define STACK_GUARD_SIZE 0
> +#endif
> +
>  #define THREAD_SIZE_ORDER	(2 + KASAN_STACK_ORDER)
> -#define THREAD_SIZE  (PAGE_SIZE << THREAD_SIZE_ORDER)
> +#define THREAD_SIZE  ((PAGE_SIZE << THREAD_SIZE_ORDER) - STACK_GUARD_SIZE)

I'm pretty sure that common code relies on THREAD_SIZE being a
power-of-two. I also know that if we wanted to enable this on arm64 that
would very likely be a requirement.

For example, in kernel/trace/trace_stack.c we have:

| this_size = ((unsigned long)stack) & (THREAD_SIZE-1);

... and INIT_TASK_DATA() allocates the initial task stack using
THREAD_SIZE, so that may require special care, as it might not be sized
or aligned as you expect.
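
As a standalone illustration of that breakage (plain userspace arithmetic,
assuming the non-KASAN x86-64 values from this patch: an order-2, 16 KiB
aligned allocation whose first page is the guard page, so THREAD_SIZE
becomes 12 KiB):

#include <stdio.h>

#define PAGE_SIZE        0x1000UL
#define ALLOC_SIZE       (4 * PAGE_SIZE)                  /* order-2 allocation */
#define STACK_GUARD_SIZE PAGE_SIZE
#define THREAD_SIZE      (ALLOC_SIZE - STACK_GUARD_SIZE)  /* 0x3000: not a power of two */

int main(void)
{
	unsigned long alloc_base = 0xffff888004000000UL;  /* 16 KiB aligned */
	unsigned long stack_base = alloc_base + STACK_GUARD_SIZE;
	unsigned long sp = alloc_base + 0x2400;            /* an in-use stack address */

	/* What mask-based code such as trace_stack.c computes: */
	unsigned long masked_off  = sp & (THREAD_SIZE - 1);
	unsigned long masked_base = sp & ~(THREAD_SIZE - 1);

	/* Offset: the mask gives 0x2400 while the real offset is 0x1400,
	 * and the error depends on which page sp happens to land in. */
	printf("offset: masked 0x%lx, real 0x%lx\n", masked_off, sp - stack_base);

	/* Base: the masked value points at the guard page, not the stack. */
	printf("base:   masked 0x%lx, real 0x%lx\n", masked_base, stack_base);
	return 0;
}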

>  
>  #define EXCEPTION_STACK_ORDER (0 + KASAN_STACK_ORDER)
>  #define EXCEPTION_STKSZ (PAGE_SIZE << EXCEPTION_STACK_ORDER)
> diff --git a/include/linux/sched/task_stack.h b/include/linux/sched/task_stack.h
> index 2413427e439c..7ee86ad0a282 100644
> --- a/include/linux/sched/task_stack.h
> +++ b/include/linux/sched/task_stack.h
> @@ -11,6 +11,13 @@
>  
>  #ifdef CONFIG_THREAD_INFO_IN_TASK
>  
> +#ifndef STACK_GUARD_SIZE
> +#ifdef CONFIG_STACK_GUARD_PAGE
> +#error "Architecture not compatible with STACK_GUARD_PAGE"
> +#endif
> +#define STACK_GUARD_SIZE 0
> +#endif

The core code you add assumes that when enabled, this is PAGE_SIZE, so
I think the definition should live in a common header.

As above, it should not be possible to select CONFIG_STACK_GUARD_PAGE
unless the architecture supports it. If nothing else, this avoids
getting bug reports on randconfigs.

Thanks,
Mark.

> +
>  /*
>   * When accessing the stack of a non-current task that might exit, use
>   * try_get_task_stack() instead.  task_stack_page will return a pointer
> @@ -18,14 +25,14 @@
>   */
>  static inline void *task_stack_page(const struct task_struct *task)
>  {
> -	return task->stack;
> +	return task->stack + STACK_GUARD_SIZE;
>  }
>  
>  #define setup_thread_stack(new,old)	do { } while(0)
>  
>  static inline unsigned long *end_of_stack(const struct task_struct *task)
>  {
> -	return task->stack;
> +	return task->stack + STACK_GUARD_SIZE;
>  }
>  
>  #elif !defined(__HAVE_THREAD_FUNCTIONS)
> diff --git a/kernel/fork.c b/kernel/fork.c
> index d8ae0f1b4148..22033b03f7da 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -94,6 +94,7 @@
>  #include <linux/livepatch.h>
>  #include <linux/thread_info.h>
>  #include <linux/stackleak.h>
> +#include <linux/set_memory.h>
>  
>  #include <asm/pgtable.h>
>  #include <asm/pgalloc.h>
> @@ -249,6 +250,14 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node)
>  					     THREAD_SIZE_ORDER);
>  
>  	if (likely(page)) {
> +		if (IS_ENABLED(CONFIG_STACK_GUARD_PAGE)) {
> +			/*
> +			 * Best effort: do not flush TLB to avoid the overhead
> +			 * of flushing all TLBs.
> +			 */
> +			set_direct_map_invalid_noflush(page);
> +		}
> +
>  		tsk->stack = page_address(page);
>  		return tsk->stack;
>  	}
> @@ -258,6 +267,7 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node)
>  
>  static inline void free_thread_stack(struct task_struct *tsk)
>  {
> +	struct page* stack_page;
>  #ifdef CONFIG_VMAP_STACK
>  	struct vm_struct *vm = task_stack_vm_area(tsk);
>  
> @@ -285,7 +295,17 @@ static inline void free_thread_stack(struct task_struct *tsk)
>  	}
>  #endif
>  
> -	__free_pages(virt_to_page(tsk->stack), THREAD_SIZE_ORDER);
> +	stack_page = virt_to_page(tsk->stack);
> +
> +	if (IS_ENABLED(CONFIG_STACK_GUARD_PAGE)) {
> +		/*
> +		 * Avoid flushing TLBs, and instead rely on spurious fault
> +		 * detection of stale TLBs.
> +		 */
> +		set_direct_map_default_noflush(stack_page);
> +	}
> +
> +	__free_pages(stack_page, THREAD_SIZE_ORDER);
>  }
>  # else
>  static struct kmem_cache *thread_stack_cache;
> -- 
> 2.22.0.657.g960e92d24f-goog
> 


* Re: [PATCH 2/2] lib/test_kasan: Add stack overflow test
  2019-07-23 16:24   ` Mark Rutland
@ 2019-07-23 16:49     ` Marco Elver
  2019-07-24 11:23       ` Mark Rutland
  0 siblings, 1 reply; 12+ messages in thread
From: Marco Elver @ 2019-07-23 16:49 UTC (permalink / raw)
  To: Mark Rutland
  Cc: LKML, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H. Peter Anvin, Andrey Ryabinin, Alexander Potapenko,
	Dmitry Vyukov, Andrey Konovalov, Peter Zijlstra,
	the arch/x86 maintainers, kasan-dev

On Tue, 23 Jul 2019 at 18:24, Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Fri, Jul 19, 2019 at 03:28:18PM +0200, Marco Elver wrote:
> > Adds a simple stack overflow test, to check the error being reported on
> > an overflow. Without CONFIG_STACK_GUARD_PAGE, the result is typically
> > some seemingly unrelated KASAN error message due to accessing random
> > other memory.
>
> Can't we use the LKDTM_EXHAUST_STACK case to check this?
>
> I was also under the impression that the other KASAN self-tests weren't
> fatal, and IIUC this will kill the kernel.
>
> Given that, and given this is testing non-KASAN functionality, I'm not
> sure it makes sense to bundle this with the KASAN tests.

Thanks for pointing out LKDTM_EXHAUST_STACK.

This patch can be dropped!

-- Marco

> Thanks,
> Mark.
>
> >
> > Signed-off-by: Marco Elver <elver@google.com>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: Borislav Petkov <bp@alien8.de>
> > Cc: "H. Peter Anvin" <hpa@zytor.com>
> > Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
> > Cc: Alexander Potapenko <glider@google.com>
> > Cc: Dmitry Vyukov <dvyukov@google.com>
> > Cc: Andrey Konovalov <andreyknvl@google.com>
> > Cc: Mark Rutland <mark.rutland@arm.com>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: x86@kernel.org
> > Cc: linux-kernel@vger.kernel.org
> > Cc: kasan-dev@googlegroups.com
> > ---
> >  lib/test_kasan.c | 36 ++++++++++++++++++++++++++++++++++++
> >  1 file changed, 36 insertions(+)
> >
> > diff --git a/lib/test_kasan.c b/lib/test_kasan.c
> > index b63b367a94e8..3092ec01189d 100644
> > --- a/lib/test_kasan.c
> > +++ b/lib/test_kasan.c
> > @@ -15,6 +15,7 @@
> >  #include <linux/mman.h>
> >  #include <linux/module.h>
> >  #include <linux/printk.h>
> > +#include <linux/sched/task_stack.h>
> >  #include <linux/slab.h>
> >  #include <linux/string.h>
> >  #include <linux/uaccess.h>
> > @@ -709,6 +710,32 @@ static noinline void __init kmalloc_double_kzfree(void)
> >       kzfree(ptr);
> >  }
> >
> > +#ifdef CONFIG_STACK_GUARD_PAGE
> > +static noinline void __init stack_overflow_via_recursion(void)
> > +{
> > +     volatile int n = 512;
> > +
> > +     BUILD_BUG_ON(IS_ENABLED(CONFIG_STACK_GROWSUP));
> > +
> > +     /* About to overflow: overflow via alloca'd array and try to write. */
> > +     if (!object_is_on_stack((void *)&n - n)) {
> > +             volatile char overflow[n];
> > +
> > +             overflow[0] = overflow[0];
> > +             return;
> > +     }
> > +
> > +     stack_overflow_via_recursion();
> > +}
> > +
> > +static noinline void __init kasan_stack_overflow(void)
> > +{
> > +     pr_info("stack overflow begin\n");
> > +     stack_overflow_via_recursion();
> > +     pr_info("stack overflow end\n");
> > +}
> > +#endif
> > +
> >  static int __init kmalloc_tests_init(void)
> >  {
> >       /*
> > @@ -753,6 +780,15 @@ static int __init kmalloc_tests_init(void)
> >       kasan_bitops();
> >       kmalloc_double_kzfree();
> >
> > +#ifdef CONFIG_STACK_GUARD_PAGE
> > +     /*
> > +      * Only test with CONFIG_STACK_GUARD_PAGE, as without we get other
> > +      * random KASAN violations, due to accessing other random memory (we
> > +      * want to avoid actually corrupting memory in these tests).
> > +      */
> > +     kasan_stack_overflow();
> > +#endif
> > +
> >       kasan_restore_multi_shot(multishot);
> >
> >       return -EAGAIN;
> > --
> > 2.22.0.657.g960e92d24f-goog
> >
>


* Re: [PATCH 1/2] kernel/fork: Add support for stack-end guard page
  2019-07-23 16:41 ` [PATCH 1/2] kernel/fork: Add support for stack-end guard page Mark Rutland
@ 2019-07-24  9:11   ` Dmitry Vyukov
  2019-07-24 11:21     ` Mark Rutland
  0 siblings, 1 reply; 12+ messages in thread
From: Dmitry Vyukov @ 2019-07-24  9:11 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Marco Elver, LKML, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H. Peter Anvin, Andrey Ryabinin, Alexander Potapenko,
	Andrey Konovalov, Peter Zijlstra, the arch/x86 maintainers,
	kasan-dev

On Tue, Jul 23, 2019 at 6:41 PM Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Fri, Jul 19, 2019 at 03:28:17PM +0200, Marco Elver wrote:
> > Enabling STACK_GUARD_PAGE helps catching kernel stack overflows immediately
> > rather than causing difficult-to-diagnose corruption. Note that, unlike
> > virtually-mapped kernel stacks, this will effectively waste an entire page of
> > memory; however, this feature may provide extra protection in cases that cannot
> > use virtually-mapped kernel stacks, at the cost of a page.
> >
> > The motivation for this patch is that KASAN cannot use virtually-mapped kernel
> > stacks to detect stack overflows. An alternative would be implementing support
> > for vmapped stacks in KASAN, but would add significant extra complexity.
>
> Do we have an idea as to how much additional complexity?

We would need to map/unmap shadow for the vmalloc region on stack
allocation/deallocation. We may need to track shadow pages that cover
both a stack and unused memory, or two different stacks, which are
mapped/unmapped at different times. This may have some concurrency
concerns. I'm also not sure about page tables for other CPUs; I've seen
code that updates page tables for the vmalloc region lazily on page
faults. Not sure about TLBs either. Probably also some problems that I
haven't thought of yet.
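
As a rough illustration of the sharing problem (standalone arithmetic only;
the real helper is kasan_mem_to_shadow(), and KASAN_SHADOW_OFFSET is left
out since it doesn't affect the sizes involved):

#include <stdio.h>

#define KASAN_SHADOW_SCALE_SHIFT 3             /* 1 shadow byte per 8 bytes */
#define PAGE_SIZE   0x1000UL
#define THREAD_SIZE (4 * PAGE_SIZE)            /* 16 KiB vmalloc'd stack */

int main(void)
{
	unsigned long stack = 0xffffc90000100000UL;  /* hypothetical vmalloc address */
	unsigned long shadow_start = stack >> KASAN_SHADOW_SCALE_SHIFT;
	unsigned long shadow_end = (stack + THREAD_SIZE) >> KASAN_SHADOW_SCALE_SHIFT;

	/* One stack needs only 2 KiB of shadow -- half a shadow page. */
	printf("shadow bytes for one stack: %lu\n", shadow_end - shadow_start);

	/* But shadow must be mapped with page granularity, and one shadow
	 * page covers 32 KiB of vmalloc space; the rest of that page
	 * describes whatever the allocator placed next to this stack,
	 * possibly another stack with a different lifetime. */
	printf("vmalloc space covered by one shadow page: %lu KiB\n",
	       (PAGE_SIZE << KASAN_SHADOW_SCALE_SHIFT) / 1024);
	return 0;
}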


> > While the stack-end guard page approach here wastes a page, it is
> > significantly simpler than the alternative.  We assume that the extra
> > cost of a page can be justified in the cases where STACK_GUARD_PAGE
> > would be enabled.
> >
> > Note that in an earlier prototype of this patch, we used
> > 'set_memory_{ro,rw}' functions, which flush the TLBs. This, however,
> > turned out to be unacceptably expensive, especially when run with
> > fuzzers such as Syzkaller, as the kernel would encounter frequent RCU
> > timeouts. The current approach of not flushing the TLB is therefore
> > best-effort, but works in the test cases considered -- any comments on
> > better alternatives or improvements are welcome.
>
> Ouch. I don't think that necessarily applies to other architectures, and
> from my PoV it would be nicer if we could rely on regular vmap'd stacks.
> That way we have one code path, and we can rely on the fault.
>
> >
> > Signed-off-by: Marco Elver <elver@google.com>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: Borislav Petkov <bp@alien8.de>
> > Cc: "H. Peter Anvin" <hpa@zytor.com>
> > Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
> > Cc: Alexander Potapenko <glider@google.com>
> > Cc: Dmitry Vyukov <dvyukov@google.com>
> > Cc: Andrey Konovalov <andreyknvl@google.com>
> > Cc: Mark Rutland <mark.rutland@arm.com>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: x86@kernel.org
> > Cc: linux-kernel@vger.kernel.org
> > Cc: kasan-dev@googlegroups.com
> > ---
> >  arch/Kconfig                         | 15 +++++++++++++++
> >  arch/x86/include/asm/page_64_types.h |  8 +++++++-
> >  include/linux/sched/task_stack.h     | 11 +++++++++--
> >  kernel/fork.c                        | 22 +++++++++++++++++++++-
> >  4 files changed, 52 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/Kconfig b/arch/Kconfig
> > index e8d19c3cb91f..cca3258fff1f 100644
> > --- a/arch/Kconfig
> > +++ b/arch/Kconfig
> > @@ -935,6 +935,21 @@ config LOCK_EVENT_COUNTS
> >         the chance of application behavior change because of timing
> >         differences. The counts are reported via debugfs.
> >
> > +config STACK_GUARD_PAGE
> > +     default n
> > +     bool "Use stack-end page as guard page"
> > +     depends on !VMAP_STACK && ARCH_HAS_SET_DIRECT_MAP && THREAD_INFO_IN_TASK && !STACK_GROWSUP
> > +     help
> > +       Enable this if you want to use the stack-end page as a guard page.
> > +       This causes kernel stack overflows to be caught immediately rather
> > +       than causing difficult-to-diagnose corruption. Note that, unlike
> > +       virtually-mapped kernel stacks, this will effectively waste an entire
> > +       page of memory; however, this feature may provide extra protection in
> > +       cases that cannot use virtually-mapped kernel stacks, at the cost of
> > +       a page. Note that, this option does not implicitly increase the
> > +       default stack size. The main use-case is for KASAN to avoid reporting
> > +       misleading bugs due to stack overflow.
>
> These dependencies can also be satisfied on arm64, but I don't believe
> this will work correctly there, and we'll need something like a
> ARCH_HAS_STACK_GUARD_PAGE symbol so that x86 can opt-in.
>
> On arm64 our exception vectors don't specify an alternative stack, so we
> don't have a direct equivalent to x86 double-fault handler. Our kernel
> stack overflow handling requires explicit tests in the entry assembly
> that are only built (or valid) when VMAP_STACK is selected.
>
> > +
> >  source "kernel/gcov/Kconfig"
> >
> >  source "scripts/gcc-plugins/Kconfig"
> > diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
> > index 288b065955b7..b218b5713c02 100644
> > --- a/arch/x86/include/asm/page_64_types.h
> > +++ b/arch/x86/include/asm/page_64_types.h
> > @@ -12,8 +12,14 @@
> >  #define KASAN_STACK_ORDER 0
> >  #endif
> >
> > +#ifdef CONFIG_STACK_GUARD_PAGE
> > +#define STACK_GUARD_SIZE PAGE_SIZE
> > +#else
> > +#define STACK_GUARD_SIZE 0
> > +#endif
> > +
> >  #define THREAD_SIZE_ORDER    (2 + KASAN_STACK_ORDER)
> > -#define THREAD_SIZE  (PAGE_SIZE << THREAD_SIZE_ORDER)
> > +#define THREAD_SIZE  ((PAGE_SIZE << THREAD_SIZE_ORDER) - STACK_GUARD_SIZE)
>
> I'm pretty sure that common code relies on THREAD_SIZE being a
> power-of-two. I also know that if we wanted to enable this on arm64 that
> would very likely be a requirement.
>
> For example, in kernel/trace/trace_stack.c we have:
>
> | this_size = ((unsigned long)stack) & (THREAD_SIZE-1);
>
> ... and INIT_TASK_DATA() allocates the initial task stack using
> THREAD_SIZE, so that may require special care, as it might not be sized
> or aligned as you expect.


We've built it, booted it, stressed it, everything looked fine... that
should have been a build failure.
Is it a property that we need to preserve? Or we could fix the uses
that assume power-of-2?


> >  #define EXCEPTION_STACK_ORDER (0 + KASAN_STACK_ORDER)
> >  #define EXCEPTION_STKSZ (PAGE_SIZE << EXCEPTION_STACK_ORDER)
> > diff --git a/include/linux/sched/task_stack.h b/include/linux/sched/task_stack.h
> > index 2413427e439c..7ee86ad0a282 100644
> > --- a/include/linux/sched/task_stack.h
> > +++ b/include/linux/sched/task_stack.h
> > @@ -11,6 +11,13 @@
> >
> >  #ifdef CONFIG_THREAD_INFO_IN_TASK
> >
> > +#ifndef STACK_GUARD_SIZE
> > +#ifdef CONFIG_STACK_GUARD_PAGE
> > +#error "Architecture not compatible with STACK_GUARD_PAGE"
> > +#endif
> > +#define STACK_GUARD_SIZE 0
> > +#endif
>
> The core code you add assumes that when enabled, this is PAGE_SIZE, so
> I think the definition should live in a common header.
>
> As above, it should not be possible to select CONFIG_STACK_GUARD_PAGE
> unless the architecture supports it. If nothing else, this avoids
> getting bug reports on randconfigs.
>
> Thanks,
> Mark.
>
> > +
> >  /*
> >   * When accessing the stack of a non-current task that might exit, use
> >   * try_get_task_stack() instead.  task_stack_page will return a pointer
> > @@ -18,14 +25,14 @@
> >   */
> >  static inline void *task_stack_page(const struct task_struct *task)
> >  {
> > -     return task->stack;
> > +     return task->stack + STACK_GUARD_SIZE;
> >  }
> >
> >  #define setup_thread_stack(new,old)  do { } while(0)
> >
> >  static inline unsigned long *end_of_stack(const struct task_struct *task)
> >  {
> > -     return task->stack;
> > +     return task->stack + STACK_GUARD_SIZE;
> >  }
> >
> >  #elif !defined(__HAVE_THREAD_FUNCTIONS)
> > diff --git a/kernel/fork.c b/kernel/fork.c
> > index d8ae0f1b4148..22033b03f7da 100644
> > --- a/kernel/fork.c
> > +++ b/kernel/fork.c
> > @@ -94,6 +94,7 @@
> >  #include <linux/livepatch.h>
> >  #include <linux/thread_info.h>
> >  #include <linux/stackleak.h>
> > +#include <linux/set_memory.h>
> >
> >  #include <asm/pgtable.h>
> >  #include <asm/pgalloc.h>
> > @@ -249,6 +250,14 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node)
> >                                            THREAD_SIZE_ORDER);
> >
> >       if (likely(page)) {
> > +             if (IS_ENABLED(CONFIG_STACK_GUARD_PAGE)) {
> > +                     /*
> > +                      * Best effort: do not flush TLB to avoid the overhead
> > +                      * of flushing all TLBs.
> > +                      */
> > +                     set_direct_map_invalid_noflush(page);
> > +             }
> > +
> >               tsk->stack = page_address(page);
> >               return tsk->stack;
> >       }
> > @@ -258,6 +267,7 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node)
> >
> >  static inline void free_thread_stack(struct task_struct *tsk)
> >  {
> > +     struct page* stack_page;
> >  #ifdef CONFIG_VMAP_STACK
> >       struct vm_struct *vm = task_stack_vm_area(tsk);
> >
> > @@ -285,7 +295,17 @@ static inline void free_thread_stack(struct task_struct *tsk)
> >       }
> >  #endif
> >
> > -     __free_pages(virt_to_page(tsk->stack), THREAD_SIZE_ORDER);
> > +     stack_page = virt_to_page(tsk->stack);
> > +
> > +     if (IS_ENABLED(CONFIG_STACK_GUARD_PAGE)) {
> > +             /*
> > +              * Avoid flushing TLBs, and instead rely on spurious fault
> > +              * detection of stale TLBs.
> > +              */
> > +             set_direct_map_default_noflush(stack_page);
> > +     }
> > +
> > +     __free_pages(stack_page, THREAD_SIZE_ORDER);
> >  }
> >  # else
> >  static struct kmem_cache *thread_stack_cache;
> > --
> > 2.22.0.657.g960e92d24f-goog
> >


* Re: [PATCH 1/2] kernel/fork: Add support for stack-end guard page
  2019-07-24  9:11   ` Dmitry Vyukov
@ 2019-07-24 11:21     ` Mark Rutland
  2019-07-25  7:53       ` Dmitry Vyukov
  0 siblings, 1 reply; 12+ messages in thread
From: Mark Rutland @ 2019-07-24 11:21 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Marco Elver, LKML, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H. Peter Anvin, Andrey Ryabinin, Alexander Potapenko,
	Andrey Konovalov, Peter Zijlstra, the arch/x86 maintainers,
	kasan-dev

On Wed, Jul 24, 2019 at 11:11:49AM +0200, Dmitry Vyukov wrote:
> On Tue, Jul 23, 2019 at 6:41 PM Mark Rutland <mark.rutland@arm.com> wrote:
> >
> > On Fri, Jul 19, 2019 at 03:28:17PM +0200, Marco Elver wrote:
> > > Enabling STACK_GUARD_PAGE helps catching kernel stack overflows immediately
> > > rather than causing difficult-to-diagnose corruption. Note that, unlike
> > > virtually-mapped kernel stacks, this will effectively waste an entire page of
> > > memory; however, this feature may provide extra protection in cases that cannot
> > > use virtually-mapped kernel stacks, at the cost of a page.
> > >
> > > The motivation for this patch is that KASAN cannot use virtually-mapped kernel
> > > stacks to detect stack overflows. An alternative would be implementing support
> > > for vmapped stacks in KASAN, but would add significant extra complexity.
> >
> > Do we have an idea as to how much additional complexity?
> 
> We would need to map/unmap shadow for vmalloc region on stack
> allocation/deallocation. We may need to track shadow pages that cover
> both stack and an unused memory, or 2 different stacks, which are
> mapped/unmapped at different times. This may have some concurrency
> concerns.  Not sure what about page tables for other CPU, I've seen
> some code that updates pages tables for vmalloc region lazily on page
> faults. Not sure what about TLBs. Probably also some problems that I
> can't thought about now.

Ok. So this looks big, and this hasn't been prototyped, so we don't have
a concrete idea. I agree that concurrency is likely to be painful. :)

[...]

> > > diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
> > > index 288b065955b7..b218b5713c02 100644
> > > --- a/arch/x86/include/asm/page_64_types.h
> > > +++ b/arch/x86/include/asm/page_64_types.h
> > > @@ -12,8 +12,14 @@
> > >  #define KASAN_STACK_ORDER 0
> > >  #endif
> > >
> > > +#ifdef CONFIG_STACK_GUARD_PAGE
> > > +#define STACK_GUARD_SIZE PAGE_SIZE
> > > +#else
> > > +#define STACK_GUARD_SIZE 0
> > > +#endif
> > > +
> > >  #define THREAD_SIZE_ORDER    (2 + KASAN_STACK_ORDER)
> > > -#define THREAD_SIZE  (PAGE_SIZE << THREAD_SIZE_ORDER)
> > > +#define THREAD_SIZE  ((PAGE_SIZE << THREAD_SIZE_ORDER) - STACK_GUARD_SIZE)
> >
> > I'm pretty sure that common code relies on THREAD_SIZE being a
> > power-of-two. I also know that if we wanted to enable this on arm64 that
> > would very likely be a requirement.
> >
> > For example, in kernel/trace/trace_stack.c we have:
> >
> > | this_size = ((unsigned long)stack) & (THREAD_SIZE-1);
> >
> > ... and INIT_TASK_DATA() allocates the initial task stack using
> > THREAD_SIZE, so that may require special care, as it might not be sized
> > or aligned as you expect.
> 
> We've built it, booted it, stressed it, everything looked fine... that
> should have been a build failure.

I think it's been an implicit assumption for so long that no-one saw the need
for build-time assertions where they depend on it.

I also suspect that in practice there are paths that you won't have
stressed in your environment, e.g. in the ACPI wakeup path where we end
up calling:

/* Unpoison the stack for the current task beyond a watermark sp value. */
asmlinkage void kasan_unpoison_task_stack_below(const void *watermark)
{
	/*
	 * Calculate the task stack base address.  Avoid using 'current'
	 * because this function is called by early resume code which hasn't
	 * yet set up the percpu register (%gs).
	 */
	void *base = (void *)((unsigned long)watermark & ~(THREAD_SIZE - 1));

	kasan_unpoison_shadow(base, watermark - base);
} 

> Is it a property that we need to preserve? Or we could fix the uses
> that assume power-of-2?

Generally, I think that those can be fixed up. Someone just needs to dig
through how THREAD_SIZE and THREAD_SIZE_ORDER are used to generate or
manipulate addresses.

For local-task stuff, I think it's easy to rewrite in terms of
task_stack_page(), but I'm not entirely sure what we'd do for cases
where we look at another task, e.g.

static int proc_stack_depth(struct seq_file *m, struct pid_namespace *ns, 
                                struct pid *pid, struct task_struct *task)
{
        unsigned long prev_depth = THREAD_SIZE -
                                (task->prev_lowest_stack & (THREAD_SIZE - 1)); 
        unsigned long depth = THREAD_SIZE -
                                (task->lowest_stack & (THREAD_SIZE - 1)); 

        seq_printf(m, "previous stack depth: %lu\nstack depth: %lu\n",
                                                        prev_depth, depth);
        return 0;
}

... as I'm not sure of the lifetime of task->stack relative to task. I
know that with THREAD_INFO_IN_TASK the stack can be freed while the task
is still live.
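
For those local-task uses, an untested sketch of a mask-free version of the
check_stack() computation (current task only; a non-current task would also
need try_get_task_stack()/put_task_stack() because of the lifetime issue
above):

#include <linux/sched/task_stack.h>

/*
 * Untested sketch: how much stack is in use when the stack pointer is 'sp',
 * without assuming THREAD_SIZE is a power of two.  Replaces the
 * THREAD_SIZE - (sp & (THREAD_SIZE - 1)) computation in trace_stack.c.
 */
static unsigned long stack_used(unsigned long sp)
{
	unsigned long top = (unsigned long)task_stack_page(current) + THREAD_SIZE;

	/* The stack grows down: bytes from sp up to the top are in use. */
	return top - sp;
}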

Thanks,
Mark.


* Re: [PATCH 2/2] lib/test_kasan: Add stack overflow test
  2019-07-23 16:49     ` Marco Elver
@ 2019-07-24 11:23       ` Mark Rutland
  0 siblings, 0 replies; 12+ messages in thread
From: Mark Rutland @ 2019-07-24 11:23 UTC (permalink / raw)
  To: Marco Elver
  Cc: LKML, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H. Peter Anvin, Andrey Ryabinin, Alexander Potapenko,
	Dmitry Vyukov, Andrey Konovalov, Peter Zijlstra,
	the arch/x86 maintainers, kasan-dev

On Tue, Jul 23, 2019 at 06:49:03PM +0200, Marco Elver wrote:
> On Tue, 23 Jul 2019 at 18:24, Mark Rutland <mark.rutland@arm.com> wrote:
> >
> > On Fri, Jul 19, 2019 at 03:28:18PM +0200, Marco Elver wrote:
> > > Adds a simple stack overflow test, to check the error being reported on
> > > an overflow. Without CONFIG_STACK_GUARD_PAGE, the result is typically
> > > some seemingly unrelated KASAN error message due to accessing random
> > > other memory.
> >
> > Can't we use the LKDTM_EXHAUST_STACK case to check this?
> >
> > I was also under the impression that the other KASAN self-tests weren't
> > fatal, and IIUC this will kill the kernel.
> >
> > Given that, and given this is testing non-KASAN functionality, I'm not
> > sure it makes sense to bundle this with the KASAN tests.
> 
> Thanks for pointing out LKDTM_EXHAUST_STACK.
> 
> This patch can be dropped!

Cool; it's always nice to find the work has already been done! :)

Thanks,
Mark.


* Re: [PATCH 1/2] kernel/fork: Add support for stack-end guard page
  2019-07-24 11:21     ` Mark Rutland
@ 2019-07-25  7:53       ` Dmitry Vyukov
  2019-07-25 10:14         ` Mark Rutland
  0 siblings, 1 reply; 12+ messages in thread
From: Dmitry Vyukov @ 2019-07-25  7:53 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Marco Elver, LKML, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H. Peter Anvin, Andrey Ryabinin, Alexander Potapenko,
	Andrey Konovalov, Peter Zijlstra, the arch/x86 maintainers,
	kasan-dev, Daniel Axtens

On Wed, Jul 24, 2019 at 1:21 PM Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Wed, Jul 24, 2019 at 11:11:49AM +0200, Dmitry Vyukov wrote:
> > On Tue, Jul 23, 2019 at 6:41 PM Mark Rutland <mark.rutland@arm.com> wrote:
> > >
> > > On Fri, Jul 19, 2019 at 03:28:17PM +0200, Marco Elver wrote:
> > > > Enabling STACK_GUARD_PAGE helps catching kernel stack overflows immediately
> > > > rather than causing difficult-to-diagnose corruption. Note that, unlike
> > > > virtually-mapped kernel stacks, this will effectively waste an entire page of
> > > > memory; however, this feature may provide extra protection in cases that cannot
> > > > use virtually-mapped kernel stacks, at the cost of a page.
> > > >
> > > > The motivation for this patch is that KASAN cannot use virtually-mapped kernel
> > > > stacks to detect stack overflows. An alternative would be implementing support
> > > > for vmapped stacks in KASAN, but would add significant extra complexity.
> > >
> > > Do we have an idea as to how much additional complexity?
> >
> > We would need to map/unmap shadow for vmalloc region on stack
> > allocation/deallocation. We may need to track shadow pages that cover
> > both stack and an unused memory, or 2 different stacks, which are
> > mapped/unmapped at different times. This may have some concurrency
> > concerns.  Not sure what about page tables for other CPU, I've seen
> > some code that updates pages tables for vmalloc region lazily on page
> > faults. Not sure what about TLBs. Probably also some problems that I
> > can't thought about now.
>
> Ok. So this looks big, we this hasn't been prototyped, so we don't have
> a concrete idea. I agree that concurrency is likely to be painful. :)
>
> [...]
>
> > > > diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
> > > > index 288b065955b7..b218b5713c02 100644
> > > > --- a/arch/x86/include/asm/page_64_types.h
> > > > +++ b/arch/x86/include/asm/page_64_types.h
> > > > @@ -12,8 +12,14 @@
> > > >  #define KASAN_STACK_ORDER 0
> > > >  #endif
> > > >
> > > > +#ifdef CONFIG_STACK_GUARD_PAGE
> > > > +#define STACK_GUARD_SIZE PAGE_SIZE
> > > > +#else
> > > > +#define STACK_GUARD_SIZE 0
> > > > +#endif
> > > > +
> > > >  #define THREAD_SIZE_ORDER    (2 + KASAN_STACK_ORDER)
> > > > -#define THREAD_SIZE  (PAGE_SIZE << THREAD_SIZE_ORDER)
> > > > +#define THREAD_SIZE  ((PAGE_SIZE << THREAD_SIZE_ORDER) - STACK_GUARD_SIZE)
> > >
> > > I'm pretty sure that common code relies on THREAD_SIZE being a
> > > power-of-two. I also know that if we wanted to enable this on arm64 that
> > > would very likely be a requirement.
> > >
> > > For example, in kernel/trace/trace_stack.c we have:
> > >
> > > | this_size = ((unsigned long)stack) & (THREAD_SIZE-1);
> > >
> > > ... and INIT_TASK_DATA() allocates the initial task stack using
> > > THREAD_SIZE, so that may require special care, as it might not be sized
> > > or aligned as you expect.
> >
> > We've built it, booted it, stressed it, everything looked fine... that
> > should have been a build failure.
>
> I think it's been an implicit assumption for so long that no-one saw the need
> for built-time assertions where they depend on it.
>
> I also suspect that in practice there are paths that you won't have
> stressed in your environment, e.g. in the ACPI wakeup path where we end
> up calling:
>
> /* Unpoison the stack for the current task beyond a watermark sp value. */
> asmlinkage void kasan_unpoison_task_stack_below(const void *watermark)
> {
>         /*
>          * Calculate the task stack base address.  Avoid using 'current'
>          * because this function is called by early resume code which hasn't
>          * yet set up the percpu register (%gs).
>          */
>         void *base = (void *)((unsigned long)watermark & ~(THREAD_SIZE - 1));
>
>         kasan_unpoison_shadow(base, watermark - base);
> }
>
> > Is it a property that we need to preserve? Or we could fix the uses
> > that assume power-of-2?
>
> Generally, I think that those can be fixed up. Someone just needs to dig
> through how THREAD_SIZE and THREAD_SIZE_ORDER are used to generate or
> manipulate addresses.
>
> For local-task stuff, I think it's easy to rewrite in terms of
> task_stack_page(), but I'm not entirely sure what we'd do for cases
> where we look at another task, e.g.
>
> static int proc_stack_depth(struct seq_file *m, struct pid_namespace *ns,
>                                 struct pid *pid, struct task_struct *task)
> {
>         unsigned long prev_depth = THREAD_SIZE -
>                                 (task->prev_lowest_stack & (THREAD_SIZE - 1));
>         unsigned long depth = THREAD_SIZE -
>                                 (task->lowest_stack & (THREAD_SIZE - 1));
>
>         seq_printf(m, "previous stack depth: %lu\nstack depth: %lu\n",
>                                                         prev_depth, depth);
>         return 0;
> }
>
> ... as I'm not sure of the lifetime of task->stack relative to task. I
> know that with THREAD_INFO_IN_TASK the stack can be freed while the task
> is still live.
>
> Thanks,
> Mark.

FTR, Daniel just mailed:

[PATCH 0/3] kasan: support backing vmalloc space with real shadow memory
https://groups.google.com/forum/#!topic/kasan-dev/YuwLGJYPB4I
Which presumably will supersede this.


* Re: [PATCH 1/2] kernel/fork: Add support for stack-end guard page
  2019-07-25  7:53       ` Dmitry Vyukov
@ 2019-07-25 10:14         ` Mark Rutland
  2019-07-25 15:14           ` Daniel Axtens
  0 siblings, 1 reply; 12+ messages in thread
From: Mark Rutland @ 2019-07-25 10:14 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Marco Elver, LKML, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H. Peter Anvin, Andrey Ryabinin, Alexander Potapenko,
	Andrey Konovalov, Peter Zijlstra, the arch/x86 maintainers,
	kasan-dev, Daniel Axtens

On Thu, Jul 25, 2019 at 09:53:08AM +0200, Dmitry Vyukov wrote:
> On Wed, Jul 24, 2019 at 1:21 PM Mark Rutland <mark.rutland@arm.com> wrote:
> >
> > On Wed, Jul 24, 2019 at 11:11:49AM +0200, Dmitry Vyukov wrote:
> > > On Tue, Jul 23, 2019 at 6:41 PM Mark Rutland <mark.rutland@arm.com> wrote:
> > > >
> > > > On Fri, Jul 19, 2019 at 03:28:17PM +0200, Marco Elver wrote:
> > > > > Enabling STACK_GUARD_PAGE helps catching kernel stack overflows immediately
> > > > > rather than causing difficult-to-diagnose corruption. Note that, unlike
> > > > > virtually-mapped kernel stacks, this will effectively waste an entire page of
> > > > > memory; however, this feature may provide extra protection in cases that cannot
> > > > > use virtually-mapped kernel stacks, at the cost of a page.
> > > > >
> > > > > The motivation for this patch is that KASAN cannot use virtually-mapped kernel
> > > > > stacks to detect stack overflows. An alternative would be implementing support
> > > > > for vmapped stacks in KASAN, but would add significant extra complexity.
> > > >
> > > > Do we have an idea as to how much additional complexity?
> > >
> > > We would need to map/unmap shadow for vmalloc region on stack
> > > allocation/deallocation. We may need to track shadow pages that cover
> > > both stack and an unused memory, or 2 different stacks, which are
> > > mapped/unmapped at different times. This may have some concurrency
> > > concerns.  Not sure what about page tables for other CPU, I've seen
> > > some code that updates pages tables for vmalloc region lazily on page
> > > faults. Not sure what about TLBs. Probably also some problems that I
> > > can't thought about now.
> >
> > Ok. So this looks big, we this hasn't been prototyped, so we don't have
> > a concrete idea. I agree that concurrency is likely to be painful. :)

> FTR, Daniel just mailed:
> 
> [PATCH 0/3] kasan: support backing vmalloc space with real shadow memory
> https://groups.google.com/forum/#!topic/kasan-dev/YuwLGJYPB4I
> Which presumably will supersede this.

Neat!

I'll try to follow that (and thanks for the Cc there), but I'm not on
any of the lists it went to. IMO it would be nice if subsequent versions
could be Cc'd to LKML, if that's possible. :)

Thanks,
Mark.


* Re: [PATCH 1/2] kernel/fork: Add support for stack-end guard page
  2019-07-25 10:14         ` Mark Rutland
@ 2019-07-25 15:14           ` Daniel Axtens
  2019-07-25 16:08             ` Mark Rutland
  0 siblings, 1 reply; 12+ messages in thread
From: Daniel Axtens @ 2019-07-25 15:14 UTC (permalink / raw)
  To: Mark Rutland, Dmitry Vyukov
  Cc: Marco Elver, LKML, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H. Peter Anvin, Andrey Ryabinin, Alexander Potapenko,
	Andrey Konovalov, Peter Zijlstra, the arch/x86 maintainers,
	kasan-dev

Mark Rutland <mark.rutland@arm.com> writes:

> On Thu, Jul 25, 2019 at 09:53:08AM +0200, Dmitry Vyukov wrote:
>> On Wed, Jul 24, 2019 at 1:21 PM Mark Rutland <mark.rutland@arm.com> wrote:
>> >
>> > On Wed, Jul 24, 2019 at 11:11:49AM +0200, Dmitry Vyukov wrote:
>> > > On Tue, Jul 23, 2019 at 6:41 PM Mark Rutland <mark.rutland@arm.com> wrote:
>> > > >
>> > > > On Fri, Jul 19, 2019 at 03:28:17PM +0200, Marco Elver wrote:
>> > > > > Enabling STACK_GUARD_PAGE helps catching kernel stack overflows immediately
>> > > > > rather than causing difficult-to-diagnose corruption. Note that, unlike
>> > > > > virtually-mapped kernel stacks, this will effectively waste an entire page of
>> > > > > memory; however, this feature may provide extra protection in cases that cannot
>> > > > > use virtually-mapped kernel stacks, at the cost of a page.
>> > > > >
>> > > > > The motivation for this patch is that KASAN cannot use virtually-mapped kernel
>> > > > > stacks to detect stack overflows. An alternative would be implementing support
>> > > > > for vmapped stacks in KASAN, but would add significant extra complexity.
>> > > >
>> > > > Do we have an idea as to how much additional complexity?
>> > >
>> > > We would need to map/unmap shadow for vmalloc region on stack
>> > > allocation/deallocation. We may need to track shadow pages that cover
>> > > both stack and an unused memory, or 2 different stacks, which are
>> > > mapped/unmapped at different times. This may have some concurrency
>> > > concerns.  Not sure what about page tables for other CPU, I've seen
>> > > some code that updates pages tables for vmalloc region lazily on page
>> > > faults. Not sure what about TLBs. Probably also some problems that I
>> > > can't thought about now.
>> >
>> > Ok. So this looks big, we this hasn't been prototyped, so we don't have
>> > a concrete idea. I agree that concurrency is likely to be painful. :)
>
>> FTR, Daniel just mailed:
>> 
>> [PATCH 0/3] kasan: support backing vmalloc space with real shadow memory
>> https://groups.google.com/forum/#!topic/kasan-dev/YuwLGJYPB4I
>> Which presumably will supersede this.
>
> Neat!
>
> I'll try to follow that, (and thanks for the Cc there), but I'm not on
> any of the lists it went to. IMO it would be nice if subsequent versions
> would be Cc'd to LKML, if that's possible. :)

Will do - apologies for the oversight.

Regards,
Daniel

> Thanks,
> Mark.


* Re: [PATCH 1/2] kernel/fork: Add support for stack-end guard page
  2019-07-25 15:14           ` Daniel Axtens
@ 2019-07-25 16:08             ` Mark Rutland
  0 siblings, 0 replies; 12+ messages in thread
From: Mark Rutland @ 2019-07-25 16:08 UTC (permalink / raw)
  To: Daniel Axtens
  Cc: Dmitry Vyukov, Marco Elver, LKML, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, Andrey Ryabinin,
	Alexander Potapenko, Andrey Konovalov, Peter Zijlstra,
	the arch/x86 maintainers, kasan-dev

On Fri, Jul 26, 2019 at 01:14:26AM +1000, Daniel Axtens wrote:
> Mark Rutland <mark.rutland@arm.com> writes:
> > On Thu, Jul 25, 2019 at 09:53:08AM +0200, Dmitry Vyukov wrote:
> >> FTR, Daniel just mailed:
> >> 
> >> [PATCH 0/3] kasan: support backing vmalloc space with real shadow memory
> >> https://groups.google.com/forum/#!topic/kasan-dev/YuwLGJYPB4I
> >> Which presumably will supersede this.
> >
> > Neat!
> >
> > I'll try to follow that, (and thanks for the Cc there), but I'm not on
> > any of the lists it went to. IMO it would be nice if subsequent versions
> > would be Cc'd to LKML, if that's possible. :)
> 
> Will do - apologies for the oversight.

Nothing to apologize for, and thanks in advance!

Mark.


Thread overview: 12+ messages
2019-07-19 13:28 [PATCH 1/2] kernel/fork: Add support for stack-end guard page Marco Elver
2019-07-19 13:28 ` [PATCH 2/2] lib/test_kasan: Add stack overflow test Marco Elver
2019-07-23 16:24   ` Mark Rutland
2019-07-23 16:49     ` Marco Elver
2019-07-24 11:23       ` Mark Rutland
2019-07-23 16:41 ` [PATCH 1/2] kernel/fork: Add support for stack-end guard page Mark Rutland
2019-07-24  9:11   ` Dmitry Vyukov
2019-07-24 11:21     ` Mark Rutland
2019-07-25  7:53       ` Dmitry Vyukov
2019-07-25 10:14         ` Mark Rutland
2019-07-25 15:14           ` Daniel Axtens
2019-07-25 16:08             ` Mark Rutland
