All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] sched: Always check the integrity of the canary
@ 2014-09-04 14:50 ` Aaron Tomlin
  0 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-04 14:50 UTC (permalink / raw)
  To: peterz
  Cc: mingo, dzickus, bmr, jcastillo, atomlin, oleg, pzijlstr, riel,
	linux-kernel, tglx, x86, rostedt, hannes, aneesh.kumar, akpm,
	linuxppc-dev, minchan

Currently in the event of a stack overrun a call to schedule()
does not check for this type of corruption. This corruption is
often silent and can go unnoticed. However once the corrupted
region is examined at a later stage, the outcome is undefined
and often results in a sporadic page fault which cannot be
handled.

The first patch provides a helper to determine the integrity
of the canary. While the second patch checks for a stack
overrun and takes appropriate action since the damage is
already done, there is no point in continuing.

Aaron Tomlin (2):
  sched: Add helper for task stack page overrun checking
  sched: BUG when stack end location is over written

 arch/powerpc/mm/fault.c    | 6 ++----
 arch/x86/mm/fault.c        | 5 +----
 include/linux/sched.h      | 3 +++
 kernel/sched/core.c        | 3 +++
 kernel/trace/trace_stack.c | 5 ++---
 5 files changed, 11 insertions(+), 11 deletions(-)

-- 
1.9.3


^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH 0/2] sched: Always check the integrity of the canary
@ 2014-09-04 14:50 ` Aaron Tomlin
  0 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-04 14:50 UTC (permalink / raw)
  To: peterz
  Cc: dzickus, jcastillo, riel, minchan, bmr, x86, oleg, rostedt,
	linux-kernel, hannes, mingo, aneesh.kumar, atomlin, tglx,
	linuxppc-dev, akpm, pzijlstr

Currently in the event of a stack overrun a call to schedule()
does not check for this type of corruption. This corruption is
often silent and can go unnoticed. However once the corrupted
region is examined at a later stage, the outcome is undefined
and often results in a sporadic page fault which cannot be
handled.

The first patch provides a helper to determine the integrity
of the canary. While the second patch checks for a stack
overrun and takes appropriate action since the damage is
already done, there is no point in continuing.

Aaron Tomlin (2):
  sched: Add helper for task stack page overrun checking
  sched: BUG when stack end location is over written

 arch/powerpc/mm/fault.c    | 6 ++----
 arch/x86/mm/fault.c        | 5 +----
 include/linux/sched.h      | 3 +++
 kernel/sched/core.c        | 3 +++
 kernel/trace/trace_stack.c | 5 ++---
 5 files changed, 11 insertions(+), 11 deletions(-)

-- 
1.9.3

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH 1/2] sched: Add helper for task stack page overrun checking
  2014-09-04 14:50 ` Aaron Tomlin
@ 2014-09-04 14:50   ` Aaron Tomlin
  -1 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-04 14:50 UTC (permalink / raw)
  To: peterz
  Cc: mingo, dzickus, bmr, jcastillo, atomlin, oleg, pzijlstr, riel,
	linux-kernel, tglx, x86, rostedt, hannes, aneesh.kumar, akpm,
	linuxppc-dev, minchan

This facility is used in a few places so let's introduce
a helper function to improve readability.

Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
---
 arch/powerpc/mm/fault.c    | 6 ++----
 arch/x86/mm/fault.c        | 5 +----
 include/linux/sched.h      | 3 +++
 kernel/trace/trace_stack.c | 5 ++---
 4 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 51ab9e7..5cffc7c 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -30,7 +30,6 @@
 #include <linux/kprobes.h>
 #include <linux/kdebug.h>
 #include <linux/perf_event.h>
-#include <linux/magic.h>
 #include <linux/ratelimit.h>
 #include <linux/context_tracking.h>
 
@@ -508,7 +507,6 @@ bail:
 void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
 {
 	const struct exception_table_entry *entry;
-	unsigned long *stackend;
 
 	/* Are we prepared to handle this fault?  */
 	if ((entry = search_exception_tables(regs->nip)) != NULL) {
@@ -537,8 +535,8 @@ void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
 	printk(KERN_ALERT "Faulting instruction address: 0x%08lx\n",
 		regs->nip);
 
-	stackend = end_of_stack(current);
-	if (current != &init_task && *stackend != STACK_END_MAGIC)
+	if (current != &init_task &&
+		task_stack_end_corrupted(current))
 		printk(KERN_ALERT "Thread overran stack, or stack corrupted\n");
 
 	die("Kernel access of bad area", regs, sig);
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index a241946..b5b9c09 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -3,7 +3,6 @@
  *  Copyright (C) 2001, 2002 Andi Kleen, SuSE Labs.
  *  Copyright (C) 2008-2009, Red Hat Inc., Ingo Molnar
  */
-#include <linux/magic.h>		/* STACK_END_MAGIC		*/
 #include <linux/sched.h>		/* test_thread_flag(), ...	*/
 #include <linux/kdebug.h>		/* oops_begin/end, ...		*/
 #include <linux/module.h>		/* search_exception_table	*/
@@ -649,7 +648,6 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 	   unsigned long address, int signal, int si_code)
 {
 	struct task_struct *tsk = current;
-	unsigned long *stackend;
 	unsigned long flags;
 	int sig;
 
@@ -709,8 +707,7 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 
 	show_fault_oops(regs, error_code, address);
 
-	stackend = end_of_stack(tsk);
-	if (tsk != &init_task && *stackend != STACK_END_MAGIC)
+	if (tsk != &init_task && task_stack_end_corrupted(tsk))
 		printk(KERN_EMERG "Thread overran stack, or stack corrupted\n");
 
 	tsk->thread.cr2		= address;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 5c2c885..36d93aa 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -57,6 +57,7 @@ struct sched_param {
 #include <linux/llist.h>
 #include <linux/uidgid.h>
 #include <linux/gfp.h>
+#include <linux/magic.h>
 
 #include <asm/processor.h>
 
@@ -2614,6 +2615,8 @@ static inline unsigned long *end_of_stack(struct task_struct *p)
 }
 
 #endif
+#define task_stack_end_corrupted(task) \
+		(*(end_of_stack(task)) != STACK_END_MAGIC)
 
 static inline int object_is_on_stack(void *obj)
 {
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index 8a4e5cb..06c7390 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -13,7 +13,6 @@
 #include <linux/sysctl.h>
 #include <linux/init.h>
 #include <linux/fs.h>
-#include <linux/magic.h>
 
 #include <asm/setup.h>
 
@@ -171,8 +170,8 @@ check_stack(unsigned long ip, unsigned long *stack)
 			i++;
 	}
 
-	if ((current != &init_task &&
-		*(end_of_stack(current)) != STACK_END_MAGIC)) {
+	if (current != &init_task &&
+		task_stack_end_corrupted(current)) {
 		print_max_stack();
 		BUG();
 	}
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 1/2] sched: Add helper for task stack page overrun checking
@ 2014-09-04 14:50   ` Aaron Tomlin
  0 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-04 14:50 UTC (permalink / raw)
  To: peterz
  Cc: dzickus, jcastillo, riel, minchan, bmr, x86, oleg, rostedt,
	linux-kernel, hannes, mingo, aneesh.kumar, atomlin, tglx,
	linuxppc-dev, akpm, pzijlstr

This facility is used in a few places so let's introduce
a helper function to improve readability.

Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
---
 arch/powerpc/mm/fault.c    | 6 ++----
 arch/x86/mm/fault.c        | 5 +----
 include/linux/sched.h      | 3 +++
 kernel/trace/trace_stack.c | 5 ++---
 4 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 51ab9e7..5cffc7c 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -30,7 +30,6 @@
 #include <linux/kprobes.h>
 #include <linux/kdebug.h>
 #include <linux/perf_event.h>
-#include <linux/magic.h>
 #include <linux/ratelimit.h>
 #include <linux/context_tracking.h>
 
@@ -508,7 +507,6 @@ bail:
 void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
 {
 	const struct exception_table_entry *entry;
-	unsigned long *stackend;
 
 	/* Are we prepared to handle this fault?  */
 	if ((entry = search_exception_tables(regs->nip)) != NULL) {
@@ -537,8 +535,8 @@ void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
 	printk(KERN_ALERT "Faulting instruction address: 0x%08lx\n",
 		regs->nip);
 
-	stackend = end_of_stack(current);
-	if (current != &init_task && *stackend != STACK_END_MAGIC)
+	if (current != &init_task &&
+		task_stack_end_corrupted(current))
 		printk(KERN_ALERT "Thread overran stack, or stack corrupted\n");
 
 	die("Kernel access of bad area", regs, sig);
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index a241946..b5b9c09 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -3,7 +3,6 @@
  *  Copyright (C) 2001, 2002 Andi Kleen, SuSE Labs.
  *  Copyright (C) 2008-2009, Red Hat Inc., Ingo Molnar
  */
-#include <linux/magic.h>		/* STACK_END_MAGIC		*/
 #include <linux/sched.h>		/* test_thread_flag(), ...	*/
 #include <linux/kdebug.h>		/* oops_begin/end, ...		*/
 #include <linux/module.h>		/* search_exception_table	*/
@@ -649,7 +648,6 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 	   unsigned long address, int signal, int si_code)
 {
 	struct task_struct *tsk = current;
-	unsigned long *stackend;
 	unsigned long flags;
 	int sig;
 
@@ -709,8 +707,7 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 
 	show_fault_oops(regs, error_code, address);
 
-	stackend = end_of_stack(tsk);
-	if (tsk != &init_task && *stackend != STACK_END_MAGIC)
+	if (tsk != &init_task && task_stack_end_corrupted(tsk))
 		printk(KERN_EMERG "Thread overran stack, or stack corrupted\n");
 
 	tsk->thread.cr2		= address;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 5c2c885..36d93aa 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -57,6 +57,7 @@ struct sched_param {
 #include <linux/llist.h>
 #include <linux/uidgid.h>
 #include <linux/gfp.h>
+#include <linux/magic.h>
 
 #include <asm/processor.h>
 
@@ -2614,6 +2615,8 @@ static inline unsigned long *end_of_stack(struct task_struct *p)
 }
 
 #endif
+#define task_stack_end_corrupted(task) \
+		(*(end_of_stack(task)) != STACK_END_MAGIC)
 
 static inline int object_is_on_stack(void *obj)
 {
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index 8a4e5cb..06c7390 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -13,7 +13,6 @@
 #include <linux/sysctl.h>
 #include <linux/init.h>
 #include <linux/fs.h>
-#include <linux/magic.h>
 
 #include <asm/setup.h>
 
@@ -171,8 +170,8 @@ check_stack(unsigned long ip, unsigned long *stack)
 			i++;
 	}
 
-	if ((current != &init_task &&
-		*(end_of_stack(current)) != STACK_END_MAGIC)) {
+	if (current != &init_task &&
+		task_stack_end_corrupted(current)) {
 		print_max_stack();
 		BUG();
 	}
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 2/2] sched: BUG when stack end location is over written
  2014-09-04 14:50 ` Aaron Tomlin
@ 2014-09-04 14:50   ` Aaron Tomlin
  -1 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-04 14:50 UTC (permalink / raw)
  To: peterz
  Cc: mingo, dzickus, bmr, jcastillo, atomlin, oleg, pzijlstr, riel,
	linux-kernel, tglx, x86, rostedt, hannes, aneesh.kumar, akpm,
	linuxppc-dev, minchan

Currently in the event of a stack overrun a call to schedule()
does not check for this type of corruption. This corruption is
often silent and can go unnoticed. However once the corrupted
region is examined at a later stage, the outcome is undefined
and often results in a sporadic page fault which cannot be
handled.

This patch checks for a stack overrun and takes appropriate
action since the damage is already done, there is no point
in continuing.

Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
---
 kernel/sched/core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ec1a286..d6af6a0 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2660,6 +2660,9 @@ static noinline void __schedule_bug(struct task_struct *prev)
  */
 static inline void schedule_debug(struct task_struct *prev)
 {
+	if (unlikely(prev != &init_task &&
+		task_stack_end_corrupted(prev)))
+		BUG();
 	/*
 	 * Test if we are atomic. Since do_exit() needs to call into
 	 * schedule() atomically, we ignore that path. Otherwise whine
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 2/2] sched: BUG when stack end location is over written
@ 2014-09-04 14:50   ` Aaron Tomlin
  0 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-04 14:50 UTC (permalink / raw)
  To: peterz
  Cc: dzickus, jcastillo, riel, minchan, bmr, x86, oleg, rostedt,
	linux-kernel, hannes, mingo, aneesh.kumar, atomlin, tglx,
	linuxppc-dev, akpm, pzijlstr

Currently in the event of a stack overrun a call to schedule()
does not check for this type of corruption. This corruption is
often silent and can go unnoticed. However once the corrupted
region is examined at a later stage, the outcome is undefined
and often results in a sporadic page fault which cannot be
handled.

This patch checks for a stack overrun and takes appropriate
action since the damage is already done, there is no point
in continuing.

Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
---
 kernel/sched/core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ec1a286..d6af6a0 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2660,6 +2660,9 @@ static noinline void __schedule_bug(struct task_struct *prev)
  */
 static inline void schedule_debug(struct task_struct *prev)
 {
+	if (unlikely(prev != &init_task &&
+		task_stack_end_corrupted(prev)))
+		BUG();
 	/*
 	 * Test if we are atomic. Since do_exit() needs to call into
 	 * schedule() atomically, we ignore that path. Otherwise whine
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [PATCH 1/2] sched: Add helper for task stack page overrun checking
  2014-09-04 14:50   ` Aaron Tomlin
@ 2014-09-04 15:02     ` Oleg Nesterov
  -1 siblings, 0 replies; 74+ messages in thread
From: Oleg Nesterov @ 2014-09-04 15:02 UTC (permalink / raw)
  To: Aaron Tomlin
  Cc: peterz, mingo, dzickus, bmr, jcastillo, pzijlstr, riel,
	linux-kernel, tglx, x86, rostedt, hannes, aneesh.kumar, akpm,
	linuxppc-dev, minchan

On 09/04, Aaron Tomlin wrote:
>
> +#define task_stack_end_corrupted(task) \
> +		(*(end_of_stack(task)) != STACK_END_MAGIC)

and it is always used along with "tsk != init_task".

But why we exclude swapper/0? Can't we add

	end_of_stack(current) = STACK_END_MAGIC;

somewhere at the start of start_kernel() ?

If not, perhaps this new helper should check "task != &init_task"
itself so that we can simplify its users?

Oleg.

>  
>  static inline int object_is_on_stack(void *obj)
>  {
> diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
> index 8a4e5cb..06c7390 100644
> --- a/kernel/trace/trace_stack.c
> +++ b/kernel/trace/trace_stack.c
> @@ -13,7 +13,6 @@
>  #include <linux/sysctl.h>
>  #include <linux/init.h>
>  #include <linux/fs.h>
> -#include <linux/magic.h>
>  
>  #include <asm/setup.h>
>  
> @@ -171,8 +170,8 @@ check_stack(unsigned long ip, unsigned long *stack)
>  			i++;
>  	}
>  
> -	if ((current != &init_task &&
> -		*(end_of_stack(current)) != STACK_END_MAGIC)) {
> +	if (current != &init_task &&
> +		task_stack_end_corrupted(current)) {
>  		print_max_stack();
>  		BUG();
>  	}
> -- 
> 1.9.3
> 


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 1/2] sched: Add helper for task stack page overrun checking
@ 2014-09-04 15:02     ` Oleg Nesterov
  0 siblings, 0 replies; 74+ messages in thread
From: Oleg Nesterov @ 2014-09-04 15:02 UTC (permalink / raw)
  To: Aaron Tomlin
  Cc: dzickus, pzijlstr, riel, peterz, bmr, x86, linux-kernel, rostedt,
	minchan, mingo, aneesh.kumar, hannes, tglx, linuxppc-dev, akpm,
	jcastillo

On 09/04, Aaron Tomlin wrote:
>
> +#define task_stack_end_corrupted(task) \
> +		(*(end_of_stack(task)) != STACK_END_MAGIC)

and it is always used along with "tsk != init_task".

But why we exclude swapper/0? Can't we add

	end_of_stack(current) = STACK_END_MAGIC;

somewhere at the start of start_kernel() ?

If not, perhaps this new helper should check "task != &init_task"
itself so that we can simplify its users?

Oleg.

>  
>  static inline int object_is_on_stack(void *obj)
>  {
> diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
> index 8a4e5cb..06c7390 100644
> --- a/kernel/trace/trace_stack.c
> +++ b/kernel/trace/trace_stack.c
> @@ -13,7 +13,6 @@
>  #include <linux/sysctl.h>
>  #include <linux/init.h>
>  #include <linux/fs.h>
> -#include <linux/magic.h>
>  
>  #include <asm/setup.h>
>  
> @@ -171,8 +170,8 @@ check_stack(unsigned long ip, unsigned long *stack)
>  			i++;
>  	}
>  
> -	if ((current != &init_task &&
> -		*(end_of_stack(current)) != STACK_END_MAGIC)) {
> +	if (current != &init_task &&
> +		task_stack_end_corrupted(current)) {
>  		print_max_stack();
>  		BUG();
>  	}
> -- 
> 1.9.3
> 

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 1/2] sched: Add helper for task stack page overrun checking
  2014-09-04 14:50   ` Aaron Tomlin
@ 2014-09-04 15:30     ` Peter Zijlstra
  -1 siblings, 0 replies; 74+ messages in thread
From: Peter Zijlstra @ 2014-09-04 15:30 UTC (permalink / raw)
  To: Aaron Tomlin
  Cc: mingo, dzickus, bmr, jcastillo, oleg, pzijlstr, riel,
	linux-kernel, tglx, x86, rostedt, hannes, aneesh.kumar, akpm,
	linuxppc-dev, minchan

On Thu, Sep 04, 2014 at 03:50:23PM +0100, Aaron Tomlin wrote:

> @@ -537,8 +535,8 @@ void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
>  	printk(KERN_ALERT "Faulting instruction address: 0x%08lx\n",
>  		regs->nip);
>  
> -	stackend = end_of_stack(current);
> -	if (current != &init_task && *stackend != STACK_END_MAGIC)
> +	if (current != &init_task &&
> +		task_stack_end_corrupted(current))

superfluous linebreak.

> @@ -2614,6 +2615,8 @@ static inline unsigned long *end_of_stack(struct task_struct *p)
>  }
>  
>  #endif

Here otoh we could do with some extra whitespace

> @@ -171,8 +170,8 @@ check_stack(unsigned long ip, unsigned long *stack)
>  			i++;
>  	}
>  
> -	if ((current != &init_task &&
> -		*(end_of_stack(current)) != STACK_END_MAGIC)) {
> +	if (current != &init_task &&
> +		task_stack_end_corrupted(current)) {

Again, superfluous linebreak.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 1/2] sched: Add helper for task stack page overrun checking
@ 2014-09-04 15:30     ` Peter Zijlstra
  0 siblings, 0 replies; 74+ messages in thread
From: Peter Zijlstra @ 2014-09-04 15:30 UTC (permalink / raw)
  To: Aaron Tomlin
  Cc: dzickus, jcastillo, riel, bmr, x86, oleg, rostedt, linux-kernel,
	minchan, mingo, aneesh.kumar, hannes, tglx, linuxppc-dev, akpm,
	pzijlstr

On Thu, Sep 04, 2014 at 03:50:23PM +0100, Aaron Tomlin wrote:

> @@ -537,8 +535,8 @@ void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
>  	printk(KERN_ALERT "Faulting instruction address: 0x%08lx\n",
>  		regs->nip);
>  
> -	stackend = end_of_stack(current);
> -	if (current != &init_task && *stackend != STACK_END_MAGIC)
> +	if (current != &init_task &&
> +		task_stack_end_corrupted(current))

superfluous linebreak.

> @@ -2614,6 +2615,8 @@ static inline unsigned long *end_of_stack(struct task_struct *p)
>  }
>  
>  #endif

Here otoh we could do with some extra whitespace

> @@ -171,8 +170,8 @@ check_stack(unsigned long ip, unsigned long *stack)
>  			i++;
>  	}
>  
> -	if ((current != &init_task &&
> -		*(end_of_stack(current)) != STACK_END_MAGIC)) {
> +	if (current != &init_task &&
> +		task_stack_end_corrupted(current)) {

Again, superfluous linebreak.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/2] sched: BUG when stack end location is over written
  2014-09-04 14:50   ` Aaron Tomlin
@ 2014-09-04 15:32     ` Peter Zijlstra
  -1 siblings, 0 replies; 74+ messages in thread
From: Peter Zijlstra @ 2014-09-04 15:32 UTC (permalink / raw)
  To: Aaron Tomlin
  Cc: mingo, dzickus, bmr, jcastillo, oleg, pzijlstr, riel,
	linux-kernel, tglx, x86, rostedt, hannes, aneesh.kumar, akpm,
	linuxppc-dev, minchan

On Thu, Sep 04, 2014 at 03:50:24PM +0100, Aaron Tomlin wrote:
> Currently in the event of a stack overrun a call to schedule()
> does not check for this type of corruption. This corruption is
> often silent and can go unnoticed. However once the corrupted
> region is examined at a later stage, the outcome is undefined
> and often results in a sporadic page fault which cannot be
> handled.
> 
> This patch checks for a stack overrun and takes appropriate
> action since the damage is already done, there is no point
> in continuing.
> 
> Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
> ---
>  kernel/sched/core.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index ec1a286..d6af6a0 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2660,6 +2660,9 @@ static noinline void __schedule_bug(struct task_struct *prev)
>   */
>  static inline void schedule_debug(struct task_struct *prev)
>  {
> +	if (unlikely(prev != &init_task &&
> +		task_stack_end_corrupted(prev)))
> +		BUG();

superfluous linebreak, also we appear to have BUG_ON() for situations
just like these.

secondly, while I appreciate the 'feature' you're making schedule()
slower for everybody, what do you propose to do about that?

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/2] sched: BUG when stack end location is over written
@ 2014-09-04 15:32     ` Peter Zijlstra
  0 siblings, 0 replies; 74+ messages in thread
From: Peter Zijlstra @ 2014-09-04 15:32 UTC (permalink / raw)
  To: Aaron Tomlin
  Cc: dzickus, jcastillo, riel, bmr, x86, oleg, rostedt, linux-kernel,
	minchan, mingo, aneesh.kumar, hannes, tglx, linuxppc-dev, akpm,
	pzijlstr

On Thu, Sep 04, 2014 at 03:50:24PM +0100, Aaron Tomlin wrote:
> Currently in the event of a stack overrun a call to schedule()
> does not check for this type of corruption. This corruption is
> often silent and can go unnoticed. However once the corrupted
> region is examined at a later stage, the outcome is undefined
> and often results in a sporadic page fault which cannot be
> handled.
> 
> This patch checks for a stack overrun and takes appropriate
> action since the damage is already done, there is no point
> in continuing.
> 
> Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
> ---
>  kernel/sched/core.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index ec1a286..d6af6a0 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2660,6 +2660,9 @@ static noinline void __schedule_bug(struct task_struct *prev)
>   */
>  static inline void schedule_debug(struct task_struct *prev)
>  {
> +	if (unlikely(prev != &init_task &&
> +		task_stack_end_corrupted(prev)))
> +		BUG();

superfluous linebreak, also we appear to have BUG_ON() for situations
just like these.

secondly, while I appreciate the 'feature' you're making schedule()
slower for everybody, what do you propose to do about that?

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 1/2] sched: Add helper for task stack page overrun checking
  2014-09-04 15:02     ` Oleg Nesterov
@ 2014-09-04 15:52       ` Aaron Tomlin
  -1 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-04 15:52 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: peterz, mingo, dzickus, bmr, jcastillo, pzijlstr, riel,
	linux-kernel, tglx, x86, rostedt, hannes, aneesh.kumar, akpm,
	linuxppc-dev, minchan

On Thu, Sep 04, 2014 at 05:02:34PM +0200, Oleg Nesterov wrote:
> On 09/04, Aaron Tomlin wrote:
> >
> > +#define task_stack_end_corrupted(task) \
> > +		(*(end_of_stack(task)) != STACK_END_MAGIC)
> 
> and it is always used along with "tsk != init_task".
> 
> But why we exclude swapper/0? Can't we add
> 
> 	end_of_stack(current) = STACK_END_MAGIC;
> 
> somewhere at the start of start_kernel() ?

Good point. I can look into it.

> If not, perhaps this new helper should check "task != &init_task"
> itself so that we can simplify its users?
> 
> Oleg.
> 
> >  
> >  static inline int object_is_on_stack(void *obj)
> >  {
> > diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
> > index 8a4e5cb..06c7390 100644
> > --- a/kernel/trace/trace_stack.c
> > +++ b/kernel/trace/trace_stack.c
> > @@ -13,7 +13,6 @@
> >  #include <linux/sysctl.h>
> >  #include <linux/init.h>
> >  #include <linux/fs.h>
> > -#include <linux/magic.h>
> >  
> >  #include <asm/setup.h>
> >  
> > @@ -171,8 +170,8 @@ check_stack(unsigned long ip, unsigned long *stack)
> >  			i++;
> >  	}
> >  
> > -	if ((current != &init_task &&
> > -		*(end_of_stack(current)) != STACK_END_MAGIC)) {
> > +	if (current != &init_task &&
> > +		task_stack_end_corrupted(current)) {
> >  		print_max_stack();
> >  		BUG();
> >  	}
> > -- 
> > 1.9.3
> > 
> 

-- 
Aaron Tomlin

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 1/2] sched: Add helper for task stack page overrun checking
@ 2014-09-04 15:52       ` Aaron Tomlin
  0 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-04 15:52 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: dzickus, pzijlstr, riel, peterz, bmr, x86, linux-kernel, rostedt,
	minchan, mingo, aneesh.kumar, hannes, tglx, linuxppc-dev, akpm,
	jcastillo

On Thu, Sep 04, 2014 at 05:02:34PM +0200, Oleg Nesterov wrote:
> On 09/04, Aaron Tomlin wrote:
> >
> > +#define task_stack_end_corrupted(task) \
> > +		(*(end_of_stack(task)) != STACK_END_MAGIC)
> 
> and it is always used along with "tsk != init_task".
> 
> But why we exclude swapper/0? Can't we add
> 
> 	end_of_stack(current) = STACK_END_MAGIC;
> 
> somewhere at the start of start_kernel() ?

Good point. I can look into it.

> If not, perhaps this new helper should check "task != &init_task"
> itself so that we can simplify its users?
> 
> Oleg.
> 
> >  
> >  static inline int object_is_on_stack(void *obj)
> >  {
> > diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
> > index 8a4e5cb..06c7390 100644
> > --- a/kernel/trace/trace_stack.c
> > +++ b/kernel/trace/trace_stack.c
> > @@ -13,7 +13,6 @@
> >  #include <linux/sysctl.h>
> >  #include <linux/init.h>
> >  #include <linux/fs.h>
> > -#include <linux/magic.h>
> >  
> >  #include <asm/setup.h>
> >  
> > @@ -171,8 +170,8 @@ check_stack(unsigned long ip, unsigned long *stack)
> >  			i++;
> >  	}
> >  
> > -	if ((current != &init_task &&
> > -		*(end_of_stack(current)) != STACK_END_MAGIC)) {
> > +	if (current != &init_task &&
> > +		task_stack_end_corrupted(current)) {
> >  		print_max_stack();
> >  		BUG();
> >  	}
> > -- 
> > 1.9.3
> > 
> 

-- 
Aaron Tomlin

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/2] sched: BUG when stack end location is over written
  2014-09-04 15:32     ` Peter Zijlstra
@ 2014-09-04 16:11       ` Aaron Tomlin
  -1 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-04 16:11 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: mingo, dzickus, bmr, jcastillo, oleg, pzijlstr, riel,
	linux-kernel, tglx, x86, rostedt, hannes, aneesh.kumar, akpm,
	linuxppc-dev, minchan

On Thu, Sep 04, 2014 at 05:32:31PM +0200, Peter Zijlstra wrote:
> On Thu, Sep 04, 2014 at 03:50:24PM +0100, Aaron Tomlin wrote:
> > Currently in the event of a stack overrun a call to schedule()
> > does not check for this type of corruption. This corruption is
> > often silent and can go unnoticed. However once the corrupted
> > region is examined at a later stage, the outcome is undefined
> > and often results in a sporadic page fault which cannot be
> > handled.
> > 
> > This patch checks for a stack overrun and takes appropriate
> > action since the damage is already done, there is no point
> > in continuing.
> > 
> > Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
> > ---
> >  kernel/sched/core.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index ec1a286..d6af6a0 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -2660,6 +2660,9 @@ static noinline void __schedule_bug(struct task_struct *prev)
> >   */
> >  static inline void schedule_debug(struct task_struct *prev)
> >  {
> > +	if (unlikely(prev != &init_task &&
> > +		task_stack_end_corrupted(prev)))
> > +		BUG();
> 
> superfluous linebreak, also we appear to have BUG_ON() for situations
> just like these.
> 
> secondly, while I appreciate the 'feature' you're making schedule()
> slower for everybody, what do you propose to do about that?

Understood. I will wrap this with a suitable Kconfig option.

-- 
Aaron Tomlin

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH 2/2] sched: BUG when stack end location is over written
@ 2014-09-04 16:11       ` Aaron Tomlin
  0 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-04 16:11 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: dzickus, jcastillo, riel, bmr, x86, oleg, rostedt, linux-kernel,
	minchan, mingo, aneesh.kumar, akpm, hannes, tglx, linuxppc-dev,
	pzijlstr

On Thu, Sep 04, 2014 at 05:32:31PM +0200, Peter Zijlstra wrote:
> On Thu, Sep 04, 2014 at 03:50:24PM +0100, Aaron Tomlin wrote:
> > Currently in the event of a stack overrun a call to schedule()
> > does not check for this type of corruption. This corruption is
> > often silent and can go unnoticed. However once the corrupted
> > region is examined at a later stage, the outcome is undefined
> > and often results in a sporadic page fault which cannot be
> > handled.
> > 
> > This patch checks for a stack overrun and takes appropriate
> > action since the damage is already done, there is no point
> > in continuing.
> > 
> > Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
> > ---
> >  kernel/sched/core.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index ec1a286..d6af6a0 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -2660,6 +2660,9 @@ static noinline void __schedule_bug(struct task_struct *prev)
> >   */
> >  static inline void schedule_debug(struct task_struct *prev)
> >  {
> > +	if (unlikely(prev != &init_task &&
> > +		task_stack_end_corrupted(prev)))
> > +		BUG();
> 
> superfluous linebreak, also we appear to have BUG_ON() for situations
> just like these.
> 
> secondly, while I appreciate the 'feature' you're making schedule()
> slower for everybody, what do you propose to do about that?

Understood. I will wrap this with a suitable Kconfig option.

-- 
Aaron Tomlin

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v2 0/3] sched: Always check the integrity of the canary
  2014-09-04 16:11       ` Aaron Tomlin
@ 2014-09-08 19:23         ` Aaron Tomlin
  -1 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-08 19:23 UTC (permalink / raw)
  To: peterz
  Cc: mingo, dzickus, bmr, jcastillo, atomlin, oleg, pzijlstr, riel,
	linux-kernel, tglx, x86, rostedt, hannes, aneesh.kumar, akpm,
	akpm, linuxppc-dev, minchan

Currently in the event of a stack overrun a call to schedule()
does not check for this type of corruption. This corruption is
often silent and can go unnoticed. However once the corrupted
region is examined at a later stage, the outcome is undefined
and often results in a sporadic page fault which cannot be
handled.

The first patch adds a canary to init_task's end of stack.
While the second patch provides a helper to determine the
integrity of the canary. The third checks for a stack
overrun and takes appropriate action since the damage
is already done, there is no point in continuing.

Changes since v1:

 * Rebased against v3.17-rc4
 * Add a canary to init_task - Oleg Nesterov
 * Fix various code formatting issues - Peter Zijlstra
 * Introduce Kconfig option - Peter Zijlstra

Aaron Tomlin (3):
  init/main.c: Give init_task a canary
  sched: Add helper for task stack page overrun checking
  sched: BUG when stack end location is over written

 arch/powerpc/mm/fault.c    |  5 +----
 arch/x86/mm/fault.c        |  5 +----
 include/linux/sched.h      |  4 ++++
 init/main.c                |  1 +
 kernel/fork.c              | 12 +++++++++---
 kernel/sched/core.c        |  4 ++++
 kernel/trace/trace_stack.c |  4 +---
 lib/Kconfig.debug          | 12 ++++++++++++
 8 files changed, 33 insertions(+), 14 deletions(-)

-- 
1.9.3


^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v2 0/3] sched: Always check the integrity of the canary
@ 2014-09-08 19:23         ` Aaron Tomlin
  0 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-08 19:23 UTC (permalink / raw)
  To: peterz
  Cc: dzickus, jcastillo, riel, minchan, bmr, x86, oleg, rostedt,
	linux-kernel, hannes, mingo, aneesh.kumar, akpm, atomlin, tglx,
	linuxppc-dev, akpm, pzijlstr

Currently in the event of a stack overrun a call to schedule()
does not check for this type of corruption. This corruption is
often silent and can go unnoticed. However once the corrupted
region is examined at a later stage, the outcome is undefined
and often results in a sporadic page fault which cannot be
handled.

The first patch adds a canary to init_task's end of stack.
While the second patch provides a helper to determine the
integrity of the canary. The third checks for a stack
overrun and takes appropriate action since the damage
is already done, there is no point in continuing.

Changes since v1:

 * Rebased against v3.17-rc4
 * Add a canary to init_task - Oleg Nesterov
 * Fix various code formatting issues - Peter Zijlstra
 * Introduce Kconfig option - Peter Zijlstra

Aaron Tomlin (3):
  init/main.c: Give init_task a canary
  sched: Add helper for task stack page overrun checking
  sched: BUG when stack end location is over written

 arch/powerpc/mm/fault.c    |  5 +----
 arch/x86/mm/fault.c        |  5 +----
 include/linux/sched.h      |  4 ++++
 init/main.c                |  1 +
 kernel/fork.c              | 12 +++++++++---
 kernel/sched/core.c        |  4 ++++
 kernel/trace/trace_stack.c |  4 +---
 lib/Kconfig.debug          | 12 ++++++++++++
 8 files changed, 33 insertions(+), 14 deletions(-)

-- 
1.9.3

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH 1/3] init/main.c: Give init_task a canary
  2014-09-08 19:23         ` Aaron Tomlin
@ 2014-09-08 19:23           ` Aaron Tomlin
  -1 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-08 19:23 UTC (permalink / raw)
  To: peterz
  Cc: mingo, dzickus, bmr, jcastillo, atomlin, oleg, pzijlstr, riel,
	linux-kernel, tglx, x86, rostedt, hannes, aneesh.kumar, akpm,
	akpm, linuxppc-dev, minchan

Tasks get their end of stack set to STACK_END_MAGIC with the
aim to catch stack overruns. Currently this feature does not
apply to init_task. This patch removes this restriction.

Note that a similar patch was posted by Prarit Bhargava [1]
some time ago but was never merged.

[1]: http://marc.info/?l=linux-kernel&m=127144305403241&w=2

Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
---
 arch/powerpc/mm/fault.c    |  3 +--
 arch/x86/mm/fault.c        |  3 +--
 include/linux/sched.h      |  2 ++
 init/main.c                |  1 +
 kernel/fork.c              | 12 +++++++++---
 kernel/trace/trace_stack.c |  4 +---
 6 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 51ab9e7..35d0760c 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -30,7 +30,6 @@
 #include <linux/kprobes.h>
 #include <linux/kdebug.h>
 #include <linux/perf_event.h>
-#include <linux/magic.h>
 #include <linux/ratelimit.h>
 #include <linux/context_tracking.h>
 
@@ -538,7 +537,7 @@ void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
 		regs->nip);
 
 	stackend = end_of_stack(current);
-	if (current != &init_task && *stackend != STACK_END_MAGIC)
+	if (*stackend != STACK_END_MAGIC)
 		printk(KERN_ALERT "Thread overran stack, or stack corrupted\n");
 
 	die("Kernel access of bad area", regs, sig);
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index a241946..bc23a70 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -3,7 +3,6 @@
  *  Copyright (C) 2001, 2002 Andi Kleen, SuSE Labs.
  *  Copyright (C) 2008-2009, Red Hat Inc., Ingo Molnar
  */
-#include <linux/magic.h>		/* STACK_END_MAGIC		*/
 #include <linux/sched.h>		/* test_thread_flag(), ...	*/
 #include <linux/kdebug.h>		/* oops_begin/end, ...		*/
 #include <linux/module.h>		/* search_exception_table	*/
@@ -710,7 +709,7 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 	show_fault_oops(regs, error_code, address);
 
 	stackend = end_of_stack(tsk);
-	if (tsk != &init_task && *stackend != STACK_END_MAGIC)
+	if (*stackend != STACK_END_MAGIC)
 		printk(KERN_EMERG "Thread overran stack, or stack corrupted\n");
 
 	tsk->thread.cr2		= address;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 5c2c885..4dee9d7 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -57,6 +57,7 @@ struct sched_param {
 #include <linux/llist.h>
 #include <linux/uidgid.h>
 #include <linux/gfp.h>
+#include <linux/magic.h>
 
 #include <asm/processor.h>
 
@@ -2636,6 +2637,7 @@ static inline unsigned long stack_not_used(struct task_struct *p)
 	return (unsigned long)n - (unsigned long)end_of_stack(p);
 }
 #endif
+extern void task_stack_end_magic(struct task_struct *tsk);
 
 /* set thread flags in other task's structures
  * - see asm/thread_info.h for TIF_xxxx flags available
diff --git a/init/main.c b/init/main.c
index bb1aed9..bcdabf3a 100644
--- a/init/main.c
+++ b/init/main.c
@@ -508,6 +508,7 @@ asmlinkage __visible void __init start_kernel(void)
 	 * lockdep hash:
 	 */
 	lockdep_init();
+	task_stack_end_magic(&init_task);
 	smp_setup_processor_id();
 	debug_objects_early_init();
 
diff --git a/kernel/fork.c b/kernel/fork.c
index 0cf9cdb..67cb1e3 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -294,11 +294,18 @@ int __weak arch_dup_task_struct(struct task_struct *dst,
 	return 0;
 }
 
+void task_stack_end_magic(struct task_struct *tsk)
+{
+	unsigned long *stackend;
+
+	stackend = end_of_stack(tsk);
+	*stackend = STACK_END_MAGIC;	/* for overflow detection */
+}
+
 static struct task_struct *dup_task_struct(struct task_struct *orig)
 {
 	struct task_struct *tsk;
 	struct thread_info *ti;
-	unsigned long *stackend;
 	int node = tsk_fork_get_node(orig);
 	int err;
 
@@ -328,8 +335,7 @@ static struct task_struct *dup_task_struct(struct task_struct *orig)
 	setup_thread_stack(tsk, orig);
 	clear_user_return_notifier(tsk);
 	clear_tsk_need_resched(tsk);
-	stackend = end_of_stack(tsk);
-	*stackend = STACK_END_MAGIC;	/* for overflow detection */
+	task_stack_end_magic(tsk);
 
 #ifdef CONFIG_CC_STACKPROTECTOR
 	tsk->stack_canary = get_random_int();
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index 8a4e5cb..1636e41 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -13,7 +13,6 @@
 #include <linux/sysctl.h>
 #include <linux/init.h>
 #include <linux/fs.h>
-#include <linux/magic.h>
 
 #include <asm/setup.h>
 
@@ -171,8 +170,7 @@ check_stack(unsigned long ip, unsigned long *stack)
 			i++;
 	}
 
-	if ((current != &init_task &&
-		*(end_of_stack(current)) != STACK_END_MAGIC)) {
+	if (*end_of_stack(current) != STACK_END_MAGIC) {
 		print_max_stack();
 		BUG();
 	}
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 1/3] init/main.c: Give init_task a canary
@ 2014-09-08 19:23           ` Aaron Tomlin
  0 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-08 19:23 UTC (permalink / raw)
  To: peterz
  Cc: dzickus, jcastillo, riel, minchan, bmr, x86, oleg, rostedt,
	linux-kernel, hannes, mingo, aneesh.kumar, akpm, atomlin, tglx,
	linuxppc-dev, akpm, pzijlstr

Tasks get their end of stack set to STACK_END_MAGIC with the
aim to catch stack overruns. Currently this feature does not
apply to init_task. This patch removes this restriction.

Note that a similar patch was posted by Prarit Bhargava [1]
some time ago but was never merged.

[1]: http://marc.info/?l=linux-kernel&m=127144305403241&w=2

Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
---
 arch/powerpc/mm/fault.c    |  3 +--
 arch/x86/mm/fault.c        |  3 +--
 include/linux/sched.h      |  2 ++
 init/main.c                |  1 +
 kernel/fork.c              | 12 +++++++++---
 kernel/trace/trace_stack.c |  4 +---
 6 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 51ab9e7..35d0760c 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -30,7 +30,6 @@
 #include <linux/kprobes.h>
 #include <linux/kdebug.h>
 #include <linux/perf_event.h>
-#include <linux/magic.h>
 #include <linux/ratelimit.h>
 #include <linux/context_tracking.h>
 
@@ -538,7 +537,7 @@ void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
 		regs->nip);
 
 	stackend = end_of_stack(current);
-	if (current != &init_task && *stackend != STACK_END_MAGIC)
+	if (*stackend != STACK_END_MAGIC)
 		printk(KERN_ALERT "Thread overran stack, or stack corrupted\n");
 
 	die("Kernel access of bad area", regs, sig);
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index a241946..bc23a70 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -3,7 +3,6 @@
  *  Copyright (C) 2001, 2002 Andi Kleen, SuSE Labs.
  *  Copyright (C) 2008-2009, Red Hat Inc., Ingo Molnar
  */
-#include <linux/magic.h>		/* STACK_END_MAGIC		*/
 #include <linux/sched.h>		/* test_thread_flag(), ...	*/
 #include <linux/kdebug.h>		/* oops_begin/end, ...		*/
 #include <linux/module.h>		/* search_exception_table	*/
@@ -710,7 +709,7 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 	show_fault_oops(regs, error_code, address);
 
 	stackend = end_of_stack(tsk);
-	if (tsk != &init_task && *stackend != STACK_END_MAGIC)
+	if (*stackend != STACK_END_MAGIC)
 		printk(KERN_EMERG "Thread overran stack, or stack corrupted\n");
 
 	tsk->thread.cr2		= address;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 5c2c885..4dee9d7 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -57,6 +57,7 @@ struct sched_param {
 #include <linux/llist.h>
 #include <linux/uidgid.h>
 #include <linux/gfp.h>
+#include <linux/magic.h>
 
 #include <asm/processor.h>
 
@@ -2636,6 +2637,7 @@ static inline unsigned long stack_not_used(struct task_struct *p)
 	return (unsigned long)n - (unsigned long)end_of_stack(p);
 }
 #endif
+extern void task_stack_end_magic(struct task_struct *tsk);
 
 /* set thread flags in other task's structures
  * - see asm/thread_info.h for TIF_xxxx flags available
diff --git a/init/main.c b/init/main.c
index bb1aed9..bcdabf3a 100644
--- a/init/main.c
+++ b/init/main.c
@@ -508,6 +508,7 @@ asmlinkage __visible void __init start_kernel(void)
 	 * lockdep hash:
 	 */
 	lockdep_init();
+	task_stack_end_magic(&init_task);
 	smp_setup_processor_id();
 	debug_objects_early_init();
 
diff --git a/kernel/fork.c b/kernel/fork.c
index 0cf9cdb..67cb1e3 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -294,11 +294,18 @@ int __weak arch_dup_task_struct(struct task_struct *dst,
 	return 0;
 }
 
+void task_stack_end_magic(struct task_struct *tsk)
+{
+	unsigned long *stackend;
+
+	stackend = end_of_stack(tsk);
+	*stackend = STACK_END_MAGIC;	/* for overflow detection */
+}
+
 static struct task_struct *dup_task_struct(struct task_struct *orig)
 {
 	struct task_struct *tsk;
 	struct thread_info *ti;
-	unsigned long *stackend;
 	int node = tsk_fork_get_node(orig);
 	int err;
 
@@ -328,8 +335,7 @@ static struct task_struct *dup_task_struct(struct task_struct *orig)
 	setup_thread_stack(tsk, orig);
 	clear_user_return_notifier(tsk);
 	clear_tsk_need_resched(tsk);
-	stackend = end_of_stack(tsk);
-	*stackend = STACK_END_MAGIC;	/* for overflow detection */
+	task_stack_end_magic(tsk);
 
 #ifdef CONFIG_CC_STACKPROTECTOR
 	tsk->stack_canary = get_random_int();
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index 8a4e5cb..1636e41 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -13,7 +13,6 @@
 #include <linux/sysctl.h>
 #include <linux/init.h>
 #include <linux/fs.h>
-#include <linux/magic.h>
 
 #include <asm/setup.h>
 
@@ -171,8 +170,7 @@ check_stack(unsigned long ip, unsigned long *stack)
 			i++;
 	}
 
-	if ((current != &init_task &&
-		*(end_of_stack(current)) != STACK_END_MAGIC)) {
+	if (*end_of_stack(current) != STACK_END_MAGIC) {
 		print_max_stack();
 		BUG();
 	}
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 2/3] sched: Add helper for task stack page overrun checking
  2014-09-08 19:23         ` Aaron Tomlin
@ 2014-09-08 19:23           ` Aaron Tomlin
  -1 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-08 19:23 UTC (permalink / raw)
  To: peterz
  Cc: mingo, dzickus, bmr, jcastillo, atomlin, oleg, pzijlstr, riel,
	linux-kernel, tglx, x86, rostedt, hannes, aneesh.kumar, akpm,
	akpm, linuxppc-dev, minchan

This facility is used in a few places so let's introduce
a helper function to improve code readability.

Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
---
 arch/powerpc/mm/fault.c    | 4 +---
 arch/x86/mm/fault.c        | 4 +---
 include/linux/sched.h      | 2 ++
 kernel/trace/trace_stack.c | 2 +-
 4 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 35d0760c..99b2f27 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -507,7 +507,6 @@ bail:
 void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
 {
 	const struct exception_table_entry *entry;
-	unsigned long *stackend;
 
 	/* Are we prepared to handle this fault?  */
 	if ((entry = search_exception_tables(regs->nip)) != NULL) {
@@ -536,8 +535,7 @@ void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
 	printk(KERN_ALERT "Faulting instruction address: 0x%08lx\n",
 		regs->nip);
 
-	stackend = end_of_stack(current);
-	if (*stackend != STACK_END_MAGIC)
+	if (task_stack_end_corrupted(current))
 		printk(KERN_ALERT "Thread overran stack, or stack corrupted\n");
 
 	die("Kernel access of bad area", regs, sig);
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index bc23a70..6240bc7 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -648,7 +648,6 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 	   unsigned long address, int signal, int si_code)
 {
 	struct task_struct *tsk = current;
-	unsigned long *stackend;
 	unsigned long flags;
 	int sig;
 
@@ -708,8 +707,7 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 
 	show_fault_oops(regs, error_code, address);
 
-	stackend = end_of_stack(tsk);
-	if (*stackend != STACK_END_MAGIC)
+	if (task_stack_end_corrupted(tsk))
 		printk(KERN_EMERG "Thread overran stack, or stack corrupted\n");
 
 	tsk->thread.cr2		= address;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 4dee9d7..6e07cb9 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2615,6 +2615,8 @@ static inline unsigned long *end_of_stack(struct task_struct *p)
 }
 
 #endif
+#define task_stack_end_corrupted(task) \
+		(*(end_of_stack(task)) != STACK_END_MAGIC)
 
 static inline int object_is_on_stack(void *obj)
 {
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index 1636e41..16eddb3 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -170,7 +170,7 @@ check_stack(unsigned long ip, unsigned long *stack)
 			i++;
 	}
 
-	if (*end_of_stack(current) != STACK_END_MAGIC) {
+	if (task_stack_end_corrupted(current)) {
 		print_max_stack();
 		BUG();
 	}
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 2/3] sched: Add helper for task stack page overrun checking
@ 2014-09-08 19:23           ` Aaron Tomlin
  0 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-08 19:23 UTC (permalink / raw)
  To: peterz
  Cc: dzickus, jcastillo, riel, minchan, bmr, x86, oleg, rostedt,
	linux-kernel, hannes, mingo, aneesh.kumar, akpm, atomlin, tglx,
	linuxppc-dev, akpm, pzijlstr

This facility is used in a few places so let's introduce
a helper function to improve code readability.

Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
---
 arch/powerpc/mm/fault.c    | 4 +---
 arch/x86/mm/fault.c        | 4 +---
 include/linux/sched.h      | 2 ++
 kernel/trace/trace_stack.c | 2 +-
 4 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 35d0760c..99b2f27 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -507,7 +507,6 @@ bail:
 void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
 {
 	const struct exception_table_entry *entry;
-	unsigned long *stackend;
 
 	/* Are we prepared to handle this fault?  */
 	if ((entry = search_exception_tables(regs->nip)) != NULL) {
@@ -536,8 +535,7 @@ void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
 	printk(KERN_ALERT "Faulting instruction address: 0x%08lx\n",
 		regs->nip);
 
-	stackend = end_of_stack(current);
-	if (*stackend != STACK_END_MAGIC)
+	if (task_stack_end_corrupted(current))
 		printk(KERN_ALERT "Thread overran stack, or stack corrupted\n");
 
 	die("Kernel access of bad area", regs, sig);
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index bc23a70..6240bc7 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -648,7 +648,6 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 	   unsigned long address, int signal, int si_code)
 {
 	struct task_struct *tsk = current;
-	unsigned long *stackend;
 	unsigned long flags;
 	int sig;
 
@@ -708,8 +707,7 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 
 	show_fault_oops(regs, error_code, address);
 
-	stackend = end_of_stack(tsk);
-	if (*stackend != STACK_END_MAGIC)
+	if (task_stack_end_corrupted(tsk))
 		printk(KERN_EMERG "Thread overran stack, or stack corrupted\n");
 
 	tsk->thread.cr2		= address;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 4dee9d7..6e07cb9 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2615,6 +2615,8 @@ static inline unsigned long *end_of_stack(struct task_struct *p)
 }
 
 #endif
+#define task_stack_end_corrupted(task) \
+		(*(end_of_stack(task)) != STACK_END_MAGIC)
 
 static inline int object_is_on_stack(void *obj)
 {
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index 1636e41..16eddb3 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -170,7 +170,7 @@ check_stack(unsigned long ip, unsigned long *stack)
 			i++;
 	}
 
-	if (*end_of_stack(current) != STACK_END_MAGIC) {
+	if (task_stack_end_corrupted(current)) {
 		print_max_stack();
 		BUG();
 	}
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 3/3] sched: BUG when stack end location is over written
  2014-09-08 19:23         ` Aaron Tomlin
@ 2014-09-08 19:23           ` Aaron Tomlin
  -1 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-08 19:23 UTC (permalink / raw)
  To: peterz
  Cc: mingo, dzickus, bmr, jcastillo, atomlin, oleg, pzijlstr, riel,
	linux-kernel, tglx, x86, rostedt, hannes, aneesh.kumar, akpm,
	akpm, linuxppc-dev, minchan

Currently in the event of a stack overrun a call to schedule()
does not check for this type of corruption. This corruption is
often silent and can go unnoticed. However once the corrupted
region is examined at a later stage, the outcome is undefined
and often results in a sporadic page fault which cannot be
handled.

This patch checks for a stack overrun and takes appropriate
action since the damage is already done, there is no point
in continuing.

Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
---
 kernel/sched/core.c |  4 ++++
 lib/Kconfig.debug   | 12 ++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ec1a286..182b2d1 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2660,6 +2660,10 @@ static noinline void __schedule_bug(struct task_struct *prev)
  */
 static inline void schedule_debug(struct task_struct *prev)
 {
+#ifdef CONFIG_SCHED_STACK_END_CHECK
+	if (unlikely(task_stack_end_corrupted(prev)))
+		BUG();
+#endif
 	/*
 	 * Test if we are atomic. Since do_exit() needs to call into
 	 * schedule() atomically, we ignore that path. Otherwise whine
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index a285900..2a8280a 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -824,6 +824,18 @@ config SCHEDSTATS
 	  application, you can say N to avoid the very slight overhead
 	  this adds.
 
+config SCHED_STACK_END_CHECK
+	bool "Detect stack corruption on calls to schedule()"
+	depends on DEBUG_KERNEL
+	default y
+	help
+	  This option checks for a stack overrun on calls to schedule().
+	  If the stack end location is found to be over written always panic as
+	  the content of the corrupted region can no longer be trusted.
+	  This is to ensure no erroneous behaviour occurs which could result in
+	  data corruption or a sporadic crash at a later stage once the region
+	  is examined. The runtime overhead introduced is minimal.
+
 config TIMER_STATS
 	bool "Collect kernel timers statistics"
 	depends on DEBUG_KERNEL && PROC_FS
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH 3/3] sched: BUG when stack end location is over written
@ 2014-09-08 19:23           ` Aaron Tomlin
  0 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-08 19:23 UTC (permalink / raw)
  To: peterz
  Cc: dzickus, jcastillo, riel, minchan, bmr, x86, oleg, rostedt,
	linux-kernel, hannes, mingo, aneesh.kumar, akpm, atomlin, tglx,
	linuxppc-dev, akpm, pzijlstr

Currently in the event of a stack overrun a call to schedule()
does not check for this type of corruption. This corruption is
often silent and can go unnoticed. However once the corrupted
region is examined at a later stage, the outcome is undefined
and often results in a sporadic page fault which cannot be
handled.

This patch checks for a stack overrun and takes appropriate
action since the damage is already done, there is no point
in continuing.

Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
---
 kernel/sched/core.c |  4 ++++
 lib/Kconfig.debug   | 12 ++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ec1a286..182b2d1 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2660,6 +2660,10 @@ static noinline void __schedule_bug(struct task_struct *prev)
  */
 static inline void schedule_debug(struct task_struct *prev)
 {
+#ifdef CONFIG_SCHED_STACK_END_CHECK
+	if (unlikely(task_stack_end_corrupted(prev)))
+		BUG();
+#endif
 	/*
 	 * Test if we are atomic. Since do_exit() needs to call into
 	 * schedule() atomically, we ignore that path. Otherwise whine
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index a285900..2a8280a 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -824,6 +824,18 @@ config SCHEDSTATS
 	  application, you can say N to avoid the very slight overhead
 	  this adds.
 
+config SCHED_STACK_END_CHECK
+	bool "Detect stack corruption on calls to schedule()"
+	depends on DEBUG_KERNEL
+	default y
+	help
+	  This option checks for a stack overrun on calls to schedule().
+	  If the stack end location is found to be over written always panic as
+	  the content of the corrupted region can no longer be trusted.
+	  This is to ensure no erroneous behaviour occurs which could result in
+	  data corruption or a sporadic crash at a later stage once the region
+	  is examined. The runtime overhead introduced is minimal.
+
 config TIMER_STATS
 	bool "Collect kernel timers statistics"
 	depends on DEBUG_KERNEL && PROC_FS
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v2 0/3] sched: Always check the integrity of the canary
  2014-09-04 16:11       ` Aaron Tomlin
@ 2014-09-09  9:42         ` Aaron Tomlin
  -1 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-09  9:42 UTC (permalink / raw)
  To: peterz
  Cc: mingo, dzickus, bmr, jcastillo, atomlin, oleg, pzijlstr, riel,
	prarit, jgh, linux-kernel, tglx, x86, rostedt, hannes,
	aneesh.kumar, akpm, akpm, linuxppc-dev, minchan

Resending with "v2" added to each subject line:


Currently in the event of a stack overrun a call to schedule()
does not check for this type of corruption. This corruption is
often silent and can go unnoticed. However once the corrupted
region is examined at a later stage, the outcome is undefined
and often results in a sporadic page fault which cannot be
handled.

The first patch adds a canary to init_task's end of stack.
While the second patch provides a helper to determine the
integrity of the canary. The third checks for a stack
overrun and takes appropriate action since the damage
is already done, there is no point in continuing.

Changes since v1:

 * Rebased against v3.17-rc4
 * Add a canary to init_task - Oleg Nesterov
 * Fix various code formatting issues - Peter Zijlstra
 * Introduce Kconfig option - Peter Zijlstra

Aaron Tomlin (3):
  init/main.c: Give init_task a canary
  sched: Add helper for task stack page overrun checking
  sched: BUG when stack end location is over written

 arch/powerpc/mm/fault.c    |  5 +----
 arch/x86/mm/fault.c        |  5 +----
 include/linux/sched.h      |  4 ++++
 init/main.c                |  1 +
 kernel/fork.c              | 12 +++++++++---
 kernel/sched/core.c        |  4 ++++
 kernel/trace/trace_stack.c |  4 +---
 lib/Kconfig.debug          | 12 ++++++++++++
 8 files changed, 33 insertions(+), 14 deletions(-)

-- 
1.9.3


^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v2 0/3] sched: Always check the integrity of the canary
@ 2014-09-09  9:42         ` Aaron Tomlin
  0 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-09  9:42 UTC (permalink / raw)
  To: peterz
  Cc: dzickus, jcastillo, riel, x86, akpm, minchan, bmr, prarit, oleg,
	rostedt, linux-kernel, hannes, mingo, aneesh.kumar, akpm,
	atomlin, jgh, linuxppc-dev, tglx, pzijlstr

Resending with "v2" added to each subject line:


Currently in the event of a stack overrun a call to schedule()
does not check for this type of corruption. This corruption is
often silent and can go unnoticed. However once the corrupted
region is examined at a later stage, the outcome is undefined
and often results in a sporadic page fault which cannot be
handled.

The first patch adds a canary to init_task's end of stack.
While the second patch provides a helper to determine the
integrity of the canary. The third checks for a stack
overrun and takes appropriate action since the damage
is already done, there is no point in continuing.

Changes since v1:

 * Rebased against v3.17-rc4
 * Add a canary to init_task - Oleg Nesterov
 * Fix various code formatting issues - Peter Zijlstra
 * Introduce Kconfig option - Peter Zijlstra

Aaron Tomlin (3):
  init/main.c: Give init_task a canary
  sched: Add helper for task stack page overrun checking
  sched: BUG when stack end location is over written

 arch/powerpc/mm/fault.c    |  5 +----
 arch/x86/mm/fault.c        |  5 +----
 include/linux/sched.h      |  4 ++++
 init/main.c                |  1 +
 kernel/fork.c              | 12 +++++++++---
 kernel/sched/core.c        |  4 ++++
 kernel/trace/trace_stack.c |  4 +---
 lib/Kconfig.debug          | 12 ++++++++++++
 8 files changed, 33 insertions(+), 14 deletions(-)

-- 
1.9.3

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v2 1/3] init/main.c: Give init_task a canary
  2014-09-09  9:42         ` Aaron Tomlin
@ 2014-09-09  9:42           ` Aaron Tomlin
  -1 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-09  9:42 UTC (permalink / raw)
  To: peterz
  Cc: mingo, dzickus, bmr, jcastillo, atomlin, oleg, pzijlstr, riel,
	prarit, jgh, linux-kernel, tglx, x86, rostedt, hannes,
	aneesh.kumar, akpm, akpm, linuxppc-dev, minchan

Tasks get their end of stack set to STACK_END_MAGIC with the
aim to catch stack overruns. Currently this feature does not
apply to init_task. This patch removes this restriction.

Note that a similar patch was posted by Prarit Bhargava [1]
some time ago but was never merged.

[1]: http://marc.info/?l=linux-kernel&m=127144305403241&w=2

Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
---
 arch/powerpc/mm/fault.c    |  3 +--
 arch/x86/mm/fault.c        |  3 +--
 include/linux/sched.h      |  2 ++
 init/main.c                |  1 +
 kernel/fork.c              | 12 +++++++++---
 kernel/trace/trace_stack.c |  4 +---
 6 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 51ab9e7..35d0760c 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -30,7 +30,6 @@
 #include <linux/kprobes.h>
 #include <linux/kdebug.h>
 #include <linux/perf_event.h>
-#include <linux/magic.h>
 #include <linux/ratelimit.h>
 #include <linux/context_tracking.h>
 
@@ -538,7 +537,7 @@ void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
 		regs->nip);
 
 	stackend = end_of_stack(current);
-	if (current != &init_task && *stackend != STACK_END_MAGIC)
+	if (*stackend != STACK_END_MAGIC)
 		printk(KERN_ALERT "Thread overran stack, or stack corrupted\n");
 
 	die("Kernel access of bad area", regs, sig);
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index a241946..bc23a70 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -3,7 +3,6 @@
  *  Copyright (C) 2001, 2002 Andi Kleen, SuSE Labs.
  *  Copyright (C) 2008-2009, Red Hat Inc., Ingo Molnar
  */
-#include <linux/magic.h>		/* STACK_END_MAGIC		*/
 #include <linux/sched.h>		/* test_thread_flag(), ...	*/
 #include <linux/kdebug.h>		/* oops_begin/end, ...		*/
 #include <linux/module.h>		/* search_exception_table	*/
@@ -710,7 +709,7 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 	show_fault_oops(regs, error_code, address);
 
 	stackend = end_of_stack(tsk);
-	if (tsk != &init_task && *stackend != STACK_END_MAGIC)
+	if (*stackend != STACK_END_MAGIC)
 		printk(KERN_EMERG "Thread overran stack, or stack corrupted\n");
 
 	tsk->thread.cr2		= address;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 5c2c885..4dee9d7 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -57,6 +57,7 @@ struct sched_param {
 #include <linux/llist.h>
 #include <linux/uidgid.h>
 #include <linux/gfp.h>
+#include <linux/magic.h>
 
 #include <asm/processor.h>
 
@@ -2636,6 +2637,7 @@ static inline unsigned long stack_not_used(struct task_struct *p)
 	return (unsigned long)n - (unsigned long)end_of_stack(p);
 }
 #endif
+extern void task_stack_end_magic(struct task_struct *tsk);
 
 /* set thread flags in other task's structures
  * - see asm/thread_info.h for TIF_xxxx flags available
diff --git a/init/main.c b/init/main.c
index bb1aed9..bcdabf3a 100644
--- a/init/main.c
+++ b/init/main.c
@@ -508,6 +508,7 @@ asmlinkage __visible void __init start_kernel(void)
 	 * lockdep hash:
 	 */
 	lockdep_init();
+	task_stack_end_magic(&init_task);
 	smp_setup_processor_id();
 	debug_objects_early_init();
 
diff --git a/kernel/fork.c b/kernel/fork.c
index 0cf9cdb..67cb1e3 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -294,11 +294,18 @@ int __weak arch_dup_task_struct(struct task_struct *dst,
 	return 0;
 }
 
+void task_stack_end_magic(struct task_struct *tsk)
+{
+	unsigned long *stackend;
+
+	stackend = end_of_stack(tsk);
+	*stackend = STACK_END_MAGIC;	/* for overflow detection */
+}
+
 static struct task_struct *dup_task_struct(struct task_struct *orig)
 {
 	struct task_struct *tsk;
 	struct thread_info *ti;
-	unsigned long *stackend;
 	int node = tsk_fork_get_node(orig);
 	int err;
 
@@ -328,8 +335,7 @@ static struct task_struct *dup_task_struct(struct task_struct *orig)
 	setup_thread_stack(tsk, orig);
 	clear_user_return_notifier(tsk);
 	clear_tsk_need_resched(tsk);
-	stackend = end_of_stack(tsk);
-	*stackend = STACK_END_MAGIC;	/* for overflow detection */
+	task_stack_end_magic(tsk);
 
 #ifdef CONFIG_CC_STACKPROTECTOR
 	tsk->stack_canary = get_random_int();
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index 8a4e5cb..1636e41 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -13,7 +13,6 @@
 #include <linux/sysctl.h>
 #include <linux/init.h>
 #include <linux/fs.h>
-#include <linux/magic.h>
 
 #include <asm/setup.h>
 
@@ -171,8 +170,7 @@ check_stack(unsigned long ip, unsigned long *stack)
 			i++;
 	}
 
-	if ((current != &init_task &&
-		*(end_of_stack(current)) != STACK_END_MAGIC)) {
+	if (*end_of_stack(current) != STACK_END_MAGIC) {
 		print_max_stack();
 		BUG();
 	}
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v2 1/3] init/main.c: Give init_task a canary
@ 2014-09-09  9:42           ` Aaron Tomlin
  0 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-09  9:42 UTC (permalink / raw)
  To: peterz
  Cc: dzickus, jcastillo, riel, x86, akpm, minchan, bmr, prarit, oleg,
	rostedt, linux-kernel, hannes, mingo, aneesh.kumar, akpm,
	atomlin, jgh, linuxppc-dev, tglx, pzijlstr

Tasks get their end of stack set to STACK_END_MAGIC with the
aim to catch stack overruns. Currently this feature does not
apply to init_task. This patch removes this restriction.

Note that a similar patch was posted by Prarit Bhargava [1]
some time ago but was never merged.

[1]: http://marc.info/?l=linux-kernel&m=127144305403241&w=2

Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
---
 arch/powerpc/mm/fault.c    |  3 +--
 arch/x86/mm/fault.c        |  3 +--
 include/linux/sched.h      |  2 ++
 init/main.c                |  1 +
 kernel/fork.c              | 12 +++++++++---
 kernel/trace/trace_stack.c |  4 +---
 6 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 51ab9e7..35d0760c 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -30,7 +30,6 @@
 #include <linux/kprobes.h>
 #include <linux/kdebug.h>
 #include <linux/perf_event.h>
-#include <linux/magic.h>
 #include <linux/ratelimit.h>
 #include <linux/context_tracking.h>
 
@@ -538,7 +537,7 @@ void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
 		regs->nip);
 
 	stackend = end_of_stack(current);
-	if (current != &init_task && *stackend != STACK_END_MAGIC)
+	if (*stackend != STACK_END_MAGIC)
 		printk(KERN_ALERT "Thread overran stack, or stack corrupted\n");
 
 	die("Kernel access of bad area", regs, sig);
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index a241946..bc23a70 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -3,7 +3,6 @@
  *  Copyright (C) 2001, 2002 Andi Kleen, SuSE Labs.
  *  Copyright (C) 2008-2009, Red Hat Inc., Ingo Molnar
  */
-#include <linux/magic.h>		/* STACK_END_MAGIC		*/
 #include <linux/sched.h>		/* test_thread_flag(), ...	*/
 #include <linux/kdebug.h>		/* oops_begin/end, ...		*/
 #include <linux/module.h>		/* search_exception_table	*/
@@ -710,7 +709,7 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 	show_fault_oops(regs, error_code, address);
 
 	stackend = end_of_stack(tsk);
-	if (tsk != &init_task && *stackend != STACK_END_MAGIC)
+	if (*stackend != STACK_END_MAGIC)
 		printk(KERN_EMERG "Thread overran stack, or stack corrupted\n");
 
 	tsk->thread.cr2		= address;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 5c2c885..4dee9d7 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -57,6 +57,7 @@ struct sched_param {
 #include <linux/llist.h>
 #include <linux/uidgid.h>
 #include <linux/gfp.h>
+#include <linux/magic.h>
 
 #include <asm/processor.h>
 
@@ -2636,6 +2637,7 @@ static inline unsigned long stack_not_used(struct task_struct *p)
 	return (unsigned long)n - (unsigned long)end_of_stack(p);
 }
 #endif
+extern void task_stack_end_magic(struct task_struct *tsk);
 
 /* set thread flags in other task's structures
  * - see asm/thread_info.h for TIF_xxxx flags available
diff --git a/init/main.c b/init/main.c
index bb1aed9..bcdabf3a 100644
--- a/init/main.c
+++ b/init/main.c
@@ -508,6 +508,7 @@ asmlinkage __visible void __init start_kernel(void)
 	 * lockdep hash:
 	 */
 	lockdep_init();
+	task_stack_end_magic(&init_task);
 	smp_setup_processor_id();
 	debug_objects_early_init();
 
diff --git a/kernel/fork.c b/kernel/fork.c
index 0cf9cdb..67cb1e3 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -294,11 +294,18 @@ int __weak arch_dup_task_struct(struct task_struct *dst,
 	return 0;
 }
 
+void task_stack_end_magic(struct task_struct *tsk)
+{
+	unsigned long *stackend;
+
+	stackend = end_of_stack(tsk);
+	*stackend = STACK_END_MAGIC;	/* for overflow detection */
+}
+
 static struct task_struct *dup_task_struct(struct task_struct *orig)
 {
 	struct task_struct *tsk;
 	struct thread_info *ti;
-	unsigned long *stackend;
 	int node = tsk_fork_get_node(orig);
 	int err;
 
@@ -328,8 +335,7 @@ static struct task_struct *dup_task_struct(struct task_struct *orig)
 	setup_thread_stack(tsk, orig);
 	clear_user_return_notifier(tsk);
 	clear_tsk_need_resched(tsk);
-	stackend = end_of_stack(tsk);
-	*stackend = STACK_END_MAGIC;	/* for overflow detection */
+	task_stack_end_magic(tsk);
 
 #ifdef CONFIG_CC_STACKPROTECTOR
 	tsk->stack_canary = get_random_int();
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index 8a4e5cb..1636e41 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -13,7 +13,6 @@
 #include <linux/sysctl.h>
 #include <linux/init.h>
 #include <linux/fs.h>
-#include <linux/magic.h>
 
 #include <asm/setup.h>
 
@@ -171,8 +170,7 @@ check_stack(unsigned long ip, unsigned long *stack)
 			i++;
 	}
 
-	if ((current != &init_task &&
-		*(end_of_stack(current)) != STACK_END_MAGIC)) {
+	if (*end_of_stack(current) != STACK_END_MAGIC) {
 		print_max_stack();
 		BUG();
 	}
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v2 2/3] sched: Add helper for task stack page overrun checking
  2014-09-09  9:42         ` Aaron Tomlin
@ 2014-09-09  9:42           ` Aaron Tomlin
  -1 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-09  9:42 UTC (permalink / raw)
  To: peterz
  Cc: mingo, dzickus, bmr, jcastillo, atomlin, oleg, pzijlstr, riel,
	prarit, jgh, linux-kernel, tglx, x86, rostedt, hannes,
	aneesh.kumar, akpm, akpm, linuxppc-dev, minchan

This facility is used in a few places so let's introduce
a helper function to improve code readability.

Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
---
 arch/powerpc/mm/fault.c    | 4 +---
 arch/x86/mm/fault.c        | 4 +---
 include/linux/sched.h      | 2 ++
 kernel/trace/trace_stack.c | 2 +-
 4 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 35d0760c..99b2f27 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -507,7 +507,6 @@ bail:
 void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
 {
 	const struct exception_table_entry *entry;
-	unsigned long *stackend;
 
 	/* Are we prepared to handle this fault?  */
 	if ((entry = search_exception_tables(regs->nip)) != NULL) {
@@ -536,8 +535,7 @@ void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
 	printk(KERN_ALERT "Faulting instruction address: 0x%08lx\n",
 		regs->nip);
 
-	stackend = end_of_stack(current);
-	if (*stackend != STACK_END_MAGIC)
+	if (task_stack_end_corrupted(current))
 		printk(KERN_ALERT "Thread overran stack, or stack corrupted\n");
 
 	die("Kernel access of bad area", regs, sig);
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index bc23a70..6240bc7 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -648,7 +648,6 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 	   unsigned long address, int signal, int si_code)
 {
 	struct task_struct *tsk = current;
-	unsigned long *stackend;
 	unsigned long flags;
 	int sig;
 
@@ -708,8 +707,7 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 
 	show_fault_oops(regs, error_code, address);
 
-	stackend = end_of_stack(tsk);
-	if (*stackend != STACK_END_MAGIC)
+	if (task_stack_end_corrupted(tsk))
 		printk(KERN_EMERG "Thread overran stack, or stack corrupted\n");
 
 	tsk->thread.cr2		= address;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 4dee9d7..6e07cb9 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2615,6 +2615,8 @@ static inline unsigned long *end_of_stack(struct task_struct *p)
 }
 
 #endif
+#define task_stack_end_corrupted(task) \
+		(*(end_of_stack(task)) != STACK_END_MAGIC)
 
 static inline int object_is_on_stack(void *obj)
 {
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index 1636e41..16eddb3 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -170,7 +170,7 @@ check_stack(unsigned long ip, unsigned long *stack)
 			i++;
 	}
 
-	if (*end_of_stack(current) != STACK_END_MAGIC) {
+	if (task_stack_end_corrupted(current)) {
 		print_max_stack();
 		BUG();
 	}
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v2 2/3] sched: Add helper for task stack page overrun checking
@ 2014-09-09  9:42           ` Aaron Tomlin
  0 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-09  9:42 UTC (permalink / raw)
  To: peterz
  Cc: dzickus, jcastillo, riel, x86, akpm, minchan, bmr, prarit, oleg,
	rostedt, linux-kernel, hannes, mingo, aneesh.kumar, akpm,
	atomlin, jgh, linuxppc-dev, tglx, pzijlstr

This facility is used in a few places so let's introduce
a helper function to improve code readability.

Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
---
 arch/powerpc/mm/fault.c    | 4 +---
 arch/x86/mm/fault.c        | 4 +---
 include/linux/sched.h      | 2 ++
 kernel/trace/trace_stack.c | 2 +-
 4 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 35d0760c..99b2f27 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -507,7 +507,6 @@ bail:
 void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
 {
 	const struct exception_table_entry *entry;
-	unsigned long *stackend;
 
 	/* Are we prepared to handle this fault?  */
 	if ((entry = search_exception_tables(regs->nip)) != NULL) {
@@ -536,8 +535,7 @@ void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
 	printk(KERN_ALERT "Faulting instruction address: 0x%08lx\n",
 		regs->nip);
 
-	stackend = end_of_stack(current);
-	if (*stackend != STACK_END_MAGIC)
+	if (task_stack_end_corrupted(current))
 		printk(KERN_ALERT "Thread overran stack, or stack corrupted\n");
 
 	die("Kernel access of bad area", regs, sig);
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index bc23a70..6240bc7 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -648,7 +648,6 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 	   unsigned long address, int signal, int si_code)
 {
 	struct task_struct *tsk = current;
-	unsigned long *stackend;
 	unsigned long flags;
 	int sig;
 
@@ -708,8 +707,7 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 
 	show_fault_oops(regs, error_code, address);
 
-	stackend = end_of_stack(tsk);
-	if (*stackend != STACK_END_MAGIC)
+	if (task_stack_end_corrupted(tsk))
 		printk(KERN_EMERG "Thread overran stack, or stack corrupted\n");
 
 	tsk->thread.cr2		= address;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 4dee9d7..6e07cb9 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2615,6 +2615,8 @@ static inline unsigned long *end_of_stack(struct task_struct *p)
 }
 
 #endif
+#define task_stack_end_corrupted(task) \
+		(*(end_of_stack(task)) != STACK_END_MAGIC)
 
 static inline int object_is_on_stack(void *obj)
 {
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index 1636e41..16eddb3 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -170,7 +170,7 @@ check_stack(unsigned long ip, unsigned long *stack)
 			i++;
 	}
 
-	if (*end_of_stack(current) != STACK_END_MAGIC) {
+	if (task_stack_end_corrupted(current)) {
 		print_max_stack();
 		BUG();
 	}
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v2 3/3] sched: BUG when stack end location is over written
  2014-09-09  9:42         ` Aaron Tomlin
@ 2014-09-09  9:42           ` Aaron Tomlin
  -1 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-09  9:42 UTC (permalink / raw)
  To: peterz
  Cc: mingo, dzickus, bmr, jcastillo, atomlin, oleg, pzijlstr, riel,
	prarit, jgh, linux-kernel, tglx, x86, rostedt, hannes,
	aneesh.kumar, akpm, akpm, linuxppc-dev, minchan

Currently in the event of a stack overrun a call to schedule()
does not check for this type of corruption. This corruption is
often silent and can go unnoticed. However once the corrupted
region is examined at a later stage, the outcome is undefined
and often results in a sporadic page fault which cannot be
handled.

This patch checks for a stack overrun and takes appropriate
action since the damage is already done, there is no point
in continuing.

Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
---
 kernel/sched/core.c |  4 ++++
 lib/Kconfig.debug   | 12 ++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ec1a286..182b2d1 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2660,6 +2660,10 @@ static noinline void __schedule_bug(struct task_struct *prev)
  */
 static inline void schedule_debug(struct task_struct *prev)
 {
+#ifdef CONFIG_SCHED_STACK_END_CHECK
+	if (unlikely(task_stack_end_corrupted(prev)))
+		BUG();
+#endif
 	/*
 	 * Test if we are atomic. Since do_exit() needs to call into
 	 * schedule() atomically, we ignore that path. Otherwise whine
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index a285900..2a8280a 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -824,6 +824,18 @@ config SCHEDSTATS
 	  application, you can say N to avoid the very slight overhead
 	  this adds.
 
+config SCHED_STACK_END_CHECK
+	bool "Detect stack corruption on calls to schedule()"
+	depends on DEBUG_KERNEL
+	default y
+	help
+	  This option checks for a stack overrun on calls to schedule().
+	  If the stack end location is found to be over written always panic as
+	  the content of the corrupted region can no longer be trusted.
+	  This is to ensure no erroneous behaviour occurs which could result in
+	  data corruption or a sporadic crash at a later stage once the region
+	  is examined. The runtime overhead introduced is minimal.
+
 config TIMER_STATS
 	bool "Collect kernel timers statistics"
 	depends on DEBUG_KERNEL && PROC_FS
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v2 3/3] sched: BUG when stack end location is over written
@ 2014-09-09  9:42           ` Aaron Tomlin
  0 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-09  9:42 UTC (permalink / raw)
  To: peterz
  Cc: dzickus, jcastillo, riel, x86, akpm, minchan, bmr, prarit, oleg,
	rostedt, linux-kernel, hannes, mingo, aneesh.kumar, akpm,
	atomlin, jgh, linuxppc-dev, tglx, pzijlstr

Currently in the event of a stack overrun a call to schedule()
does not check for this type of corruption. This corruption is
often silent and can go unnoticed. However once the corrupted
region is examined at a later stage, the outcome is undefined
and often results in a sporadic page fault which cannot be
handled.

This patch checks for a stack overrun and takes appropriate
action since the damage is already done, there is no point
in continuing.

Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
---
 kernel/sched/core.c |  4 ++++
 lib/Kconfig.debug   | 12 ++++++++++++
 2 files changed, 16 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ec1a286..182b2d1 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2660,6 +2660,10 @@ static noinline void __schedule_bug(struct task_struct *prev)
  */
 static inline void schedule_debug(struct task_struct *prev)
 {
+#ifdef CONFIG_SCHED_STACK_END_CHECK
+	if (unlikely(task_stack_end_corrupted(prev)))
+		BUG();
+#endif
 	/*
 	 * Test if we are atomic. Since do_exit() needs to call into
 	 * schedule() atomically, we ignore that path. Otherwise whine
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index a285900..2a8280a 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -824,6 +824,18 @@ config SCHEDSTATS
 	  application, you can say N to avoid the very slight overhead
 	  this adds.
 
+config SCHED_STACK_END_CHECK
+	bool "Detect stack corruption on calls to schedule()"
+	depends on DEBUG_KERNEL
+	default y
+	help
+	  This option checks for a stack overrun on calls to schedule().
+	  If the stack end location is found to be over written always panic as
+	  the content of the corrupted region can no longer be trusted.
+	  This is to ensure no erroneous behaviour occurs which could result in
+	  data corruption or a sporadic crash at a later stage once the region
+	  is examined. The runtime overhead introduced is minimal.
+
 config TIMER_STATS
 	bool "Collect kernel timers statistics"
 	depends on DEBUG_KERNEL && PROC_FS
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 1/3] init/main.c: Give init_task a canary
  2014-09-09  9:42           ` Aaron Tomlin
@ 2014-09-10  7:26             ` Chuck Ebbert
  -1 siblings, 0 replies; 74+ messages in thread
From: Chuck Ebbert @ 2014-09-10  7:26 UTC (permalink / raw)
  To: Aaron Tomlin
  Cc: peterz, mingo, dzickus, bmr, jcastillo, oleg, pzijlstr, riel,
	prarit, jgh, linux-kernel, tglx, x86, rostedt, hannes,
	aneesh.kumar, akpm, akpm, linuxppc-dev, minchan

On Tue,  9 Sep 2014 10:42:27 +0100
Aaron Tomlin <atomlin@redhat.com> wrote:

> +void task_stack_end_magic(struct task_struct *tsk)
> +{
> +	unsigned long *stackend;
> +
> +	stackend = end_of_stack(tsk);
> +	*stackend = STACK_END_MAGIC;	/* for overflow detection */
> +}
> +

For clarity this should probably be called set_task_stack_end_magic().

And has this been tested on parisc and metag, which use STACK_GROWSUP ?
I can't see how end_of_stack() as it's defined now could work on those archs.



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 1/3] init/main.c: Give init_task a canary
@ 2014-09-10  7:26             ` Chuck Ebbert
  0 siblings, 0 replies; 74+ messages in thread
From: Chuck Ebbert @ 2014-09-10  7:26 UTC (permalink / raw)
  To: Aaron Tomlin
  Cc: dzickus, jcastillo, riel, x86, akpm, peterz, bmr, prarit, oleg,
	rostedt, linux-kernel, minchan, mingo, aneesh.kumar, akpm,
	hannes, jgh, linuxppc-dev, tglx, pzijlstr

On Tue,  9 Sep 2014 10:42:27 +0100
Aaron Tomlin <atomlin@redhat.com> wrote:

> +void task_stack_end_magic(struct task_struct *tsk)
> +{
> +	unsigned long *stackend;
> +
> +	stackend = end_of_stack(tsk);
> +	*stackend = STACK_END_MAGIC;	/* for overflow detection */
> +}
> +

For clarity this should probably be called set_task_stack_end_magic().

And has this been tested on parisc and metag, which use STACK_GROWSUP ?
I can't see how end_of_stack() as it's defined now could work on those archs.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 1/3] init/main.c: Give init_task a canary
  2014-09-10  7:26             ` Chuck Ebbert
@ 2014-09-10 13:29               ` Aaron Tomlin
  -1 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-10 13:29 UTC (permalink / raw)
  To: Chuck Ebbert
  Cc: peterz, mingo, dzickus, bmr, jcastillo, oleg, pzijlstr, riel,
	prarit, jgh, linux-kernel, tglx, x86, rostedt, hannes,
	aneesh.kumar, akpm, akpm, linuxppc-dev, minchan

On Wed, Sep 10, 2014 at 02:26:54AM -0500, Chuck Ebbert wrote:
> On Tue,  9 Sep 2014 10:42:27 +0100
> Aaron Tomlin <atomlin@redhat.com> wrote:
> 
> > +void task_stack_end_magic(struct task_struct *tsk)
> > +{
> > +	unsigned long *stackend;
> > +
> > +	stackend = end_of_stack(tsk);
> > +	*stackend = STACK_END_MAGIC;	/* for overflow detection */
> > +}
> > +
> 
> For clarity this should probably be called set_task_stack_end_magic().

Agreed.

> And has this been tested on parisc and metag, which use STACK_GROWSUP ?
> I can't see how end_of_stack() as it's defined now could work on those archs.

AFAIU, dup_task_struct() has always done this explicitly.
I see no reason why init_task requires special attention.

-- 
Aaron Tomlin

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 1/3] init/main.c: Give init_task a canary
@ 2014-09-10 13:29               ` Aaron Tomlin
  0 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-10 13:29 UTC (permalink / raw)
  To: Chuck Ebbert
  Cc: dzickus, jcastillo, riel, x86, akpm, peterz, bmr, prarit, oleg,
	rostedt, linux-kernel, minchan, mingo, aneesh.kumar, akpm,
	hannes, jgh, linuxppc-dev, tglx, pzijlstr

On Wed, Sep 10, 2014 at 02:26:54AM -0500, Chuck Ebbert wrote:
> On Tue,  9 Sep 2014 10:42:27 +0100
> Aaron Tomlin <atomlin@redhat.com> wrote:
> 
> > +void task_stack_end_magic(struct task_struct *tsk)
> > +{
> > +	unsigned long *stackend;
> > +
> > +	stackend = end_of_stack(tsk);
> > +	*stackend = STACK_END_MAGIC;	/* for overflow detection */
> > +}
> > +
> 
> For clarity this should probably be called set_task_stack_end_magic().

Agreed.

> And has this been tested on parisc and metag, which use STACK_GROWSUP ?
> I can't see how end_of_stack() as it's defined now could work on those archs.

AFAIU, dup_task_struct() has always done this explicitly.
I see no reason why init_task requires special attention.

-- 
Aaron Tomlin

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 1/3] init/main.c: Give init_task a canary
  2014-09-10 13:29               ` Aaron Tomlin
@ 2014-09-11 12:23                 ` Chuck Ebbert
  -1 siblings, 0 replies; 74+ messages in thread
From: Chuck Ebbert @ 2014-09-11 12:23 UTC (permalink / raw)
  To: Aaron Tomlin
  Cc: peterz, mingo, dzickus, bmr, jcastillo, oleg, pzijlstr, riel,
	prarit, jgh, linux-kernel, tglx, x86, rostedt, hannes,
	aneesh.kumar, akpm, akpm, linuxppc-dev, minchan

On Wed, 10 Sep 2014 14:29:33 +0100
Aaron Tomlin <atomlin@redhat.com> wrote:

> On Wed, Sep 10, 2014 at 02:26:54AM -0500, Chuck Ebbert wrote:
> > And has this been tested on parisc and metag, which use STACK_GROWSUP ?
> > I can't see how end_of_stack() as it's defined now could work on those archs.
> 
> AFAIU, dup_task_struct() has always done this explicitly.
> I see no reason why init_task requires special attention.
> 

I guess what I'm saying is, I can't see how the stack canary could have ever
detected any problems on parisc and metag. How did you test your patches on x86?
Maybe someone could run tests on those other archs.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 1/3] init/main.c: Give init_task a canary
@ 2014-09-11 12:23                 ` Chuck Ebbert
  0 siblings, 0 replies; 74+ messages in thread
From: Chuck Ebbert @ 2014-09-11 12:23 UTC (permalink / raw)
  To: Aaron Tomlin
  Cc: dzickus, jcastillo, riel, x86, akpm, peterz, bmr, prarit, oleg,
	rostedt, linux-kernel, minchan, mingo, aneesh.kumar, akpm,
	hannes, jgh, linuxppc-dev, tglx, pzijlstr

On Wed, 10 Sep 2014 14:29:33 +0100
Aaron Tomlin <atomlin@redhat.com> wrote:

> On Wed, Sep 10, 2014 at 02:26:54AM -0500, Chuck Ebbert wrote:
> > And has this been tested on parisc and metag, which use STACK_GROWSUP ?
> > I can't see how end_of_stack() as it's defined now could work on those archs.
> 
> AFAIU, dup_task_struct() has always done this explicitly.
> I see no reason why init_task requires special attention.
> 

I guess what I'm saying is, I can't see how the stack canary could have ever
detected any problems on parisc and metag. How did you test your patches on x86?
Maybe someone could run tests on those other archs.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 1/3] init/main.c: Give init_task a canary
  2014-09-11 12:23                 ` Chuck Ebbert
@ 2014-09-11 14:47                   ` Aaron Tomlin
  -1 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-11 14:47 UTC (permalink / raw)
  To: Chuck Ebbert
  Cc: peterz, mingo, dzickus, bmr, jcastillo, oleg, pzijlstr, riel,
	prarit, jgh, linux-kernel, tglx, x86, rostedt, hannes,
	aneesh.kumar, akpm, akpm, linuxppc-dev, minchan

On Thu, Sep 11, 2014 at 07:23:45AM -0500, Chuck Ebbert wrote:
> On Wed, 10 Sep 2014 14:29:33 +0100
> Aaron Tomlin <atomlin@redhat.com> wrote:
> 
> > On Wed, Sep 10, 2014 at 02:26:54AM -0500, Chuck Ebbert wrote:
> > > And has this been tested on parisc and metag, which use STACK_GROWSUP ?
> > > I can't see how end_of_stack() as it's defined now could work on those archs.
> > 
> > AFAIU, dup_task_struct() has always done this explicitly.
> > I see no reason why init_task requires special attention.
> > 
> 
> I guess what I'm saying is, I can't see how the stack canary could have ever
> detected any problems on parisc and metag. How did you test your patches on x86?

Yes - under x86 only.

> Maybe someone could run tests on those other archs.

As I said, I can't see why it wouldn't work given
dup_task_struct()'s behaviour.

-- 
Aaron Tomlin

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v2 1/3] init/main.c: Give init_task a canary
@ 2014-09-11 14:47                   ` Aaron Tomlin
  0 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-11 14:47 UTC (permalink / raw)
  To: Chuck Ebbert
  Cc: dzickus, jcastillo, riel, x86, akpm, peterz, bmr, prarit, oleg,
	rostedt, linux-kernel, minchan, mingo, aneesh.kumar, akpm,
	hannes, jgh, linuxppc-dev, tglx, pzijlstr

On Thu, Sep 11, 2014 at 07:23:45AM -0500, Chuck Ebbert wrote:
> On Wed, 10 Sep 2014 14:29:33 +0100
> Aaron Tomlin <atomlin@redhat.com> wrote:
> 
> > On Wed, Sep 10, 2014 at 02:26:54AM -0500, Chuck Ebbert wrote:
> > > And has this been tested on parisc and metag, which use STACK_GROWSUP ?
> > > I can't see how end_of_stack() as it's defined now could work on those archs.
> > 
> > AFAIU, dup_task_struct() has always done this explicitly.
> > I see no reason why init_task requires special attention.
> > 
> 
> I guess what I'm saying is, I can't see how the stack canary could have ever
> detected any problems on parisc and metag. How did you test your patches on x86?

Yes - under x86 only.

> Maybe someone could run tests on those other archs.

As I said, I can't see why it wouldn't work given
dup_task_struct()'s behaviour.

-- 
Aaron Tomlin

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v3 0/3] sched: Always check the integrity of the canary
  2014-09-09  9:42         ` Aaron Tomlin
@ 2014-09-11 15:41           ` Aaron Tomlin
  -1 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-11 15:41 UTC (permalink / raw)
  To: peterz
  Cc: mingo, mingo, dzickus, bmr, jcastillo, atomlin, oleg, pzijlstr,
	riel, prarit, jgh, linux-kernel, tglx, x86, rostedt, hannes,
	aneesh.kumar, akpm, akpm, linuxppc-dev, minchan

Hi Peter,

Please let me know if this iteration is satisfactory. Thanks.


Currently in the event of a stack overrun a call to schedule()
does not check for this type of corruption. This corruption is
often silent and can go unnoticed. However once the corrupted
region is examined at a later stage, the outcome is undefined
and often results in a sporadic page fault which cannot be
handled.

The first patch adds a canary to init_task's end of stack.
While the second patch provides a helper to determine the
integrity of the canary. The third checks for a stack
overrun and takes appropriate action since the damage
is already done, there is no point in continuing.

Changes since v2:

 * Use BUG_ON in schedule_debug() - Peter Zijlstra
 * Use a more explicit function
   name for setting the canary - Chuck Ebbert

Changes since v1:

 * Rebased against v3.17-rc4
 * Add a canary to init_task - Oleg Nesterov
 * Fix various code formatting issues - Peter Zijlstra
 * Introduce Kconfig option - Peter Zijlstra

Aaron Tomlin (3):
  init/main.c: Give init_task a canary
  sched: Add helper for task stack page overrun checking
  sched: BUG when stack end location is over written

 arch/powerpc/mm/fault.c    |  5 +----
 arch/x86/mm/fault.c        |  5 +----
 include/linux/sched.h      |  4 ++++
 init/main.c                |  1 +
 kernel/fork.c              | 12 +++++++++---
 kernel/sched/core.c        |  3 +++
 kernel/trace/trace_stack.c |  4 +---
 lib/Kconfig.debug          | 12 ++++++++++++
 8 files changed, 32 insertions(+), 14 deletions(-)

-- 
1.9.3


^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v3 0/3] sched: Always check the integrity of the canary
@ 2014-09-11 15:41           ` Aaron Tomlin
  0 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-11 15:41 UTC (permalink / raw)
  To: peterz
  Cc: dzickus, jcastillo, riel, x86, akpm, minchan, mingo, bmr, prarit,
	oleg, rostedt, linux-kernel, hannes, mingo, aneesh.kumar, akpm,
	atomlin, jgh, linuxppc-dev, tglx, pzijlstr

Hi Peter,

Please let me know if this iteration is satisfactory. Thanks.


Currently in the event of a stack overrun a call to schedule()
does not check for this type of corruption. This corruption is
often silent and can go unnoticed. However once the corrupted
region is examined at a later stage, the outcome is undefined
and often results in a sporadic page fault which cannot be
handled.

The first patch adds a canary to init_task's end of stack.
While the second patch provides a helper to determine the
integrity of the canary. The third checks for a stack
overrun and takes appropriate action since the damage
is already done, there is no point in continuing.

Changes since v2:

 * Use BUG_ON in schedule_debug() - Peter Zijlstra
 * Use a more explicit function
   name for setting the canary - Chuck Ebbert

Changes since v1:

 * Rebased against v3.17-rc4
 * Add a canary to init_task - Oleg Nesterov
 * Fix various code formatting issues - Peter Zijlstra
 * Introduce Kconfig option - Peter Zijlstra

Aaron Tomlin (3):
  init/main.c: Give init_task a canary
  sched: Add helper for task stack page overrun checking
  sched: BUG when stack end location is over written

 arch/powerpc/mm/fault.c    |  5 +----
 arch/x86/mm/fault.c        |  5 +----
 include/linux/sched.h      |  4 ++++
 init/main.c                |  1 +
 kernel/fork.c              | 12 +++++++++---
 kernel/sched/core.c        |  3 +++
 kernel/trace/trace_stack.c |  4 +---
 lib/Kconfig.debug          | 12 ++++++++++++
 8 files changed, 32 insertions(+), 14 deletions(-)

-- 
1.9.3

^ permalink raw reply	[flat|nested] 74+ messages in thread

* [PATCH v3 1/3] init/main.c: Give init_task a canary
  2014-09-11 15:41           ` Aaron Tomlin
@ 2014-09-11 15:41             ` Aaron Tomlin
  -1 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-11 15:41 UTC (permalink / raw)
  To: peterz
  Cc: mingo, mingo, dzickus, bmr, jcastillo, atomlin, oleg, pzijlstr,
	riel, prarit, jgh, linux-kernel, tglx, x86, rostedt, hannes,
	aneesh.kumar, akpm, akpm, linuxppc-dev, minchan

Tasks get their end of stack set to STACK_END_MAGIC with the
aim to catch stack overruns. Currently this feature does not
apply to init_task. This patch removes this restriction.

Note that a similar patch was posted by Prarit Bhargava [1]
some time ago but was never merged.

[1]: http://marc.info/?l=linux-kernel&m=127144305403241&w=2

Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
---
 arch/powerpc/mm/fault.c    |  3 +--
 arch/x86/mm/fault.c        |  3 +--
 include/linux/sched.h      |  2 ++
 init/main.c                |  1 +
 kernel/fork.c              | 12 +++++++++---
 kernel/trace/trace_stack.c |  4 +---
 6 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 51ab9e7..35d0760c 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -30,7 +30,6 @@
 #include <linux/kprobes.h>
 #include <linux/kdebug.h>
 #include <linux/perf_event.h>
-#include <linux/magic.h>
 #include <linux/ratelimit.h>
 #include <linux/context_tracking.h>
 
@@ -538,7 +537,7 @@ void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
 		regs->nip);
 
 	stackend = end_of_stack(current);
-	if (current != &init_task && *stackend != STACK_END_MAGIC)
+	if (*stackend != STACK_END_MAGIC)
 		printk(KERN_ALERT "Thread overran stack, or stack corrupted\n");
 
 	die("Kernel access of bad area", regs, sig);
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index a241946..bc23a70 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -3,7 +3,6 @@
  *  Copyright (C) 2001, 2002 Andi Kleen, SuSE Labs.
  *  Copyright (C) 2008-2009, Red Hat Inc., Ingo Molnar
  */
-#include <linux/magic.h>		/* STACK_END_MAGIC		*/
 #include <linux/sched.h>		/* test_thread_flag(), ...	*/
 #include <linux/kdebug.h>		/* oops_begin/end, ...		*/
 #include <linux/module.h>		/* search_exception_table	*/
@@ -710,7 +709,7 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 	show_fault_oops(regs, error_code, address);
 
 	stackend = end_of_stack(tsk);
-	if (tsk != &init_task && *stackend != STACK_END_MAGIC)
+	if (*stackend != STACK_END_MAGIC)
 		printk(KERN_EMERG "Thread overran stack, or stack corrupted\n");
 
 	tsk->thread.cr2		= address;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 5c2c885..7ef34b7 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -57,6 +57,7 @@ struct sched_param {
 #include <linux/llist.h>
 #include <linux/uidgid.h>
 #include <linux/gfp.h>
+#include <linux/magic.h>
 
 #include <asm/processor.h>
 
@@ -2636,6 +2637,7 @@ static inline unsigned long stack_not_used(struct task_struct *p)
 	return (unsigned long)n - (unsigned long)end_of_stack(p);
 }
 #endif
+extern void set_task_stack_end_magic(struct task_struct *tsk);
 
 /* set thread flags in other task's structures
  * - see asm/thread_info.h for TIF_xxxx flags available
diff --git a/init/main.c b/init/main.c
index bb1aed9..5fc3fc7 100644
--- a/init/main.c
+++ b/init/main.c
@@ -508,6 +508,7 @@ asmlinkage __visible void __init start_kernel(void)
 	 * lockdep hash:
 	 */
 	lockdep_init();
+	set_task_stack_end_magic(&init_task);
 	smp_setup_processor_id();
 	debug_objects_early_init();
 
diff --git a/kernel/fork.c b/kernel/fork.c
index 0cf9cdb..adf9583 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -294,11 +294,18 @@ int __weak arch_dup_task_struct(struct task_struct *dst,
 	return 0;
 }
 
+void set_task_stack_end_magic(struct task_struct *tsk)
+{
+	unsigned long *stackend;
+
+	stackend = end_of_stack(tsk);
+	*stackend = STACK_END_MAGIC;	/* for overflow detection */
+}
+
 static struct task_struct *dup_task_struct(struct task_struct *orig)
 {
 	struct task_struct *tsk;
 	struct thread_info *ti;
-	unsigned long *stackend;
 	int node = tsk_fork_get_node(orig);
 	int err;
 
@@ -328,8 +335,7 @@ static struct task_struct *dup_task_struct(struct task_struct *orig)
 	setup_thread_stack(tsk, orig);
 	clear_user_return_notifier(tsk);
 	clear_tsk_need_resched(tsk);
-	stackend = end_of_stack(tsk);
-	*stackend = STACK_END_MAGIC;	/* for overflow detection */
+	set_task_stack_end_magic(tsk);
 
 #ifdef CONFIG_CC_STACKPROTECTOR
 	tsk->stack_canary = get_random_int();
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index 8a4e5cb..1636e41 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -13,7 +13,6 @@
 #include <linux/sysctl.h>
 #include <linux/init.h>
 #include <linux/fs.h>
-#include <linux/magic.h>
 
 #include <asm/setup.h>
 
@@ -171,8 +170,7 @@ check_stack(unsigned long ip, unsigned long *stack)
 			i++;
 	}
 
-	if ((current != &init_task &&
-		*(end_of_stack(current)) != STACK_END_MAGIC)) {
+	if (*end_of_stack(current) != STACK_END_MAGIC) {
 		print_max_stack();
 		BUG();
 	}
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 1/3] init/main.c: Give init_task a canary
@ 2014-09-11 15:41             ` Aaron Tomlin
  0 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-11 15:41 UTC (permalink / raw)
  To: peterz
  Cc: dzickus, jcastillo, riel, x86, akpm, minchan, mingo, bmr, prarit,
	oleg, rostedt, linux-kernel, hannes, mingo, aneesh.kumar, akpm,
	atomlin, jgh, linuxppc-dev, tglx, pzijlstr

Tasks get their end of stack set to STACK_END_MAGIC with the
aim to catch stack overruns. Currently this feature does not
apply to init_task. This patch removes this restriction.

Note that a similar patch was posted by Prarit Bhargava [1]
some time ago but was never merged.

[1]: http://marc.info/?l=linux-kernel&m=127144305403241&w=2

Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
---
 arch/powerpc/mm/fault.c    |  3 +--
 arch/x86/mm/fault.c        |  3 +--
 include/linux/sched.h      |  2 ++
 init/main.c                |  1 +
 kernel/fork.c              | 12 +++++++++---
 kernel/trace/trace_stack.c |  4 +---
 6 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 51ab9e7..35d0760c 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -30,7 +30,6 @@
 #include <linux/kprobes.h>
 #include <linux/kdebug.h>
 #include <linux/perf_event.h>
-#include <linux/magic.h>
 #include <linux/ratelimit.h>
 #include <linux/context_tracking.h>
 
@@ -538,7 +537,7 @@ void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
 		regs->nip);
 
 	stackend = end_of_stack(current);
-	if (current != &init_task && *stackend != STACK_END_MAGIC)
+	if (*stackend != STACK_END_MAGIC)
 		printk(KERN_ALERT "Thread overran stack, or stack corrupted\n");
 
 	die("Kernel access of bad area", regs, sig);
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index a241946..bc23a70 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -3,7 +3,6 @@
  *  Copyright (C) 2001, 2002 Andi Kleen, SuSE Labs.
  *  Copyright (C) 2008-2009, Red Hat Inc., Ingo Molnar
  */
-#include <linux/magic.h>		/* STACK_END_MAGIC		*/
 #include <linux/sched.h>		/* test_thread_flag(), ...	*/
 #include <linux/kdebug.h>		/* oops_begin/end, ...		*/
 #include <linux/module.h>		/* search_exception_table	*/
@@ -710,7 +709,7 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 	show_fault_oops(regs, error_code, address);
 
 	stackend = end_of_stack(tsk);
-	if (tsk != &init_task && *stackend != STACK_END_MAGIC)
+	if (*stackend != STACK_END_MAGIC)
 		printk(KERN_EMERG "Thread overran stack, or stack corrupted\n");
 
 	tsk->thread.cr2		= address;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 5c2c885..7ef34b7 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -57,6 +57,7 @@ struct sched_param {
 #include <linux/llist.h>
 #include <linux/uidgid.h>
 #include <linux/gfp.h>
+#include <linux/magic.h>
 
 #include <asm/processor.h>
 
@@ -2636,6 +2637,7 @@ static inline unsigned long stack_not_used(struct task_struct *p)
 	return (unsigned long)n - (unsigned long)end_of_stack(p);
 }
 #endif
+extern void set_task_stack_end_magic(struct task_struct *tsk);
 
 /* set thread flags in other task's structures
  * - see asm/thread_info.h for TIF_xxxx flags available
diff --git a/init/main.c b/init/main.c
index bb1aed9..5fc3fc7 100644
--- a/init/main.c
+++ b/init/main.c
@@ -508,6 +508,7 @@ asmlinkage __visible void __init start_kernel(void)
 	 * lockdep hash:
 	 */
 	lockdep_init();
+	set_task_stack_end_magic(&init_task);
 	smp_setup_processor_id();
 	debug_objects_early_init();
 
diff --git a/kernel/fork.c b/kernel/fork.c
index 0cf9cdb..adf9583 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -294,11 +294,18 @@ int __weak arch_dup_task_struct(struct task_struct *dst,
 	return 0;
 }
 
+void set_task_stack_end_magic(struct task_struct *tsk)
+{
+	unsigned long *stackend;
+
+	stackend = end_of_stack(tsk);
+	*stackend = STACK_END_MAGIC;	/* for overflow detection */
+}
+
 static struct task_struct *dup_task_struct(struct task_struct *orig)
 {
 	struct task_struct *tsk;
 	struct thread_info *ti;
-	unsigned long *stackend;
 	int node = tsk_fork_get_node(orig);
 	int err;
 
@@ -328,8 +335,7 @@ static struct task_struct *dup_task_struct(struct task_struct *orig)
 	setup_thread_stack(tsk, orig);
 	clear_user_return_notifier(tsk);
 	clear_tsk_need_resched(tsk);
-	stackend = end_of_stack(tsk);
-	*stackend = STACK_END_MAGIC;	/* for overflow detection */
+	set_task_stack_end_magic(tsk);
 
 #ifdef CONFIG_CC_STACKPROTECTOR
 	tsk->stack_canary = get_random_int();
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index 8a4e5cb..1636e41 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -13,7 +13,6 @@
 #include <linux/sysctl.h>
 #include <linux/init.h>
 #include <linux/fs.h>
-#include <linux/magic.h>
 
 #include <asm/setup.h>
 
@@ -171,8 +170,7 @@ check_stack(unsigned long ip, unsigned long *stack)
 			i++;
 	}
 
-	if ((current != &init_task &&
-		*(end_of_stack(current)) != STACK_END_MAGIC)) {
+	if (*end_of_stack(current) != STACK_END_MAGIC) {
 		print_max_stack();
 		BUG();
 	}
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 2/3] sched: Add helper for task stack page overrun checking
  2014-09-11 15:41           ` Aaron Tomlin
@ 2014-09-11 15:41             ` Aaron Tomlin
  -1 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-11 15:41 UTC (permalink / raw)
  To: peterz
  Cc: mingo, mingo, dzickus, bmr, jcastillo, atomlin, oleg, pzijlstr,
	riel, prarit, jgh, linux-kernel, tglx, x86, rostedt, hannes,
	aneesh.kumar, akpm, akpm, linuxppc-dev, minchan

This facility is used in a few places so let's introduce
a helper function to improve code readability.

Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
---
 arch/powerpc/mm/fault.c    | 4 +---
 arch/x86/mm/fault.c        | 4 +---
 include/linux/sched.h      | 2 ++
 kernel/trace/trace_stack.c | 2 +-
 4 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 35d0760c..99b2f27 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -507,7 +507,6 @@ bail:
 void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
 {
 	const struct exception_table_entry *entry;
-	unsigned long *stackend;
 
 	/* Are we prepared to handle this fault?  */
 	if ((entry = search_exception_tables(regs->nip)) != NULL) {
@@ -536,8 +535,7 @@ void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
 	printk(KERN_ALERT "Faulting instruction address: 0x%08lx\n",
 		regs->nip);
 
-	stackend = end_of_stack(current);
-	if (*stackend != STACK_END_MAGIC)
+	if (task_stack_end_corrupted(current))
 		printk(KERN_ALERT "Thread overran stack, or stack corrupted\n");
 
 	die("Kernel access of bad area", regs, sig);
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index bc23a70..6240bc7 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -648,7 +648,6 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 	   unsigned long address, int signal, int si_code)
 {
 	struct task_struct *tsk = current;
-	unsigned long *stackend;
 	unsigned long flags;
 	int sig;
 
@@ -708,8 +707,7 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 
 	show_fault_oops(regs, error_code, address);
 
-	stackend = end_of_stack(tsk);
-	if (*stackend != STACK_END_MAGIC)
+	if (task_stack_end_corrupted(tsk))
 		printk(KERN_EMERG "Thread overran stack, or stack corrupted\n");
 
 	tsk->thread.cr2		= address;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 7ef34b7..a80e35d 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2615,6 +2615,8 @@ static inline unsigned long *end_of_stack(struct task_struct *p)
 }
 
 #endif
+#define task_stack_end_corrupted(task) \
+		(*(end_of_stack(task)) != STACK_END_MAGIC)
 
 static inline int object_is_on_stack(void *obj)
 {
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index 1636e41..16eddb3 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -170,7 +170,7 @@ check_stack(unsigned long ip, unsigned long *stack)
 			i++;
 	}
 
-	if (*end_of_stack(current) != STACK_END_MAGIC) {
+	if (task_stack_end_corrupted(current)) {
 		print_max_stack();
 		BUG();
 	}
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 2/3] sched: Add helper for task stack page overrun checking
@ 2014-09-11 15:41             ` Aaron Tomlin
  0 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-11 15:41 UTC (permalink / raw)
  To: peterz
  Cc: dzickus, jcastillo, riel, x86, akpm, minchan, mingo, bmr, prarit,
	oleg, rostedt, linux-kernel, hannes, mingo, aneesh.kumar, akpm,
	atomlin, jgh, linuxppc-dev, tglx, pzijlstr

This facility is used in a few places so let's introduce
a helper function to improve code readability.

Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
---
 arch/powerpc/mm/fault.c    | 4 +---
 arch/x86/mm/fault.c        | 4 +---
 include/linux/sched.h      | 2 ++
 kernel/trace/trace_stack.c | 2 +-
 4 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 35d0760c..99b2f27 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -507,7 +507,6 @@ bail:
 void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
 {
 	const struct exception_table_entry *entry;
-	unsigned long *stackend;
 
 	/* Are we prepared to handle this fault?  */
 	if ((entry = search_exception_tables(regs->nip)) != NULL) {
@@ -536,8 +535,7 @@ void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
 	printk(KERN_ALERT "Faulting instruction address: 0x%08lx\n",
 		regs->nip);
 
-	stackend = end_of_stack(current);
-	if (*stackend != STACK_END_MAGIC)
+	if (task_stack_end_corrupted(current))
 		printk(KERN_ALERT "Thread overran stack, or stack corrupted\n");
 
 	die("Kernel access of bad area", regs, sig);
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index bc23a70..6240bc7 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -648,7 +648,6 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 	   unsigned long address, int signal, int si_code)
 {
 	struct task_struct *tsk = current;
-	unsigned long *stackend;
 	unsigned long flags;
 	int sig;
 
@@ -708,8 +707,7 @@ no_context(struct pt_regs *regs, unsigned long error_code,
 
 	show_fault_oops(regs, error_code, address);
 
-	stackend = end_of_stack(tsk);
-	if (*stackend != STACK_END_MAGIC)
+	if (task_stack_end_corrupted(tsk))
 		printk(KERN_EMERG "Thread overran stack, or stack corrupted\n");
 
 	tsk->thread.cr2		= address;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 7ef34b7..a80e35d 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2615,6 +2615,8 @@ static inline unsigned long *end_of_stack(struct task_struct *p)
 }
 
 #endif
+#define task_stack_end_corrupted(task) \
+		(*(end_of_stack(task)) != STACK_END_MAGIC)
 
 static inline int object_is_on_stack(void *obj)
 {
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index 1636e41..16eddb3 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -170,7 +170,7 @@ check_stack(unsigned long ip, unsigned long *stack)
 			i++;
 	}
 
-	if (*end_of_stack(current) != STACK_END_MAGIC) {
+	if (task_stack_end_corrupted(current)) {
 		print_max_stack();
 		BUG();
 	}
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 3/3] sched: BUG when stack end location is over written
  2014-09-11 15:41           ` Aaron Tomlin
@ 2014-09-11 15:41             ` Aaron Tomlin
  -1 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-11 15:41 UTC (permalink / raw)
  To: peterz
  Cc: mingo, mingo, dzickus, bmr, jcastillo, atomlin, oleg, pzijlstr,
	riel, prarit, jgh, linux-kernel, tglx, x86, rostedt, hannes,
	aneesh.kumar, akpm, akpm, linuxppc-dev, minchan

Currently in the event of a stack overrun a call to schedule()
does not check for this type of corruption. This corruption is
often silent and can go unnoticed. However once the corrupted
region is examined at a later stage, the outcome is undefined
and often results in a sporadic page fault which cannot be
handled.

This patch checks for a stack overrun and takes appropriate
action since the damage is already done, there is no point
in continuing.

Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
---
 kernel/sched/core.c |  3 +++
 lib/Kconfig.debug   | 12 ++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ec1a286..0b70b73 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2660,6 +2660,9 @@ static noinline void __schedule_bug(struct task_struct *prev)
  */
 static inline void schedule_debug(struct task_struct *prev)
 {
+#ifdef CONFIG_SCHED_STACK_END_CHECK
+	BUG_ON(unlikely(task_stack_end_corrupted(prev)))
+#endif
 	/*
 	 * Test if we are atomic. Since do_exit() needs to call into
 	 * schedule() atomically, we ignore that path. Otherwise whine
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index a285900..2a8280a 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -824,6 +824,18 @@ config SCHEDSTATS
 	  application, you can say N to avoid the very slight overhead
 	  this adds.
 
+config SCHED_STACK_END_CHECK
+	bool "Detect stack corruption on calls to schedule()"
+	depends on DEBUG_KERNEL
+	default y
+	help
+	  This option checks for a stack overrun on calls to schedule().
+	  If the stack end location is found to be over written always panic as
+	  the content of the corrupted region can no longer be trusted.
+	  This is to ensure no erroneous behaviour occurs which could result in
+	  data corruption or a sporadic crash at a later stage once the region
+	  is examined. The runtime overhead introduced is minimal.
+
 config TIMER_STATS
 	bool "Collect kernel timers statistics"
 	depends on DEBUG_KERNEL && PROC_FS
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 74+ messages in thread

* [PATCH v3 3/3] sched: BUG when stack end location is over written
@ 2014-09-11 15:41             ` Aaron Tomlin
  0 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-11 15:41 UTC (permalink / raw)
  To: peterz
  Cc: dzickus, jcastillo, riel, x86, akpm, minchan, mingo, bmr, prarit,
	oleg, rostedt, linux-kernel, hannes, mingo, aneesh.kumar, akpm,
	atomlin, jgh, linuxppc-dev, tglx, pzijlstr

Currently in the event of a stack overrun a call to schedule()
does not check for this type of corruption. This corruption is
often silent and can go unnoticed. However once the corrupted
region is examined at a later stage, the outcome is undefined
and often results in a sporadic page fault which cannot be
handled.

This patch checks for a stack overrun and takes appropriate
action since the damage is already done, there is no point
in continuing.

Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
---
 kernel/sched/core.c |  3 +++
 lib/Kconfig.debug   | 12 ++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ec1a286..0b70b73 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2660,6 +2660,9 @@ static noinline void __schedule_bug(struct task_struct *prev)
  */
 static inline void schedule_debug(struct task_struct *prev)
 {
+#ifdef CONFIG_SCHED_STACK_END_CHECK
+	BUG_ON(unlikely(task_stack_end_corrupted(prev)))
+#endif
 	/*
 	 * Test if we are atomic. Since do_exit() needs to call into
 	 * schedule() atomically, we ignore that path. Otherwise whine
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index a285900..2a8280a 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -824,6 +824,18 @@ config SCHEDSTATS
 	  application, you can say N to avoid the very slight overhead
 	  this adds.
 
+config SCHED_STACK_END_CHECK
+	bool "Detect stack corruption on calls to schedule()"
+	depends on DEBUG_KERNEL
+	default y
+	help
+	  This option checks for a stack overrun on calls to schedule().
+	  If the stack end location is found to be over written always panic as
+	  the content of the corrupted region can no longer be trusted.
+	  This is to ensure no erroneous behaviour occurs which could result in
+	  data corruption or a sporadic crash at a later stage once the region
+	  is examined. The runtime overhead introduced is minimal.
+
 config TIMER_STATS
 	bool "Collect kernel timers statistics"
 	depends on DEBUG_KERNEL && PROC_FS
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 0/3] sched: Always check the integrity of the canary
  2014-09-11 15:41           ` Aaron Tomlin
@ 2014-09-11 15:53             ` Peter Zijlstra
  -1 siblings, 0 replies; 74+ messages in thread
From: Peter Zijlstra @ 2014-09-11 15:53 UTC (permalink / raw)
  To: Aaron Tomlin
  Cc: mingo, mingo, dzickus, bmr, jcastillo, oleg, pzijlstr, riel,
	prarit, jgh, linux-kernel, tglx, x86, rostedt, hannes,
	aneesh.kumar, akpm, akpm, linuxppc-dev, minchan


What's with the threading all versions together? Please don't do that --
also don't post a new version just for this though.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 0/3] sched: Always check the integrity of the canary
@ 2014-09-11 15:53             ` Peter Zijlstra
  0 siblings, 0 replies; 74+ messages in thread
From: Peter Zijlstra @ 2014-09-11 15:53 UTC (permalink / raw)
  To: Aaron Tomlin
  Cc: dzickus, jcastillo, riel, x86, akpm, mingo, bmr, prarit, oleg,
	rostedt, linux-kernel, minchan, mingo, aneesh.kumar, akpm,
	hannes, jgh, linuxppc-dev, tglx, pzijlstr


What's with the threading all versions together? Please don't do that --
also don't post a new version just for this though.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 0/3] sched: Always check the integrity of the canary
  2014-09-11 15:53             ` Peter Zijlstra
@ 2014-09-11 15:59               ` Aaron Tomlin
  -1 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-11 15:59 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: mingo, mingo, dzickus, bmr, jcastillo, oleg, pzijlstr, riel,
	prarit, jgh, linux-kernel, tglx, x86, rostedt, hannes,
	aneesh.kumar, akpm, akpm, linuxppc-dev, minchan

On Thu, Sep 11, 2014 at 05:53:03PM +0200, Peter Zijlstra wrote:
> 
> What's with the threading all versions together? Please don't do that --
> also don't post a new version just for this though.

Sorry about that. Noted.

-- 
Aaron Tomlin

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 0/3] sched: Always check the integrity of the canary
@ 2014-09-11 15:59               ` Aaron Tomlin
  0 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-11 15:59 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: dzickus, jcastillo, riel, x86, akpm, mingo, bmr, prarit, oleg,
	rostedt, linux-kernel, minchan, mingo, aneesh.kumar, akpm,
	hannes, jgh, linuxppc-dev, tglx, pzijlstr

On Thu, Sep 11, 2014 at 05:53:03PM +0200, Peter Zijlstra wrote:
> 
> What's with the threading all versions together? Please don't do that --
> also don't post a new version just for this though.

Sorry about that. Noted.

-- 
Aaron Tomlin

^ permalink raw reply	[flat|nested] 74+ messages in thread

* RE: [PATCH v3 0/3] sched: Always check the integrity of the canary
  2014-09-11 15:41           ` Aaron Tomlin
@ 2014-09-11 16:02             ` David Laight
  -1 siblings, 0 replies; 74+ messages in thread
From: David Laight @ 2014-09-11 16:02 UTC (permalink / raw)
  To: 'Aaron Tomlin', peterz
  Cc: dzickus, jcastillo, riel, x86, akpm, minchan, mingo, bmr, prarit,
	oleg, rostedt, linux-kernel, hannes, mingo, aneesh.kumar, akpm,
	jgh, linuxppc-dev, tglx, pzijlstr

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 1330 bytes --]

From: Aaron Tomlin
> Currently in the event of a stack overrun a call to schedule()
> does not check for this type of corruption. This corruption is
> often silent and can go unnoticed. However once the corrupted
> region is examined at a later stage, the outcome is undefined
> and often results in a sporadic page fault which cannot be
> handled.
> 
> The first patch adds a canary to init_task's end of stack.
> While the second patch provides a helper to determine the
> integrity of the canary. The third checks for a stack
> overrun and takes appropriate action since the damage
> is already done, there is no point in continuing.

Clearly you've just been 'bitten' by a kernel stack overflow.
But a simple 'canary' isn't going to find most of the overflows
and will give an incorrect 'sense of security'.

The canary will only work if the stack is densely written.
In practise the stack alignment rules create gaps, and the
most likely reason for overflow is a large on-stack buffer
that isn't actually written to.

The only real way to detect kernel stack overflow is to arrange
for an unmapped page beyond the stack.
That costs KVA, but not much else.

	David

ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 74+ messages in thread

* RE: [PATCH v3 0/3] sched: Always check the integrity of the canary
@ 2014-09-11 16:02             ` David Laight
  0 siblings, 0 replies; 74+ messages in thread
From: David Laight @ 2014-09-11 16:02 UTC (permalink / raw)
  To: 'Aaron Tomlin', peterz
  Cc: dzickus, jcastillo, riel, prarit, pzijlstr, mingo, bmr, x86,
	oleg, rostedt, linux-kernel, minchan, mingo, tglx, aneesh.kumar,
	hannes, akpm, linuxppc-dev, jgh, akpm

RnJvbTogQWFyb24gVG9tbGluDQo+IEN1cnJlbnRseSBpbiB0aGUgZXZlbnQgb2YgYSBzdGFjayBv
dmVycnVuIGEgY2FsbCB0byBzY2hlZHVsZSgpDQo+IGRvZXMgbm90IGNoZWNrIGZvciB0aGlzIHR5
cGUgb2YgY29ycnVwdGlvbi4gVGhpcyBjb3JydXB0aW9uIGlzDQo+IG9mdGVuIHNpbGVudCBhbmQg
Y2FuIGdvIHVubm90aWNlZC4gSG93ZXZlciBvbmNlIHRoZSBjb3JydXB0ZWQNCj4gcmVnaW9uIGlz
IGV4YW1pbmVkIGF0IGEgbGF0ZXIgc3RhZ2UsIHRoZSBvdXRjb21lIGlzIHVuZGVmaW5lZA0KPiBh
bmQgb2Z0ZW4gcmVzdWx0cyBpbiBhIHNwb3JhZGljIHBhZ2UgZmF1bHQgd2hpY2ggY2Fubm90IGJl
DQo+IGhhbmRsZWQuDQo+IA0KPiBUaGUgZmlyc3QgcGF0Y2ggYWRkcyBhIGNhbmFyeSB0byBpbml0
X3Rhc2sncyBlbmQgb2Ygc3RhY2suDQo+IFdoaWxlIHRoZSBzZWNvbmQgcGF0Y2ggcHJvdmlkZXMg
YSBoZWxwZXIgdG8gZGV0ZXJtaW5lIHRoZQ0KPiBpbnRlZ3JpdHkgb2YgdGhlIGNhbmFyeS4gVGhl
IHRoaXJkIGNoZWNrcyBmb3IgYSBzdGFjaw0KPiBvdmVycnVuIGFuZCB0YWtlcyBhcHByb3ByaWF0
ZSBhY3Rpb24gc2luY2UgdGhlIGRhbWFnZQ0KPiBpcyBhbHJlYWR5IGRvbmUsIHRoZXJlIGlzIG5v
IHBvaW50IGluIGNvbnRpbnVpbmcuDQoNCkNsZWFybHkgeW91J3ZlIGp1c3QgYmVlbiAnYml0dGVu
JyBieSBhIGtlcm5lbCBzdGFjayBvdmVyZmxvdy4NCkJ1dCBhIHNpbXBsZSAnY2FuYXJ5JyBpc24n
dCBnb2luZyB0byBmaW5kIG1vc3Qgb2YgdGhlIG92ZXJmbG93cw0KYW5kIHdpbGwgZ2l2ZSBhbiBp
bmNvcnJlY3QgJ3NlbnNlIG9mIHNlY3VyaXR5Jy4NCg0KVGhlIGNhbmFyeSB3aWxsIG9ubHkgd29y
ayBpZiB0aGUgc3RhY2sgaXMgZGVuc2VseSB3cml0dGVuLg0KSW4gcHJhY3Rpc2UgdGhlIHN0YWNr
IGFsaWdubWVudCBydWxlcyBjcmVhdGUgZ2FwcywgYW5kIHRoZQ0KbW9zdCBsaWtlbHkgcmVhc29u
IGZvciBvdmVyZmxvdyBpcyBhIGxhcmdlIG9uLXN0YWNrIGJ1ZmZlcg0KdGhhdCBpc24ndCBhY3R1
YWxseSB3cml0dGVuIHRvLg0KDQpUaGUgb25seSByZWFsIHdheSB0byBkZXRlY3Qga2VybmVsIHN0
YWNrIG92ZXJmbG93IGlzIHRvIGFycmFuZ2UNCmZvciBhbiB1bm1hcHBlZCBwYWdlIGJleW9uZCB0
aGUgc3RhY2suDQpUaGF0IGNvc3RzIEtWQSwgYnV0IG5vdCBtdWNoIGVsc2UuDQoNCglEYXZpZA0K
DQo=

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 0/3] sched: Always check the integrity of the canary
  2014-09-11 16:02             ` David Laight
@ 2014-09-11 17:26               ` Chuck Ebbert
  -1 siblings, 0 replies; 74+ messages in thread
From: Chuck Ebbert @ 2014-09-11 17:26 UTC (permalink / raw)
  To: David Laight
  Cc: 'Aaron Tomlin',
	peterz, dzickus, jcastillo, riel, x86, akpm, minchan, mingo, bmr,
	prarit, oleg, rostedt, linux-kernel, hannes, mingo, aneesh.kumar,
	akpm, jgh, linuxppc-dev, tglx, pzijlstr

On Thu, 11 Sep 2014 16:02:45 +0000
David Laight <David.Laight@ACULAB.COM> wrote:

> From: Aaron Tomlin
> > Currently in the event of a stack overrun a call to schedule()
> > does not check for this type of corruption. This corruption is
> > often silent and can go unnoticed. However once the corrupted
> > region is examined at a later stage, the outcome is undefined
> > and often results in a sporadic page fault which cannot be
> > handled.
> > 
> > The first patch adds a canary to init_task's end of stack.
> > While the second patch provides a helper to determine the
> > integrity of the canary. The third checks for a stack
> > overrun and takes appropriate action since the damage
> > is already done, there is no point in continuing.
> 
> Clearly you've just been 'bitten' by a kernel stack overflow.
> But a simple 'canary' isn't going to find most of the overflows
> and will give an incorrect 'sense of security'.
> 
> The canary will only work if the stack is densely written.
> In practise the stack alignment rules create gaps, and the
> most likely reason for overflow is a large on-stack buffer
> that isn't actually written to.
> 
> The only real way to detect kernel stack overflow is to arrange
> for an unmapped page beyond the stack.
> That costs KVA, but not much else.
> 

That doesn't work either, because the threadinfo sits between the end of the
stack and the beginning of the next page, making it possible to corrupt critical
data without running off the page.


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 0/3] sched: Always check the integrity of the canary
@ 2014-09-11 17:26               ` Chuck Ebbert
  0 siblings, 0 replies; 74+ messages in thread
From: Chuck Ebbert @ 2014-09-11 17:26 UTC (permalink / raw)
  To: David Laight
  Cc: jcastillo, akpm, peterz, bmr, linux-kernel, prarit, x86, mingo,
	'Aaron Tomlin',
	dzickus, riel, mingo, rostedt, tglx, jgh, pzijlstr, oleg,
	minchan, aneesh.kumar, hannes, akpm, linuxppc-dev

On Thu, 11 Sep 2014 16:02:45 +0000
David Laight <David.Laight@ACULAB.COM> wrote:

> From: Aaron Tomlin
> > Currently in the event of a stack overrun a call to schedule()
> > does not check for this type of corruption. This corruption is
> > often silent and can go unnoticed. However once the corrupted
> > region is examined at a later stage, the outcome is undefined
> > and often results in a sporadic page fault which cannot be
> > handled.
> > 
> > The first patch adds a canary to init_task's end of stack.
> > While the second patch provides a helper to determine the
> > integrity of the canary. The third checks for a stack
> > overrun and takes appropriate action since the damage
> > is already done, there is no point in continuing.
> 
> Clearly you've just been 'bitten' by a kernel stack overflow.
> But a simple 'canary' isn't going to find most of the overflows
> and will give an incorrect 'sense of security'.
> 
> The canary will only work if the stack is densely written.
> In practise the stack alignment rules create gaps, and the
> most likely reason for overflow is a large on-stack buffer
> that isn't actually written to.
> 
> The only real way to detect kernel stack overflow is to arrange
> for an unmapped page beyond the stack.
> That costs KVA, but not much else.
> 

That doesn't work either, because the threadinfo sits between the end of the
stack and the beginning of the next page, making it possible to corrupt critical
data without running off the page.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 0/3] sched: Always check the integrity of the canary
  2014-09-11 16:02             ` David Laight
@ 2014-09-11 17:44               ` Aaron Tomlin
  -1 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-11 17:44 UTC (permalink / raw)
  To: David Laight
  Cc: peterz, dzickus, jcastillo, riel, x86, akpm, minchan, mingo, bmr,
	prarit, oleg, rostedt, linux-kernel, hannes, mingo, aneesh.kumar,
	akpm, jgh, linuxppc-dev, tglx, pzijlstr

On Thu, Sep 11, 2014 at 04:02:45PM +0000, David Laight wrote:
> From: Aaron Tomlin
> > Currently in the event of a stack overrun a call to schedule()
> > does not check for this type of corruption. This corruption is
> > often silent and can go unnoticed. However once the corrupted
> > region is examined at a later stage, the outcome is undefined
> > and often results in a sporadic page fault which cannot be
> > handled.
> > 
> > The first patch adds a canary to init_task's end of stack.
> > While the second patch provides a helper to determine the
> > integrity of the canary. The third checks for a stack
> > overrun and takes appropriate action since the damage
> > is already done, there is no point in continuing.
> 
> Clearly you've just been 'bitten' by a kernel stack overflow.
> But a simple 'canary' isn't going to find most of the overflows
> and will give an incorrect 'sense of security'.

Please note that this is not suppose to be a 'perfect' solution.
Rather a worth while check in this particular code path.
Let's assume that the canary is damaged. In this situation it is
rather likely that the thread_info object has been compromised too.

-- 
Aaron Tomlin

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 0/3] sched: Always check the integrity of the canary
@ 2014-09-11 17:44               ` Aaron Tomlin
  0 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-11 17:44 UTC (permalink / raw)
  To: David Laight
  Cc: dzickus, jcastillo, riel, prarit, mingo, pzijlstr, peterz, bmr,
	x86, oleg, rostedt, linux-kernel, minchan, mingo, tglx,
	aneesh.kumar, hannes, akpm, linuxppc-dev, jgh, akpm

On Thu, Sep 11, 2014 at 04:02:45PM +0000, David Laight wrote:
> From: Aaron Tomlin
> > Currently in the event of a stack overrun a call to schedule()
> > does not check for this type of corruption. This corruption is
> > often silent and can go unnoticed. However once the corrupted
> > region is examined at a later stage, the outcome is undefined
> > and often results in a sporadic page fault which cannot be
> > handled.
> > 
> > The first patch adds a canary to init_task's end of stack.
> > While the second patch provides a helper to determine the
> > integrity of the canary. The third checks for a stack
> > overrun and takes appropriate action since the damage
> > is already done, there is no point in continuing.
> 
> Clearly you've just been 'bitten' by a kernel stack overflow.
> But a simple 'canary' isn't going to find most of the overflows
> and will give an incorrect 'sense of security'.

Please note that this is not suppose to be a 'perfect' solution.
Rather a worth while check in this particular code path.
Let's assume that the canary is damaged. In this situation it is
rather likely that the thread_info object has been compromised too.

-- 
Aaron Tomlin

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 3/3] sched: BUG when stack end location is over written
  2014-09-11 15:41             ` Aaron Tomlin
@ 2014-09-12  4:06               ` Michael Ellerman
  -1 siblings, 0 replies; 74+ messages in thread
From: Michael Ellerman @ 2014-09-12  4:06 UTC (permalink / raw)
  To: Aaron Tomlin
  Cc: peterz, dzickus, jcastillo, riel, x86, akpm, minchan, mingo, bmr,
	prarit, oleg, rostedt, linux-kernel, hannes, mingo, aneesh.kumar,
	akpm, jgh, linuxppc-dev, tglx, pzijlstr

On Thu, 2014-09-11 at 16:41 +0100, Aaron Tomlin wrote:
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index a285900..2a8280a 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -824,6 +824,18 @@ config SCHEDSTATS
>  	  application, you can say N to avoid the very slight overhead
>  	  this adds.
>  
> +config SCHED_STACK_END_CHECK
> +	bool "Detect stack corruption on calls to schedule()"
> +	depends on DEBUG_KERNEL
> +	default y

Did you really mean default y?

Doing so means it will be turned on more or less everywhere, which defeats the
purpose of having a config option in the first place.

> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index ec1a286..0b70b73 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2660,6 +2660,9 @@ static noinline void __schedule_bug(struct task_struct *prev)
>   */
>  static inline void schedule_debug(struct task_struct *prev)
>  {
> +#ifdef CONFIG_SCHED_STACK_END_CHECK
> +	BUG_ON(unlikely(task_stack_end_corrupted(prev)))
> +#endif

If this was my code I'd make you put that in a static inline.

cheers




^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 3/3] sched: BUG when stack end location is over written
@ 2014-09-12  4:06               ` Michael Ellerman
  0 siblings, 0 replies; 74+ messages in thread
From: Michael Ellerman @ 2014-09-12  4:06 UTC (permalink / raw)
  To: Aaron Tomlin
  Cc: dzickus, jcastillo, riel, prarit, mingo, pzijlstr, peterz, bmr,
	x86, oleg, rostedt, linux-kernel, minchan, mingo, tglx,
	aneesh.kumar, hannes, akpm, linuxppc-dev, jgh, akpm

On Thu, 2014-09-11 at 16:41 +0100, Aaron Tomlin wrote:
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index a285900..2a8280a 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -824,6 +824,18 @@ config SCHEDSTATS
>  	  application, you can say N to avoid the very slight overhead
>  	  this adds.
>  
> +config SCHED_STACK_END_CHECK
> +	bool "Detect stack corruption on calls to schedule()"
> +	depends on DEBUG_KERNEL
> +	default y

Did you really mean default y?

Doing so means it will be turned on more or less everywhere, which defeats the
purpose of having a config option in the first place.

> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index ec1a286..0b70b73 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2660,6 +2660,9 @@ static noinline void __schedule_bug(struct task_struct *prev)
>   */
>  static inline void schedule_debug(struct task_struct *prev)
>  {
> +#ifdef CONFIG_SCHED_STACK_END_CHECK
> +	BUG_ON(unlikely(task_stack_end_corrupted(prev)))
> +#endif

If this was my code I'd make you put that in a static inline.

cheers

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 3/3] sched: BUG when stack end location is over written
  2014-09-11 15:41             ` Aaron Tomlin
@ 2014-09-12  6:04               ` Michael Ellerman
  -1 siblings, 0 replies; 74+ messages in thread
From: Michael Ellerman @ 2014-09-12  6:04 UTC (permalink / raw)
  To: Aaron Tomlin
  Cc: peterz, dzickus, jcastillo, riel, x86, akpm, minchan, bmr,
	prarit, oleg, rostedt, linux-kernel, hannes, mingo, aneesh.kumar,
	akpm, jgh, linuxppc-dev, tglx, pzijlstr

On Thu, 2014-09-11 at 16:41 +0100, Aaron Tomlin wrote:
> Currently in the event of a stack overrun a call to schedule()
> does not check for this type of corruption. This corruption is
> often silent and can go unnoticed. However once the corrupted
> region is examined at a later stage, the outcome is undefined
> and often results in a sporadic page fault which cannot be
> handled.
> 
> This patch checks for a stack overrun and takes appropriate
> action since the damage is already done, there is no point
> in continuing.
> 
> Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
> ---
>  kernel/sched/core.c |  3 +++
>  lib/Kconfig.debug   | 12 ++++++++++++
>  2 files changed, 15 insertions(+)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index ec1a286..0b70b73 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2660,6 +2660,9 @@ static noinline void __schedule_bug(struct task_struct *prev)
>   */
>  static inline void schedule_debug(struct task_struct *prev)
>  {
> +#ifdef CONFIG_SCHED_STACK_END_CHECK
> +	BUG_ON(unlikely(task_stack_end_corrupted(prev)))
> +#endif

Spot the bug? Please compile your code in future.


../kernel/sched/core.c: In function ‘schedule_debug’:
../kernel/sched/core.c:2671:2: error: expected ‘;’ before ‘if’
  if (unlikely(in_atomic_preempt_off() && prev->state != TASK_DEAD))
  ^
../kernel/sched/core.c: At top level:
../kernel/sched/core.c:2635:22: warning: ‘__schedule_bug’ defined but not used [-Wunused-function]
 static noinline void __schedule_bug(struct task_struct *prev)
                      ^
make[3]: *** [kernel/sched/core.o] Error 1
make[2]: *** [kernel/sched] Error 2
make[1]: *** [kernel] Error 2
make: *** [sub-make] Error 2


cheers



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 3/3] sched: BUG when stack end location is over written
@ 2014-09-12  6:04               ` Michael Ellerman
  0 siblings, 0 replies; 74+ messages in thread
From: Michael Ellerman @ 2014-09-12  6:04 UTC (permalink / raw)
  To: Aaron Tomlin
  Cc: dzickus, jcastillo, riel, prarit, pzijlstr, peterz, bmr, x86,
	oleg, rostedt, linux-kernel, minchan, mingo, tglx, aneesh.kumar,
	hannes, akpm, linuxppc-dev, jgh, akpm

On Thu, 2014-09-11 at 16:41 +0100, Aaron Tomlin wrote:
> Currently in the event of a stack overrun a call to schedule()
> does not check for this type of corruption. This corruption is
> often silent and can go unnoticed. However once the corrupted
> region is examined at a later stage, the outcome is undefined
> and often results in a sporadic page fault which cannot be
> handled.
> 
> This patch checks for a stack overrun and takes appropriate
> action since the damage is already done, there is no point
> in continuing.
> 
> Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
> ---
>  kernel/sched/core.c |  3 +++
>  lib/Kconfig.debug   | 12 ++++++++++++
>  2 files changed, 15 insertions(+)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index ec1a286..0b70b73 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2660,6 +2660,9 @@ static noinline void __schedule_bug(struct task_struct *prev)
>   */
>  static inline void schedule_debug(struct task_struct *prev)
>  {
> +#ifdef CONFIG_SCHED_STACK_END_CHECK
> +	BUG_ON(unlikely(task_stack_end_corrupted(prev)))
> +#endif

Spot the bug? Please compile your code in future.


../kernel/sched/core.c: In function ‘schedule_debug’:
../kernel/sched/core.c:2671:2: error: expected ‘;’ before ‘if’
  if (unlikely(in_atomic_preempt_off() && prev->state != TASK_DEAD))
  ^
../kernel/sched/core.c: At top level:
../kernel/sched/core.c:2635:22: warning: ‘__schedule_bug’ defined but not used [-Wunused-function]
 static noinline void __schedule_bug(struct task_struct *prev)
                      ^
make[3]: *** [kernel/sched/core.o] Error 1
make[2]: *** [kernel/sched] Error 2
make[1]: *** [kernel] Error 2
make: *** [sub-make] Error 2


cheers

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 1/3] init/main.c: Give init_task a canary
  2014-09-11 15:41             ` Aaron Tomlin
@ 2014-09-12  7:28               ` Michael Ellerman
  -1 siblings, 0 replies; 74+ messages in thread
From: Michael Ellerman @ 2014-09-12  7:28 UTC (permalink / raw)
  To: Aaron Tomlin
  Cc: peterz, dzickus, jcastillo, riel, x86, akpm, minchan, bmr,
	prarit, oleg, rostedt, linux-kernel, hannes, mingo, aneesh.kumar,
	akpm, jgh, linuxppc-dev, tglx, pzijlstr

On Thu, 2014-09-11 at 16:41 +0100, Aaron Tomlin wrote:
> Tasks get their end of stack set to STACK_END_MAGIC with the
> aim to catch stack overruns. Currently this feature does not
> apply to init_task. This patch removes this restriction.
> 
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index 51ab9e7..35d0760c 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -30,7 +30,6 @@
>  #include <linux/kprobes.h>
>  #include <linux/kdebug.h>
>  #include <linux/perf_event.h>
> -#include <linux/magic.h>
>  #include <linux/ratelimit.h>
>  #include <linux/context_tracking.h>
>  
> @@ -538,7 +537,7 @@ void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
>  		regs->nip);
>  
>  	stackend = end_of_stack(current);
> -	if (current != &init_task && *stackend != STACK_END_MAGIC)
> +	if (*stackend != STACK_END_MAGIC)
>  		printk(KERN_ALERT "Thread overran stack, or stack corrupted\n");

This part looks fine.

Acked-by: Michael Ellerman <mpe@ellerman.id.au>

cheers



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 1/3] init/main.c: Give init_task a canary
@ 2014-09-12  7:28               ` Michael Ellerman
  0 siblings, 0 replies; 74+ messages in thread
From: Michael Ellerman @ 2014-09-12  7:28 UTC (permalink / raw)
  To: Aaron Tomlin
  Cc: dzickus, jcastillo, riel, prarit, pzijlstr, peterz, bmr, x86,
	oleg, rostedt, linux-kernel, minchan, mingo, tglx, aneesh.kumar,
	hannes, akpm, linuxppc-dev, jgh, akpm

On Thu, 2014-09-11 at 16:41 +0100, Aaron Tomlin wrote:
> Tasks get their end of stack set to STACK_END_MAGIC with the
> aim to catch stack overruns. Currently this feature does not
> apply to init_task. This patch removes this restriction.
> 
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index 51ab9e7..35d0760c 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -30,7 +30,6 @@
>  #include <linux/kprobes.h>
>  #include <linux/kdebug.h>
>  #include <linux/perf_event.h>
> -#include <linux/magic.h>
>  #include <linux/ratelimit.h>
>  #include <linux/context_tracking.h>
>  
> @@ -538,7 +537,7 @@ void bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
>  		regs->nip);
>  
>  	stackend = end_of_stack(current);
> -	if (current != &init_task && *stackend != STACK_END_MAGIC)
> +	if (*stackend != STACK_END_MAGIC)
>  		printk(KERN_ALERT "Thread overran stack, or stack corrupted\n");

This part looks fine.

Acked-by: Michael Ellerman <mpe@ellerman.id.au>

cheers

^ permalink raw reply	[flat|nested] 74+ messages in thread

* RE: [PATCH v3 0/3] sched: Always check the integrity of the canary
  2014-09-11 17:26               ` Chuck Ebbert
@ 2014-09-12  8:43                 ` David Laight
  -1 siblings, 0 replies; 74+ messages in thread
From: David Laight @ 2014-09-12  8:43 UTC (permalink / raw)
  To: 'Chuck Ebbert'
  Cc: 'Aaron Tomlin',
	peterz, dzickus, jcastillo, riel, x86, akpm, minchan, mingo, bmr,
	prarit, oleg, rostedt, linux-kernel, hannes, mingo, aneesh.kumar,
	akpm, jgh, linuxppc-dev, tglx, pzijlstr

From: Chuck Ebbert 
> David Laight <David.Laight@ACULAB.COM> wrote:
> 
> > From: Aaron Tomlin
> > > Currently in the event of a stack overrun a call to schedule()
> > > does not check for this type of corruption. This corruption is
> > > often silent and can go unnoticed. However once the corrupted
> > > region is examined at a later stage, the outcome is undefined
> > > and often results in a sporadic page fault which cannot be
> > > handled.
> > >
> > > The first patch adds a canary to init_task's end of stack.
> > > While the second patch provides a helper to determine the
> > > integrity of the canary. The third checks for a stack
> > > overrun and takes appropriate action since the damage
> > > is already done, there is no point in continuing.
> >
> > Clearly you've just been 'bitten' by a kernel stack overflow.
> > But a simple 'canary' isn't going to find most of the overflows
> > and will give an incorrect 'sense of security'.
> >
> > The canary will only work if the stack is densely written.
> > In practise the stack alignment rules create gaps, and the
> > most likely reason for overflow is a large on-stack buffer
> > that isn't actually written to.
> >
> > The only real way to detect kernel stack overflow is to arrange
> > for an unmapped page beyond the stack.
> > That costs KVA, but not much else.
> >
> 
> That doesn't work either, because the threadinfo sits between the end of the
> stack and the beginning of the next page, making it possible to corrupt critical
> data without running off the page.

Then flip the order of the allocations so that the threadinfo is at the other end.
I'm not sure how many per-lwp structures the linux kernel has, but
I know that on netbsd the main thing the kernel stack hits is the fpu
save area - the end of that is the avx area which won't be needed when
the stack usage is large.
Everything else could be moved from the stack pages to the lwp struct itself.

	David




^ permalink raw reply	[flat|nested] 74+ messages in thread

* RE: [PATCH v3 0/3] sched: Always check the integrity of the canary
@ 2014-09-12  8:43                 ` David Laight
  0 siblings, 0 replies; 74+ messages in thread
From: David Laight @ 2014-09-12  8:43 UTC (permalink / raw)
  To: 'Chuck Ebbert'
  Cc: jcastillo, akpm, peterz, bmr, linux-kernel, prarit, x86, mingo,
	'Aaron Tomlin',
	dzickus, riel, mingo, rostedt, tglx, jgh, pzijlstr, oleg,
	minchan, aneesh.kumar, hannes, akpm, linuxppc-dev

From: Chuck Ebbert=20
> David Laight <David.Laight@ACULAB.COM> wrote:
>=20
> > From: Aaron Tomlin
> > > Currently in the event of a stack overrun a call to schedule()
> > > does not check for this type of corruption. This corruption is
> > > often silent and can go unnoticed. However once the corrupted
> > > region is examined at a later stage, the outcome is undefined
> > > and often results in a sporadic page fault which cannot be
> > > handled.
> > >
> > > The first patch adds a canary to init_task's end of stack.
> > > While the second patch provides a helper to determine the
> > > integrity of the canary. The third checks for a stack
> > > overrun and takes appropriate action since the damage
> > > is already done, there is no point in continuing.
> >
> > Clearly you've just been 'bitten' by a kernel stack overflow.
> > But a simple 'canary' isn't going to find most of the overflows
> > and will give an incorrect 'sense of security'.
> >
> > The canary will only work if the stack is densely written.
> > In practise the stack alignment rules create gaps, and the
> > most likely reason for overflow is a large on-stack buffer
> > that isn't actually written to.
> >
> > The only real way to detect kernel stack overflow is to arrange
> > for an unmapped page beyond the stack.
> > That costs KVA, but not much else.
> >
>=20
> That doesn't work either, because the threadinfo sits between the end of =
the
> stack and the beginning of the next page, making it possible to corrupt c=
ritical
> data without running off the page.

Then flip the order of the allocations so that the threadinfo is at the oth=
er end.
I'm not sure how many per-lwp structures the linux kernel has, but
I know that on netbsd the main thing the kernel stack hits is the fpu
save area - the end of that is the avx area which won't be needed when
the stack usage is large.
Everything else could be moved from the stack pages to the lwp struct itsel=
f.

	David

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 3/3] sched: BUG when stack end location is over written
  2014-09-12  4:06               ` Michael Ellerman
@ 2014-09-12  9:44                 ` Aaron Tomlin
  -1 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-12  9:44 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: peterz, dzickus, jcastillo, riel, x86, akpm, minchan, bmr,
	prarit, oleg, rostedt, linux-kernel, hannes, mingo, aneesh.kumar,
	akpm, jgh, linuxppc-dev, tglx, pzijlstr

On Fri, Sep 12, 2014 at 02:06:57PM +1000, Michael Ellerman wrote:
> On Thu, 2014-09-11 at 16:41 +0100, Aaron Tomlin wrote:
> > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > index a285900..2a8280a 100644
> > --- a/lib/Kconfig.debug
> > +++ b/lib/Kconfig.debug
> > @@ -824,6 +824,18 @@ config SCHEDSTATS
> >  	  application, you can say N to avoid the very slight overhead
> >  	  this adds.
> >  
> > +config SCHED_STACK_END_CHECK
> > +	bool "Detect stack corruption on calls to schedule()"
> > +	depends on DEBUG_KERNEL
> > +	default y
> 
> Did you really mean default y?
> 
> Doing so means it will be turned on more or less everywhere, which defeats the
> purpose of having a config option in the first place.

Only if Kconfig CONFIG_DEBUG_KERNEL is enabled in the first place.

-- 
Aaron Tomlin

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 3/3] sched: BUG when stack end location is over written
@ 2014-09-12  9:44                 ` Aaron Tomlin
  0 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-12  9:44 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: dzickus, jcastillo, riel, prarit, pzijlstr, peterz, bmr, x86,
	oleg, rostedt, linux-kernel, minchan, mingo, tglx, aneesh.kumar,
	hannes, akpm, linuxppc-dev, jgh, akpm

On Fri, Sep 12, 2014 at 02:06:57PM +1000, Michael Ellerman wrote:
> On Thu, 2014-09-11 at 16:41 +0100, Aaron Tomlin wrote:
> > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > index a285900..2a8280a 100644
> > --- a/lib/Kconfig.debug
> > +++ b/lib/Kconfig.debug
> > @@ -824,6 +824,18 @@ config SCHEDSTATS
> >  	  application, you can say N to avoid the very slight overhead
> >  	  this adds.
> >  
> > +config SCHED_STACK_END_CHECK
> > +	bool "Detect stack corruption on calls to schedule()"
> > +	depends on DEBUG_KERNEL
> > +	default y
> 
> Did you really mean default y?
> 
> Doing so means it will be turned on more or less everywhere, which defeats the
> purpose of having a config option in the first place.

Only if Kconfig CONFIG_DEBUG_KERNEL is enabled in the first place.

-- 
Aaron Tomlin

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 3/3] sched: BUG when stack end location is over written
  2014-09-12  6:04               ` Michael Ellerman
@ 2014-09-12  9:50                 ` Aaron Tomlin
  -1 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-12  9:50 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: peterz, dzickus, jcastillo, riel, x86, akpm, minchan, bmr,
	prarit, oleg, rostedt, linux-kernel, hannes, mingo, aneesh.kumar,
	akpm, jgh, linuxppc-dev, tglx

On Fri, Sep 12, 2014 at 04:04:51PM +1000, Michael Ellerman wrote:
> On Thu, 2014-09-11 at 16:41 +0100, Aaron Tomlin wrote:
> > Currently in the event of a stack overrun a call to schedule()
> > does not check for this type of corruption. This corruption is
> > often silent and can go unnoticed. However once the corrupted
> > region is examined at a later stage, the outcome is undefined
> > and often results in a sporadic page fault which cannot be
> > handled.
> > 
> > This patch checks for a stack overrun and takes appropriate
> > action since the damage is already done, there is no point
> > in continuing.
> > 
> > Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
> > ---
> >  kernel/sched/core.c |  3 +++
> >  lib/Kconfig.debug   | 12 ++++++++++++
> >  2 files changed, 15 insertions(+)
> > 
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index ec1a286..0b70b73 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -2660,6 +2660,9 @@ static noinline void __schedule_bug(struct task_struct *prev)
> >   */
> >  static inline void schedule_debug(struct task_struct *prev)
> >  {
> > +#ifdef CONFIG_SCHED_STACK_END_CHECK
> > +	BUG_ON(unlikely(task_stack_end_corrupted(prev)))
> > +#endif
> 
> Spot the bug? Please compile your code in future.

Oops! Sorry about that.

-- 
Aaron Tomlin

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 3/3] sched: BUG when stack end location is over written
@ 2014-09-12  9:50                 ` Aaron Tomlin
  0 siblings, 0 replies; 74+ messages in thread
From: Aaron Tomlin @ 2014-09-12  9:50 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: dzickus, jcastillo, riel, prarit, peterz, bmr, x86, oleg,
	rostedt, linux-kernel, minchan, mingo, tglx, aneesh.kumar,
	hannes, akpm, linuxppc-dev, jgh, akpm

On Fri, Sep 12, 2014 at 04:04:51PM +1000, Michael Ellerman wrote:
> On Thu, 2014-09-11 at 16:41 +0100, Aaron Tomlin wrote:
> > Currently in the event of a stack overrun a call to schedule()
> > does not check for this type of corruption. This corruption is
> > often silent and can go unnoticed. However once the corrupted
> > region is examined at a later stage, the outcome is undefined
> > and often results in a sporadic page fault which cannot be
> > handled.
> > 
> > This patch checks for a stack overrun and takes appropriate
> > action since the damage is already done, there is no point
> > in continuing.
> > 
> > Signed-off-by: Aaron Tomlin <atomlin@redhat.com>
> > ---
> >  kernel/sched/core.c |  3 +++
> >  lib/Kconfig.debug   | 12 ++++++++++++
> >  2 files changed, 15 insertions(+)
> > 
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index ec1a286..0b70b73 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -2660,6 +2660,9 @@ static noinline void __schedule_bug(struct task_struct *prev)
> >   */
> >  static inline void schedule_debug(struct task_struct *prev)
> >  {
> > +#ifdef CONFIG_SCHED_STACK_END_CHECK
> > +	BUG_ON(unlikely(task_stack_end_corrupted(prev)))
> > +#endif
> 
> Spot the bug? Please compile your code in future.

Oops! Sorry about that.

-- 
Aaron Tomlin

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 3/3] sched: BUG when stack end location is over written
  2014-09-12  9:44                 ` Aaron Tomlin
@ 2014-09-12 10:58                   ` Mike Galbraith
  -1 siblings, 0 replies; 74+ messages in thread
From: Mike Galbraith @ 2014-09-12 10:58 UTC (permalink / raw)
  To: Aaron Tomlin
  Cc: Michael Ellerman, peterz, dzickus, jcastillo, riel, x86, akpm,
	minchan, bmr, prarit, oleg, rostedt, linux-kernel, hannes, mingo,
	aneesh.kumar, akpm, jgh, linuxppc-dev, tglx, pzijlstr

On Fri, 2014-09-12 at 10:44 +0100, Aaron Tomlin wrote: 
> On Fri, Sep 12, 2014 at 02:06:57PM +1000, Michael Ellerman wrote:
> > On Thu, 2014-09-11 at 16:41 +0100, Aaron Tomlin wrote:
> > > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > > index a285900..2a8280a 100644
> > > --- a/lib/Kconfig.debug
> > > +++ b/lib/Kconfig.debug
> > > @@ -824,6 +824,18 @@ config SCHEDSTATS
> > >  	  application, you can say N to avoid the very slight overhead
> > >  	  this adds.
> > >  
> > > +config SCHED_STACK_END_CHECK
> > > +	bool "Detect stack corruption on calls to schedule()"
> > > +	depends on DEBUG_KERNEL
> > > +	default y
> > 
> > Did you really mean default y?
> > 
> > Doing so means it will be turned on more or less everywhere, which defeats the
> > purpose of having a config option in the first place.
> 
> Only if Kconfig CONFIG_DEBUG_KERNEL is enabled in the first place.

Which is likely enabled just about everywhere on the planet.

-Mike



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 3/3] sched: BUG when stack end location is over written
@ 2014-09-12 10:58                   ` Mike Galbraith
  0 siblings, 0 replies; 74+ messages in thread
From: Mike Galbraith @ 2014-09-12 10:58 UTC (permalink / raw)
  To: Aaron Tomlin
  Cc: dzickus, jcastillo, riel, prarit, pzijlstr, peterz, bmr, x86,
	oleg, rostedt, linux-kernel, minchan, mingo, tglx, aneesh.kumar,
	hannes, akpm, linuxppc-dev, jgh, akpm

On Fri, 2014-09-12 at 10:44 +0100, Aaron Tomlin wrote: 
> On Fri, Sep 12, 2014 at 02:06:57PM +1000, Michael Ellerman wrote:
> > On Thu, 2014-09-11 at 16:41 +0100, Aaron Tomlin wrote:
> > > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > > index a285900..2a8280a 100644
> > > --- a/lib/Kconfig.debug
> > > +++ b/lib/Kconfig.debug
> > > @@ -824,6 +824,18 @@ config SCHEDSTATS
> > >  	  application, you can say N to avoid the very slight overhead
> > >  	  this adds.
> > >  
> > > +config SCHED_STACK_END_CHECK
> > > +	bool "Detect stack corruption on calls to schedule()"
> > > +	depends on DEBUG_KERNEL
> > > +	default y
> > 
> > Did you really mean default y?
> > 
> > Doing so means it will be turned on more or less everywhere, which defeats the
> > purpose of having a config option in the first place.
> 
> Only if Kconfig CONFIG_DEBUG_KERNEL is enabled in the first place.

Which is likely enabled just about everywhere on the planet.

-Mike

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 3/3] sched: BUG when stack end location is over written
  2014-09-12 10:58                   ` Mike Galbraith
@ 2014-09-15  2:39                     ` Michael Ellerman
  -1 siblings, 0 replies; 74+ messages in thread
From: Michael Ellerman @ 2014-09-15  2:39 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Aaron Tomlin, peterz, dzickus, jcastillo, riel, x86, akpm,
	minchan, bmr, prarit, oleg, rostedt, linux-kernel, hannes, mingo,
	aneesh.kumar, akpm, jgh, linuxppc-dev, tglx, pzijlstr

On Fri, 2014-09-12 at 12:58 +0200, Mike Galbraith wrote:
> On Fri, 2014-09-12 at 10:44 +0100, Aaron Tomlin wrote: 
> > On Fri, Sep 12, 2014 at 02:06:57PM +1000, Michael Ellerman wrote:
> > > On Thu, 2014-09-11 at 16:41 +0100, Aaron Tomlin wrote:
> > > > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > > > index a285900..2a8280a 100644
> > > > --- a/lib/Kconfig.debug
> > > > +++ b/lib/Kconfig.debug
> > > > @@ -824,6 +824,18 @@ config SCHEDSTATS
> > > >  	  application, you can say N to avoid the very slight overhead
> > > >  	  this adds.
> > > >  
> > > > +config SCHED_STACK_END_CHECK
> > > > +	bool "Detect stack corruption on calls to schedule()"
> > > > +	depends on DEBUG_KERNEL
> > > > +	default y
> > > 
> > > Did you really mean default y?
> > > 
> > > Doing so means it will be turned on more or less everywhere, which defeats the
> > > purpose of having a config option in the first place.
> > 
> > Only if Kconfig CONFIG_DEBUG_KERNEL is enabled in the first place.
> 
> Which is likely enabled just about everywhere on the planet.

Yes, including all distros I'm aware of.

cheers



^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [PATCH v3 3/3] sched: BUG when stack end location is over written
@ 2014-09-15  2:39                     ` Michael Ellerman
  0 siblings, 0 replies; 74+ messages in thread
From: Michael Ellerman @ 2014-09-15  2:39 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: dzickus, jcastillo, riel, prarit, pzijlstr, peterz, bmr, x86,
	oleg, rostedt, linux-kernel, hannes, minchan, mingo, tglx,
	aneesh.kumar, Aaron Tomlin, akpm, linuxppc-dev, jgh, akpm

On Fri, 2014-09-12 at 12:58 +0200, Mike Galbraith wrote:
> On Fri, 2014-09-12 at 10:44 +0100, Aaron Tomlin wrote: 
> > On Fri, Sep 12, 2014 at 02:06:57PM +1000, Michael Ellerman wrote:
> > > On Thu, 2014-09-11 at 16:41 +0100, Aaron Tomlin wrote:
> > > > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > > > index a285900..2a8280a 100644
> > > > --- a/lib/Kconfig.debug
> > > > +++ b/lib/Kconfig.debug
> > > > @@ -824,6 +824,18 @@ config SCHEDSTATS
> > > >  	  application, you can say N to avoid the very slight overhead
> > > >  	  this adds.
> > > >  
> > > > +config SCHED_STACK_END_CHECK
> > > > +	bool "Detect stack corruption on calls to schedule()"
> > > > +	depends on DEBUG_KERNEL
> > > > +	default y
> > > 
> > > Did you really mean default y?
> > > 
> > > Doing so means it will be turned on more or less everywhere, which defeats the
> > > purpose of having a config option in the first place.
> > 
> > Only if Kconfig CONFIG_DEBUG_KERNEL is enabled in the first place.
> 
> Which is likely enabled just about everywhere on the planet.

Yes, including all distros I'm aware of.

cheers

^ permalink raw reply	[flat|nested] 74+ messages in thread

end of thread, other threads:[~2014-09-15  2:39 UTC | newest]

Thread overview: 74+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-04 14:50 [PATCH 0/2] sched: Always check the integrity of the canary Aaron Tomlin
2014-09-04 14:50 ` Aaron Tomlin
2014-09-04 14:50 ` [PATCH 1/2] sched: Add helper for task stack page overrun checking Aaron Tomlin
2014-09-04 14:50   ` Aaron Tomlin
2014-09-04 15:02   ` Oleg Nesterov
2014-09-04 15:02     ` Oleg Nesterov
2014-09-04 15:52     ` Aaron Tomlin
2014-09-04 15:52       ` Aaron Tomlin
2014-09-04 15:30   ` Peter Zijlstra
2014-09-04 15:30     ` Peter Zijlstra
2014-09-04 14:50 ` [PATCH 2/2] sched: BUG when stack end location is over written Aaron Tomlin
2014-09-04 14:50   ` Aaron Tomlin
2014-09-04 15:32   ` Peter Zijlstra
2014-09-04 15:32     ` Peter Zijlstra
2014-09-04 16:11     ` Aaron Tomlin
2014-09-04 16:11       ` Aaron Tomlin
2014-09-08 19:23       ` [PATCH v2 0/3] sched: Always check the integrity of the canary Aaron Tomlin
2014-09-08 19:23         ` Aaron Tomlin
2014-09-08 19:23         ` [PATCH 1/3] init/main.c: Give init_task a canary Aaron Tomlin
2014-09-08 19:23           ` Aaron Tomlin
2014-09-08 19:23         ` [PATCH 2/3] sched: Add helper for task stack page overrun checking Aaron Tomlin
2014-09-08 19:23           ` Aaron Tomlin
2014-09-08 19:23         ` [PATCH 3/3] sched: BUG when stack end location is over written Aaron Tomlin
2014-09-08 19:23           ` Aaron Tomlin
2014-09-09  9:42       ` [PATCH v2 0/3] sched: Always check the integrity of the canary Aaron Tomlin
2014-09-09  9:42         ` Aaron Tomlin
2014-09-09  9:42         ` [PATCH v2 1/3] init/main.c: Give init_task a canary Aaron Tomlin
2014-09-09  9:42           ` Aaron Tomlin
2014-09-10  7:26           ` Chuck Ebbert
2014-09-10  7:26             ` Chuck Ebbert
2014-09-10 13:29             ` Aaron Tomlin
2014-09-10 13:29               ` Aaron Tomlin
2014-09-11 12:23               ` Chuck Ebbert
2014-09-11 12:23                 ` Chuck Ebbert
2014-09-11 14:47                 ` Aaron Tomlin
2014-09-11 14:47                   ` Aaron Tomlin
2014-09-09  9:42         ` [PATCH v2 2/3] sched: Add helper for task stack page overrun checking Aaron Tomlin
2014-09-09  9:42           ` Aaron Tomlin
2014-09-09  9:42         ` [PATCH v2 3/3] sched: BUG when stack end location is over written Aaron Tomlin
2014-09-09  9:42           ` Aaron Tomlin
2014-09-11 15:41         ` [PATCH v3 0/3] sched: Always check the integrity of the canary Aaron Tomlin
2014-09-11 15:41           ` Aaron Tomlin
2014-09-11 15:41           ` [PATCH v3 1/3] init/main.c: Give init_task a canary Aaron Tomlin
2014-09-11 15:41             ` Aaron Tomlin
2014-09-12  7:28             ` Michael Ellerman
2014-09-12  7:28               ` Michael Ellerman
2014-09-11 15:41           ` [PATCH v3 2/3] sched: Add helper for task stack page overrun checking Aaron Tomlin
2014-09-11 15:41             ` Aaron Tomlin
2014-09-11 15:41           ` [PATCH v3 3/3] sched: BUG when stack end location is over written Aaron Tomlin
2014-09-11 15:41             ` Aaron Tomlin
2014-09-12  4:06             ` Michael Ellerman
2014-09-12  4:06               ` Michael Ellerman
2014-09-12  9:44               ` Aaron Tomlin
2014-09-12  9:44                 ` Aaron Tomlin
2014-09-12 10:58                 ` Mike Galbraith
2014-09-12 10:58                   ` Mike Galbraith
2014-09-15  2:39                   ` Michael Ellerman
2014-09-15  2:39                     ` Michael Ellerman
2014-09-12  6:04             ` Michael Ellerman
2014-09-12  6:04               ` Michael Ellerman
2014-09-12  9:50               ` Aaron Tomlin
2014-09-12  9:50                 ` Aaron Tomlin
2014-09-11 15:53           ` [PATCH v3 0/3] sched: Always check the integrity of the canary Peter Zijlstra
2014-09-11 15:53             ` Peter Zijlstra
2014-09-11 15:59             ` Aaron Tomlin
2014-09-11 15:59               ` Aaron Tomlin
2014-09-11 16:02           ` David Laight
2014-09-11 16:02             ` David Laight
2014-09-11 17:26             ` Chuck Ebbert
2014-09-11 17:26               ` Chuck Ebbert
2014-09-12  8:43               ` David Laight
2014-09-12  8:43                 ` David Laight
2014-09-11 17:44             ` Aaron Tomlin
2014-09-11 17:44               ` Aaron Tomlin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.