linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] x86: panic when a kernel stack overflow is detected
@ 2019-07-29  1:59 Daniel Axtens
  2019-07-29  3:53 ` Andy Lutomirski
  0 siblings, 1 reply; 3+ messages in thread
From: Daniel Axtens @ 2019-07-29  1:59 UTC (permalink / raw)
  To: kasan-dev, x86, aryabinin, glider, luto, linux-kernel
  Cc: Daniel Axtens, Marco Elver

Currently, when a kernel stack overflow is detected via VMAP_STACK,
the task is killed with die().

This isn't safe, because we don't know how that process has affected
kernel state. In particular, we don't know what locks have been taken.
For example, we can hit a case with lkdtm where a thread takes a
stack overflow in printk() after taking the logbuf_lock. In that case,
we deadlock when the kernel next does a printk.

Do not attempt to kill the process when a kernel stack overflow is
detected. The system state is unknown, the only safe thing to do is
panic(). (panic() also prints without taking locks so a useful debug
splat is printed even when logbuf_lock is held.)

Reported-by: Marco Elver <elver@google.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
---
 arch/x86/kernel/traps.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 4bb0f8447112..bfb0ec667c09 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -301,13 +301,14 @@ __visible void __noreturn handle_stack_overflow(const char *message,
 						struct pt_regs *regs,
 						unsigned long fault_address)
 {
-	printk(KERN_EMERG "BUG: stack guard page was hit at %p (stack is %p..%p)\n",
-		 (void *)fault_address, current->stack,
-		 (char *)current->stack + THREAD_SIZE - 1);
-	die(message, regs, 0);
+	/*
+	 * It's not safe to kill the task, as it's in kernel space and
+	 * might be holding important locks. Just panic.
+	 */
 
-	/* Be absolutely certain we don't return. */
-	panic("%s", message);
+	panic("%s - stack guard page was hit at %p (stack is %p..%p)",
+	      message, (void *)fault_address, current->stack,
+	      (char *)current->stack + THREAD_SIZE - 1);
 }
 #endif
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] x86: panic when a kernel stack overflow is detected
  2019-07-29  1:59 [PATCH] x86: panic when a kernel stack overflow is detected Daniel Axtens
@ 2019-07-29  3:53 ` Andy Lutomirski
  2019-07-29 10:34   ` Peter Zijlstra
  0 siblings, 1 reply; 3+ messages in thread
From: Andy Lutomirski @ 2019-07-29  3:53 UTC (permalink / raw)
  To: Daniel Axtens, Peter Zijlstra
  Cc: kasan-dev, X86 ML, Andrey Ryabinin, Alexander Potapenko,
	Andrew Lutomirski, LKML, Marco Elver

On Sun, Jul 28, 2019 at 6:59 PM Daniel Axtens <dja@axtens.net> wrote:
>
> Currently, when a kernel stack overflow is detected via VMAP_STACK,
> the task is killed with die().
>
> This isn't safe, because we don't know how that process has affected
> kernel state. In particular, we don't know what locks have been taken.
> For example, we can hit a case with lkdtm where a thread takes a
> stack overflow in printk() after taking the logbuf_lock. In that case,
> we deadlock when the kernel next does a printk.
>
> Do not attempt to kill the process when a kernel stack overflow is
> detected. The system state is unknown, the only safe thing to do is
> panic(). (panic() also prints without taking locks so a useful debug
> splat is printed even when logbuf_lock is held.)

The thing I don't like about this is that it reduces the chance that
we successfully log anything to disk.

PeterZ, do you have any useful input here?  I wonder if we could do
something like printk_oh_crap() that is just printk() except that it
panics if it fails to return after a few seconds.

--Andy

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] x86: panic when a kernel stack overflow is detected
  2019-07-29  3:53 ` Andy Lutomirski
@ 2019-07-29 10:34   ` Peter Zijlstra
  0 siblings, 0 replies; 3+ messages in thread
From: Peter Zijlstra @ 2019-07-29 10:34 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Daniel Axtens, kasan-dev, X86 ML, Andrey Ryabinin,
	Alexander Potapenko, LKML, Marco Elver

On Sun, Jul 28, 2019 at 08:53:58PM -0700, Andy Lutomirski wrote:
> On Sun, Jul 28, 2019 at 6:59 PM Daniel Axtens <dja@axtens.net> wrote:
> >
> > Currently, when a kernel stack overflow is detected via VMAP_STACK,
> > the task is killed with die().
> >
> > This isn't safe, because we don't know how that process has affected
> > kernel state. In particular, we don't know what locks have been taken.
> > For example, we can hit a case with lkdtm where a thread takes a
> > stack overflow in printk() after taking the logbuf_lock. In that case,
> > we deadlock when the kernel next does a printk.
> >
> > Do not attempt to kill the process when a kernel stack overflow is
> > detected. The system state is unknown, the only safe thing to do is
> > panic(). (panic() also prints without taking locks so a useful debug
> > splat is printed even when logbuf_lock is held.)
> 
> The thing I don't like about this is that it reduces the chance that
> we successfully log anything to disk.
> 
> PeterZ, do you have any useful input here?  I wonder if we could do
> something like printk_oh_crap() that is just printk() except that it
> panics if it fails to return after a few seconds.

People are already had at work rewriting printk. The current thing is
unfixable.  Then again, I don't know if there's any sane options aside
of early serial.

Still, mucking with printk won't help you at all if the task is holding
some other/filesystem lock required to do that writeback.



^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2019-07-29 10:35 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-29  1:59 [PATCH] x86: panic when a kernel stack overflow is detected Daniel Axtens
2019-07-29  3:53 ` Andy Lutomirski
2019-07-29 10:34   ` Peter Zijlstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).