x86: panic when a kernel stack overflow is detected
diff mbox series

Message ID 20190729015933.18049-1-dja@axtens.net
State New, archived
Headers show
Series
  • x86: panic when a kernel stack overflow is detected
Related show

Commit Message

Daniel Axtens July 29, 2019, 1:59 a.m. UTC
Currently, when a kernel stack overflow is detected via VMAP_STACK,
the task is killed with die().

This isn't safe, because we don't know how that process has affected
kernel state. In particular, we don't know what locks have been taken.
For example, we can hit a case with lkdtm where a thread takes a
stack overflow in printk() after taking the logbuf_lock. In that case,
we deadlock when the kernel next does a printk.

Do not attempt to kill the process when a kernel stack overflow is
detected. The system state is unknown, the only safe thing to do is
panic(). (panic() also prints without taking locks so a useful debug
splat is printed even when logbuf_lock is held.)

Reported-by: Marco Elver <elver@google.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
---
 arch/x86/kernel/traps.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

Comments

Andy Lutomirski July 29, 2019, 3:53 a.m. UTC | #1
On Sun, Jul 28, 2019 at 6:59 PM Daniel Axtens <dja@axtens.net> wrote:
>
> Currently, when a kernel stack overflow is detected via VMAP_STACK,
> the task is killed with die().
>
> This isn't safe, because we don't know how that process has affected
> kernel state. In particular, we don't know what locks have been taken.
> For example, we can hit a case with lkdtm where a thread takes a
> stack overflow in printk() after taking the logbuf_lock. In that case,
> we deadlock when the kernel next does a printk.
>
> Do not attempt to kill the process when a kernel stack overflow is
> detected. The system state is unknown, the only safe thing to do is
> panic(). (panic() also prints without taking locks so a useful debug
> splat is printed even when logbuf_lock is held.)

The thing I don't like about this is that it reduces the chance that
we successfully log anything to disk.

PeterZ, do you have any useful input here?  I wonder if we could do
something like printk_oh_crap() that is just printk() except that it
panics if it fails to return after a few seconds.

--Andy
Peter Zijlstra July 29, 2019, 10:34 a.m. UTC | #2
On Sun, Jul 28, 2019 at 08:53:58PM -0700, Andy Lutomirski wrote:
> On Sun, Jul 28, 2019 at 6:59 PM Daniel Axtens <dja@axtens.net> wrote:
> >
> > Currently, when a kernel stack overflow is detected via VMAP_STACK,
> > the task is killed with die().
> >
> > This isn't safe, because we don't know how that process has affected
> > kernel state. In particular, we don't know what locks have been taken.
> > For example, we can hit a case with lkdtm where a thread takes a
> > stack overflow in printk() after taking the logbuf_lock. In that case,
> > we deadlock when the kernel next does a printk.
> >
> > Do not attempt to kill the process when a kernel stack overflow is
> > detected. The system state is unknown, the only safe thing to do is
> > panic(). (panic() also prints without taking locks so a useful debug
> > splat is printed even when logbuf_lock is held.)
> 
> The thing I don't like about this is that it reduces the chance that
> we successfully log anything to disk.
> 
> PeterZ, do you have any useful input here?  I wonder if we could do
> something like printk_oh_crap() that is just printk() except that it
> panics if it fails to return after a few seconds.

People are already had at work rewriting printk. The current thing is
unfixable.  Then again, I don't know if there's any sane options aside
of early serial.

Still, mucking with printk won't help you at all if the task is holding
some other/filesystem lock required to do that writeback.

Patch
diff mbox series

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 4bb0f8447112..bfb0ec667c09 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -301,13 +301,14 @@  __visible void __noreturn handle_stack_overflow(const char *message,
 						struct pt_regs *regs,
 						unsigned long fault_address)
 {
-	printk(KERN_EMERG "BUG: stack guard page was hit at %p (stack is %p..%p)\n",
-		 (void *)fault_address, current->stack,
-		 (char *)current->stack + THREAD_SIZE - 1);
-	die(message, regs, 0);
+	/*
+	 * It's not safe to kill the task, as it's in kernel space and
+	 * might be holding important locks. Just panic.
+	 */
 
-	/* Be absolutely certain we don't return. */
-	panic("%s", message);
+	panic("%s - stack guard page was hit at %p (stack is %p..%p)",
+	      message, (void *)fault_address, current->stack,
+	      (char *)current->stack + THREAD_SIZE - 1);
 }
 #endif