[PATCH v2 0/2] Introduce the pkill_on_warn parameter

* [PATCH v2 0/2] Introduce the pkill_on_warn parameter
@ 2021-10-27 23:32 Alexander Popov
  2021-10-27 23:32 ` [PATCH v2 1/2] bug: do refactoring allowing to add a warning handling action Alexander Popov
                   ` (3 more replies)
  0 siblings, 4 replies; 32+ messages in thread
From: Alexander Popov @ 2021-10-27 23:32 UTC (permalink / raw)
  To: Jonathan Corbet, Linus Torvalds, Paul McKenney, Andrew Morton,
	Thomas Gleixner, Peter Zijlstra, Joerg Roedel, Maciej Rozycki,
	Muchun Song, Viresh Kumar, Robin Murphy, Randy Dunlap, Lu Baolu,
	Petr Mladek, Kees Cook, Luis Chamberlain, Wei Liu, John Ogness,
	Andy Shevchenko, Alexey Kardashevskiy, Christophe Leroy,
	Jann Horn, Greg Kroah-Hartman, Mark Rutland, Andy Lutomirski,
	Dave Hansen, Steven Rostedt, Will Deacon, Ard Biesheuvel,
	Laura Abbott, David S Miller, Borislav Petkov, Arnd Bergmann,
	Andrew Scull, Marc Zyngier, Jessica Yu, Iurii Zaikin,
	Rasmus Villemoes, Wang Qing, Mel Gorman, Mauro Carvalho Chehab,
	Andrew Klychkov, Mathieu Chouquet-Stringer, Daniel Borkmann,
	Stephen Kitt, Stephen Boyd, Thomas Bogendoerfer, Mike Rapoport,
	Bjorn Andersson, Alexander Popov, kernel-hardening,
	linux-hardening, linux-doc, linux-arch, linux-kernel,
	linux-fsdevel
  Cc: notify

Hello! This is the v2 of pkill_on_warn.
Changes from v1 and tricks for testing are described below.

Rationale
=========

Currently, the Linux kernel provides two types of reaction to kernel
warnings:
 1. Do nothing (by default),
 2. Call panic() if panic_on_warn is set. That's a very strong reaction,
    so panic_on_warn is usually disabled on production systems.

From a safety point of view, the Linux kernel misses a middle way of
handling kernel warnings:
 - The kernel should stop the activity that provokes a warning,
 - But the kernel should avoid complete denial of service.

From a security point of view, kernel warning messages provide a lot of
useful information for attackers. Many GNU/Linux distributions allow
unprivileged users to read the kernel log, so attackers use kernel
warning infoleak in vulnerability exploits. See the examples:
https://a13xp0p0v.github.io/2021/02/09/CVE-2021-26708.html
https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
https://googleprojectzero.blogspot.com/2018/09/a-cache-invalidation-bug-in-linux.html

Let's introduce the pkill_on_warn sysctl.
If this parameter is set, the kernel kills all threads in a process that
provoked a kernel warning. This behavior is reasonable from a safety point of
view described above. It is also useful for kernel security hardening because
the system kills an exploit process that hits a kernel warning.

Moreover, bugs usually don't come alone, and a kernel warning may be
followed by memory corruption or other bad effects. So pkill_on_warn allows
the kernel to stop the process when the first signs of wrong behavior
are detected.


Changes from v1
===============

1) Introduce do_pkill_on_warn() and call it in all warning handling paths.

2) Do refactoring without functional changes in a separate patch.

3) Avoid killing init and kthreads.

4) Use do_send_sig_info() instead of do_group_exit().

5) Introduce sysctl instead of using core_param().


Tricks for testing
==================

1) This patch series was tested on x86_64 using CONFIG_LKDTM.
The kernel kills a process that performs this:
  echo WARNING > /sys/kernel/debug/provoke-crash/DIRECT

2) The warn_slowpath_fmt() path was tested using this trick:

diff --git a/arch/x86/include/asm/bug.h b/arch/x86/include/asm/bug.h
index 84b87538a15d..3106c203ebb6 100644
--- a/arch/x86/include/asm/bug.h
+++ b/arch/x86/include/asm/bug.h
@@ -73,7 +73,7 @@ do {                                                          \
  * were to trigger, we'd rather wreck the machine in an attempt to get the
  * message out than not know about it.
  */
-#define __WARN_FLAGS(flags)                                    \
+#define ___WARN_FLAGS(flags)                                   \
 do {                                                           \
        instrumentation_begin();                                \
        _BUG_FLAGS(ASM_UD2, BUGFLAG_WARNING|(flags));           \

3) Testing pkill_on_warn with kthreads was done using this trick:
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index bce848e50512..13c56f472681 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2133,6 +2133,8 @@ static int __noreturn rcu_gp_kthread(void *unused)
                WRITE_ONCE(rcu_state.gp_state, RCU_GP_CLEANUP);
                rcu_gp_cleanup();
                WRITE_ONCE(rcu_state.gp_state, RCU_GP_CLEANED);
+
+               WARN_ONCE(1, "hello from kthread\n");
        }
 }

4) Changing drivers/misc/lkdtm/bugs.c:lkdtm_WARNING() allowed me
to test all warning flavours:
 - WARN_ON()
 - WARN()
 - WARN_TAINT()
 - WARN_ON_ONCE()
 - WARN_ONCE()
 - WARN_TAINT_ONCE()

Thanks!

Alexander Popov (2):
  bug: do refactoring allowing to add a warning handling action
  sysctl: introduce kernel.pkill_on_warn

 Documentation/admin-guide/sysctl/kernel.rst | 14 ++++++++
 include/asm-generic/bug.h                   | 37 +++++++++++++++------
 include/linux/panic.h                       |  3 ++
 kernel/panic.c                              | 22 +++++++++++-
 kernel/sysctl.c                             |  9 +++++
 lib/bug.c                                   | 22 ++++++++----
 6 files changed, 90 insertions(+), 17 deletions(-)

-- 
2.31.1


^ permalink raw reply related	[flat|nested] 32+ messages in thread