linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] bug: Provide toggle for BUG on data corruption
@ 2016-08-16 21:11 Kees Cook
  2016-08-16 21:11 ` [PATCH 1/5] list: Split list_add() debug checking into separate function Kees Cook
                   ` (4 more replies)
  0 siblings, 5 replies; 18+ messages in thread
From: Kees Cook @ 2016-08-16 21:11 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: Kees Cook, Stephen Boyd, Daniel Micay, Arnd Bergmann,
	Greg Kroah-Hartman, Josh Triplett, Steven Rostedt,
	Mathieu Desnoyers, Lai Jiangshan, Peter Zijlstra, Ingo Molnar,
	Tejun Heo, Michael Ellerman, Aneesh Kumar K.V,
	Kirill A. Shutemov, Andrew Morton, Dan Williams, Jan Kara,
	Josef Bacik, Thomas Gleixner, Andrey Ryabinin,
	Nikolay Aleksandrov, Dmitry Vyukov, linux-kernel,
	kernel-hardening

This adds a CONFIG to trigger BUG()s when the kernel encounters
unexpected data structure integrity as currently detected with
CONFIG_DEBUG_LIST, CONFIG_DEBUG_SPINLOCK, and with workqueues.

Specifically list operations have been a target for widening flaws to gain
"write anywhere" primitives for attackers, so this also consolidates the
debug checking to avoid code and check duplication (e.g. RCU list debug
was missing a check that got added to regular list debug). It also stops
manipulations when corruption is detected, since worsening the corruption
makes no sense. (Really, everyone should build with CONFIG_DEBUG_LIST
since the checks are so inexpensive.)

This is mostly a refactoring of similar code from PaX and Grsecurity,
along with MSM kernel changes by Stephen Boyd.

Along with the patches is a new lkdtm test to validate that setting
CONFIG_DEBUG_LIST actually does what is desired.

Thanks,

-Kees

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/5] list: Split list_add() debug checking into separate function
  2016-08-16 21:11 [PATCH 0/5] bug: Provide toggle for BUG on data corruption Kees Cook
@ 2016-08-16 21:11 ` Kees Cook
  2016-08-16 21:11 ` [PATCH 2/5] rculist: Consolidate DEBUG_LIST for list_add_rcu() Kees Cook
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 18+ messages in thread
From: Kees Cook @ 2016-08-16 21:11 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: Kees Cook, Stephen Boyd, Daniel Micay, Arnd Bergmann,
	Greg Kroah-Hartman, Josh Triplett, Steven Rostedt,
	Mathieu Desnoyers, Lai Jiangshan, Peter Zijlstra, Ingo Molnar,
	Tejun Heo, Michael Ellerman, Aneesh Kumar K.V,
	Kirill A. Shutemov, Andrew Morton, Dan Williams, Jan Kara,
	Josef Bacik, Thomas Gleixner, Andrey Ryabinin,
	Nikolay Aleksandrov, Dmitry Vyukov, linux-kernel,
	kernel-hardening

Right now, __list_add() code is repeated either in list.h or in
list_debug.c, but only the debug checks are the different part. This
extracts the checking into a separate function and consolidates
__list_add(). Additionally this __list_add_debug() will stop list
manipulations if a corruption is detected, instead of allowing for further
corruption that may lead to even worse conditions.

This is slight refactoring of the same hardening done in PaX and Grsecurity.

Signed-off-by: Kees Cook <keescook@chromium.org>
---
 include/linux/list.h | 22 ++++++++++++++++------
 lib/list_debug.c     | 35 ++++++++++++++++++-----------------
 2 files changed, 34 insertions(+), 23 deletions(-)

diff --git a/include/linux/list.h b/include/linux/list.h
index 5183138aa932..c38ff652ab59 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -28,27 +28,37 @@ static inline void INIT_LIST_HEAD(struct list_head *list)
 	list->prev = list;
 }
 
+#ifdef CONFIG_DEBUG_LIST
+extern bool __list_add_debug(struct list_head *new,
+			      struct list_head *prev,
+			      struct list_head *next);
+#else
+static inline bool __list_add_debug(struct list_head *new,
+				struct list_head *prev,
+				struct list_head *next)
+{
+	return true;
+}
+#endif
+
 /*
  * Insert a new entry between two known consecutive entries.
  *
  * This is only for internal list manipulation where we know
  * the prev/next entries already!
  */
-#ifndef CONFIG_DEBUG_LIST
 static inline void __list_add(struct list_head *new,
 			      struct list_head *prev,
 			      struct list_head *next)
 {
+	if (!__list_add_debug(new, prev, next))
+		return;
+
 	next->prev = new;
 	new->next = next;
 	new->prev = prev;
 	WRITE_ONCE(prev->next, new);
 }
-#else
-extern void __list_add(struct list_head *new,
-			      struct list_head *prev,
-			      struct list_head *next);
-#endif
 
 /**
  * list_add - add a new entry
diff --git a/lib/list_debug.c b/lib/list_debug.c
index 3859bf63561c..5d78982eeb99 100644
--- a/lib/list_debug.c
+++ b/lib/list_debug.c
@@ -19,27 +19,28 @@
  * the prev/next entries already!
  */
 
-void __list_add(struct list_head *new,
+bool __list_add_debug(struct list_head *new,
 			      struct list_head *prev,
 			      struct list_head *next)
 {
-	WARN(next->prev != prev,
-		"list_add corruption. next->prev should be "
-		"prev (%p), but was %p. (next=%p).\n",
-		prev, next->prev, next);
-	WARN(prev->next != next,
-		"list_add corruption. prev->next should be "
-		"next (%p), but was %p. (prev=%p).\n",
-		next, prev->next, prev);
-	WARN(new == prev || new == next,
-	     "list_add double add: new=%p, prev=%p, next=%p.\n",
-	     new, prev, next);
-	next->prev = new;
-	new->next = next;
-	new->prev = prev;
-	WRITE_ONCE(prev->next, new);
+	if (unlikely(next->prev != prev)) {
+		WARN(1, "list_add corruption. next->prev should be prev (%p), but was %p. (next=%p).\n",
+			prev, next->prev, next);
+		return false;
+	}
+	if (unlikely(prev->next != next)) {
+		WARN(1, "list_add corruption. prev->next should be next (%p), but was %p. (prev=%p).\n",
+			next, prev->next, prev);
+		return false;
+	}
+	if (unlikely(new == prev || new == next)) {
+		WARN(1, "list_add double add: new=%p, prev=%p, next=%p.\n",
+			new, prev, next);
+		return false;
+	}
+	return true;
 }
-EXPORT_SYMBOL(__list_add);
+EXPORT_SYMBOL(__list_add_debug);
 
 void __list_del_entry(struct list_head *entry)
 {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/5] rculist: Consolidate DEBUG_LIST for list_add_rcu()
  2016-08-16 21:11 [PATCH 0/5] bug: Provide toggle for BUG on data corruption Kees Cook
  2016-08-16 21:11 ` [PATCH 1/5] list: Split list_add() debug checking into separate function Kees Cook
@ 2016-08-16 21:11 ` Kees Cook
  2016-08-16 21:11 ` [PATCH 3/5] list: Split list_del() debug checking into separate function Kees Cook
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 18+ messages in thread
From: Kees Cook @ 2016-08-16 21:11 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: Kees Cook, Stephen Boyd, Daniel Micay, Arnd Bergmann,
	Greg Kroah-Hartman, Josh Triplett, Steven Rostedt,
	Mathieu Desnoyers, Lai Jiangshan, Peter Zijlstra, Ingo Molnar,
	Tejun Heo, Michael Ellerman, Aneesh Kumar K.V,
	Kirill A. Shutemov, Andrew Morton, Dan Williams, Jan Kara,
	Josef Bacik, Thomas Gleixner, Andrey Ryabinin,
	Nikolay Aleksandrov, Dmitry Vyukov, linux-kernel,
	kernel-hardening

Consolidates the debug checking for list_add_rcu() into the new single
debug function. Notably, this fixes the sanity check that was added in
 commit 17a801f4bfeb ("list_debug: WARN for adding something already in
the list"). Before, it wasn't being checked for RCU lists.

Signed-off-by: Kees Cook <keescook@chromium.org>
---
 include/linux/rculist.h |  8 +++-----
 lib/list_debug.c        | 19 -------------------
 2 files changed, 3 insertions(+), 24 deletions(-)

diff --git a/include/linux/rculist.h b/include/linux/rculist.h
index 8beb98dcf14f..b21923e6e9b3 100644
--- a/include/linux/rculist.h
+++ b/include/linux/rculist.h
@@ -45,19 +45,17 @@ static inline void INIT_LIST_HEAD_RCU(struct list_head *list)
  * This is only for internal list manipulation where we know
  * the prev/next entries already!
  */
-#ifndef CONFIG_DEBUG_LIST
 static inline void __list_add_rcu(struct list_head *new,
 		struct list_head *prev, struct list_head *next)
 {
+	if (!__list_add_debug(new, prev, next))
+		return;
+
 	new->next = next;
 	new->prev = prev;
 	rcu_assign_pointer(list_next_rcu(prev), new);
 	next->prev = new;
 }
-#else
-void __list_add_rcu(struct list_head *new,
-		    struct list_head *prev, struct list_head *next);
-#endif
 
 /**
  * list_add_rcu - add a new entry to rcu-protected list
diff --git a/lib/list_debug.c b/lib/list_debug.c
index 5d78982eeb99..e90a931bd5ab 100644
--- a/lib/list_debug.c
+++ b/lib/list_debug.c
@@ -80,22 +80,3 @@ void list_del(struct list_head *entry)
 	entry->prev = LIST_POISON2;
 }
 EXPORT_SYMBOL(list_del);
-
-/*
- * RCU variants.
- */
-void __list_add_rcu(struct list_head *new,
-		    struct list_head *prev, struct list_head *next)
-{
-	WARN(next->prev != prev,
-		"list_add_rcu corruption. next->prev should be prev (%p), but was %p. (next=%p).\n",
-		prev, next->prev, next);
-	WARN(prev->next != next,
-		"list_add_rcu corruption. prev->next should be next (%p), but was %p. (prev=%p).\n",
-		next, prev->next, prev);
-	new->next = next;
-	new->prev = prev;
-	rcu_assign_pointer(list_next_rcu(prev), new);
-	next->prev = new;
-}
-EXPORT_SYMBOL(__list_add_rcu);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/5] list: Split list_del() debug checking into separate function
  2016-08-16 21:11 [PATCH 0/5] bug: Provide toggle for BUG on data corruption Kees Cook
  2016-08-16 21:11 ` [PATCH 1/5] list: Split list_add() debug checking into separate function Kees Cook
  2016-08-16 21:11 ` [PATCH 2/5] rculist: Consolidate DEBUG_LIST for list_add_rcu() Kees Cook
@ 2016-08-16 21:11 ` Kees Cook
  2016-08-16 21:11 ` [PATCH 4/5] bug: Provide toggle for BUG on data corruption Kees Cook
  2016-08-16 21:11 ` [PATCH 5/5] lkdtm: Add tests for struct list corruption Kees Cook
  4 siblings, 0 replies; 18+ messages in thread
From: Kees Cook @ 2016-08-16 21:11 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: Kees Cook, Stephen Boyd, Daniel Micay, Arnd Bergmann,
	Greg Kroah-Hartman, Josh Triplett, Steven Rostedt,
	Mathieu Desnoyers, Lai Jiangshan, Peter Zijlstra, Ingo Molnar,
	Tejun Heo, Michael Ellerman, Aneesh Kumar K.V,
	Kirill A. Shutemov, Andrew Morton, Dan Williams, Jan Kara,
	Josef Bacik, Thomas Gleixner, Andrey Ryabinin,
	Nikolay Aleksandrov, Dmitry Vyukov, linux-kernel,
	kernel-hardening

Similar to the list_add() debug consolidation, this consolidates the
debug checking performed during CONFIG_DEBUG_LIST, and stops list
updates when corruption is found.

Refactored from same hardening in PaX and Grsecurity.

Signed-off-by: Kees Cook <keescook@chromium.org>
---
 include/linux/list.h | 15 +++++++++------
 lib/list_debug.c     | 53 +++++++++++++++++++++++-----------------------------
 2 files changed, 32 insertions(+), 36 deletions(-)

diff --git a/include/linux/list.h b/include/linux/list.h
index c38ff652ab59..09b857a0dda1 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -32,6 +32,7 @@ static inline void INIT_LIST_HEAD(struct list_head *list)
 extern bool __list_add_debug(struct list_head *new,
 			      struct list_head *prev,
 			      struct list_head *next);
+extern bool __list_del_entry_debug(struct list_head *entry);
 #else
 static inline bool __list_add_debug(struct list_head *new,
 				struct list_head *prev,
@@ -39,6 +40,10 @@ static inline bool __list_add_debug(struct list_head *new,
 {
 	return true;
 }
+static inline bool __list_del_entry_debug(struct list_head *entry)
+{
+	return true;
+}
 #endif
 
 /*
@@ -106,22 +111,20 @@ static inline void __list_del(struct list_head * prev, struct list_head * next)
  * Note: list_empty() on entry does not return true after this, the entry is
  * in an undefined state.
  */
-#ifndef CONFIG_DEBUG_LIST
 static inline void __list_del_entry(struct list_head *entry)
 {
+	if (!__list_del_entry_debug(entry))
+		return;
+
 	__list_del(entry->prev, entry->next);
 }
 
 static inline void list_del(struct list_head *entry)
 {
-	__list_del(entry->prev, entry->next);
+	__list_del_entry(entry);
 	entry->next = LIST_POISON1;
 	entry->prev = LIST_POISON2;
 }
-#else
-extern void __list_del_entry(struct list_head *entry);
-extern void list_del(struct list_head *entry);
-#endif
 
 /**
  * list_replace - replace old entry by new one
diff --git a/lib/list_debug.c b/lib/list_debug.c
index e90a931bd5ab..80e2e40a6a4e 100644
--- a/lib/list_debug.c
+++ b/lib/list_debug.c
@@ -42,41 +42,34 @@ bool __list_add_debug(struct list_head *new,
 }
 EXPORT_SYMBOL(__list_add_debug);
 
-void __list_del_entry(struct list_head *entry)
+bool __list_del_entry_debug(struct list_head *entry)
 {
 	struct list_head *prev, *next;
 
 	prev = entry->prev;
 	next = entry->next;
 
-	if (WARN(next == LIST_POISON1,
-		"list_del corruption, %p->next is LIST_POISON1 (%p)\n",
-		entry, LIST_POISON1) ||
-	    WARN(prev == LIST_POISON2,
-		"list_del corruption, %p->prev is LIST_POISON2 (%p)\n",
-		entry, LIST_POISON2) ||
-	    WARN(prev->next != entry,
-		"list_del corruption. prev->next should be %p, "
-		"but was %p\n", entry, prev->next) ||
-	    WARN(next->prev != entry,
-		"list_del corruption. next->prev should be %p, "
-		"but was %p\n", entry, next->prev))
-		return;
-
-	__list_del(prev, next);
-}
-EXPORT_SYMBOL(__list_del_entry);
+	if (unlikely(next == LIST_POISON1)) {
+		WARN(1, "list_del corruption, %p->next is LIST_POISON1 (%p)\n",
+			entry, LIST_POISON1);
+		return false;
+	}
+	if (unlikely(prev == LIST_POISON2)) {
+		WARN(1, "list_del corruption, %p->prev is LIST_POISON2 (%p)\n",
+			entry, LIST_POISON2);
+		return false;
+	}
+	if (unlikely(prev->next != entry)) {
+		WARN(1, "list_del corruption. prev->next should be %p, but was %p\n",
+			entry, prev->next);
+		return false;
+	}
+	if (unlikely(next->prev != entry)) {
+		WARN(1, "list_del corruption. next->prev should be %p, but was %p\n",
+			entry, next->prev);
+		return false;
+	}
+	return true;
 
-/**
- * list_del - deletes entry from list.
- * @entry: the element to delete from the list.
- * Note: list_empty on entry does not return true after this, the entry is
- * in an undefined state.
- */
-void list_del(struct list_head *entry)
-{
-	__list_del_entry(entry);
-	entry->next = LIST_POISON1;
-	entry->prev = LIST_POISON2;
 }
-EXPORT_SYMBOL(list_del);
+EXPORT_SYMBOL(__list_del_entry_debug);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 4/5] bug: Provide toggle for BUG on data corruption
  2016-08-16 21:11 [PATCH 0/5] bug: Provide toggle for BUG on data corruption Kees Cook
                   ` (2 preceding siblings ...)
  2016-08-16 21:11 ` [PATCH 3/5] list: Split list_del() debug checking into separate function Kees Cook
@ 2016-08-16 21:11 ` Kees Cook
  2016-08-16 21:26   ` Paul E. McKenney
  2016-08-16 21:50   ` Laura Abbott
  2016-08-16 21:11 ` [PATCH 5/5] lkdtm: Add tests for struct list corruption Kees Cook
  4 siblings, 2 replies; 18+ messages in thread
From: Kees Cook @ 2016-08-16 21:11 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: Kees Cook, Stephen Boyd, Daniel Micay, Arnd Bergmann,
	Greg Kroah-Hartman, Josh Triplett, Steven Rostedt,
	Mathieu Desnoyers, Lai Jiangshan, Peter Zijlstra, Ingo Molnar,
	Tejun Heo, Michael Ellerman, Aneesh Kumar K.V,
	Kirill A. Shutemov, Andrew Morton, Dan Williams, Jan Kara,
	Josef Bacik, Thomas Gleixner, Andrey Ryabinin,
	Nikolay Aleksandrov, Dmitry Vyukov, linux-kernel,
	kernel-hardening

The kernel checks for several cases of data structure corruption under
either normal runtime, or under various CONFIG_DEBUG_* settings. When
corruption is detected, some systems may want to BUG() immediately instead
of letting the corruption continue. Many of these manipulation primitives
can be used by security flaws to gain arbitrary memory write control. This
provides CONFIG_BUG_ON_CORRUPTION to control newly added BUG() locations.

This is inspired by similar hardening in PaX and Grsecurity, and by
Stephen Boyd in MSM kernels.

Signed-off-by: Kees Cook <keescook@chromium.org>
---
 include/linux/bug.h             |  7 +++++++
 kernel/locking/spinlock_debug.c |  1 +
 kernel/workqueue.c              |  2 ++
 lib/Kconfig.debug               | 10 ++++++++++
 lib/list_debug.c                |  7 +++++++
 5 files changed, 27 insertions(+)

diff --git a/include/linux/bug.h b/include/linux/bug.h
index e51b0709e78d..7e69758dd798 100644
--- a/include/linux/bug.h
+++ b/include/linux/bug.h
@@ -118,4 +118,11 @@ static inline enum bug_trap_type report_bug(unsigned long bug_addr,
 }
 
 #endif	/* CONFIG_GENERIC_BUG */
+
+#ifdef CONFIG_BUG_ON_CORRUPTION
+# define CORRUPTED_DATA_STRUCTURE	true
+#else
+# define CORRUPTED_DATA_STRUCTURE	false
+#endif
+
 #endif	/* _LINUX_BUG_H */
diff --git a/kernel/locking/spinlock_debug.c b/kernel/locking/spinlock_debug.c
index 0374a596cffa..d5f833769feb 100644
--- a/kernel/locking/spinlock_debug.c
+++ b/kernel/locking/spinlock_debug.c
@@ -64,6 +64,7 @@ static void spin_dump(raw_spinlock_t *lock, const char *msg)
 		owner ? owner->comm : "<none>",
 		owner ? task_pid_nr(owner) : -1,
 		lock->owner_cpu);
+	BUG_ON(CORRUPTED_DATA_STRUCTURE);
 	dump_stack();
 }
 
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index ef071ca73fc3..ea0132b55eca 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -48,6 +48,7 @@
 #include <linux/nodemask.h>
 #include <linux/moduleparam.h>
 #include <linux/uaccess.h>
+#include <linux/bug.h>
 
 #include "workqueue_internal.h"
 
@@ -2108,6 +2109,7 @@ __acquires(&pool->lock)
 		       current->comm, preempt_count(), task_pid_nr(current),
 		       worker->current_func);
 		debug_show_held_locks(current);
+		BUG_ON(CORRUPTED_DATA_STRUCTURE);
 		dump_stack();
 	}
 
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 2307d7c89dac..d64bd6b6fd45 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1987,6 +1987,16 @@ config TEST_STATIC_KEYS
 
 	  If unsure, say N.
 
+config BUG_ON_CORRUPTION
+	bool "Trigger a BUG when data corruption is detected"
+	help
+	  Select this option if the kernel should BUG when it encounters
+	  data corruption in various kernel memory structures during checks
+	  added by options like CONFIG_DEBUG_LIST, CONFIG_DEBUG_SPINLOCK,
+	  etc.
+
+	  If unsure, say N.
+
 source "samples/Kconfig"
 
 source "lib/Kconfig.kgdb"
diff --git a/lib/list_debug.c b/lib/list_debug.c
index 80e2e40a6a4e..161c7e7d3478 100644
--- a/lib/list_debug.c
+++ b/lib/list_debug.c
@@ -26,16 +26,19 @@ bool __list_add_debug(struct list_head *new,
 	if (unlikely(next->prev != prev)) {
 		WARN(1, "list_add corruption. next->prev should be prev (%p), but was %p. (next=%p).\n",
 			prev, next->prev, next);
+		BUG_ON(CORRUPTED_DATA_STRUCTURE);
 		return false;
 	}
 	if (unlikely(prev->next != next)) {
 		WARN(1, "list_add corruption. prev->next should be next (%p), but was %p. (prev=%p).\n",
 			next, prev->next, prev);
+		BUG_ON(CORRUPTED_DATA_STRUCTURE);
 		return false;
 	}
 	if (unlikely(new == prev || new == next)) {
 		WARN(1, "list_add double add: new=%p, prev=%p, next=%p.\n",
 			new, prev, next);
+		BUG_ON(CORRUPTED_DATA_STRUCTURE);
 		return false;
 	}
 	return true;
@@ -52,21 +55,25 @@ bool __list_del_entry_debug(struct list_head *entry)
 	if (unlikely(next == LIST_POISON1)) {
 		WARN(1, "list_del corruption, %p->next is LIST_POISON1 (%p)\n",
 			entry, LIST_POISON1);
+		BUG_ON(CORRUPTED_DATA_STRUCTURE);
 		return false;
 	}
 	if (unlikely(prev == LIST_POISON2)) {
 		WARN(1, "list_del corruption, %p->prev is LIST_POISON2 (%p)\n",
 			entry, LIST_POISON2);
+		BUG_ON(CORRUPTED_DATA_STRUCTURE);
 		return false;
 	}
 	if (unlikely(prev->next != entry)) {
 		WARN(1, "list_del corruption. prev->next should be %p, but was %p\n",
 			entry, prev->next);
+		BUG_ON(CORRUPTED_DATA_STRUCTURE);
 		return false;
 	}
 	if (unlikely(next->prev != entry)) {
 		WARN(1, "list_del corruption. next->prev should be %p, but was %p\n",
 			entry, next->prev);
+		BUG_ON(CORRUPTED_DATA_STRUCTURE);
 		return false;
 	}
 	return true;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 5/5] lkdtm: Add tests for struct list corruption
  2016-08-16 21:11 [PATCH 0/5] bug: Provide toggle for BUG on data corruption Kees Cook
                   ` (3 preceding siblings ...)
  2016-08-16 21:11 ` [PATCH 4/5] bug: Provide toggle for BUG on data corruption Kees Cook
@ 2016-08-16 21:11 ` Kees Cook
  4 siblings, 0 replies; 18+ messages in thread
From: Kees Cook @ 2016-08-16 21:11 UTC (permalink / raw)
  To: Paul E . McKenney
  Cc: Kees Cook, Stephen Boyd, Daniel Micay, Arnd Bergmann,
	Greg Kroah-Hartman, Josh Triplett, Steven Rostedt,
	Mathieu Desnoyers, Lai Jiangshan, Peter Zijlstra, Ingo Molnar,
	Tejun Heo, Michael Ellerman, Aneesh Kumar K.V,
	Kirill A. Shutemov, Andrew Morton, Dan Williams, Jan Kara,
	Josef Bacik, Thomas Gleixner, Andrey Ryabinin,
	Nikolay Aleksandrov, Dmitry Vyukov, linux-kernel,
	kernel-hardening

When building under CONFIG_DEBUG_LIST, list addition and removal will be
sanity-checked. This validates that the check is working as expected by
setting up classic corruption attacks against list manipulations, available
with the new lkdtm tests CORRUPT_LIST_ADD and CORRUPT_LIST_DEL.

Signed-off-by: Kees Cook <keescook@chromium.org>
---
 drivers/misc/lkdtm.h      |  2 ++
 drivers/misc/lkdtm_bugs.c | 68 +++++++++++++++++++++++++++++++++++++++++++++++
 drivers/misc/lkdtm_core.c |  2 ++
 3 files changed, 72 insertions(+)

diff --git a/drivers/misc/lkdtm.h b/drivers/misc/lkdtm.h
index fdf954c2107f..cfa1039c62e7 100644
--- a/drivers/misc/lkdtm.h
+++ b/drivers/misc/lkdtm.h
@@ -21,6 +21,8 @@ void lkdtm_SPINLOCKUP(void);
 void lkdtm_HUNG_TASK(void);
 void lkdtm_ATOMIC_UNDERFLOW(void);
 void lkdtm_ATOMIC_OVERFLOW(void);
+void lkdtm_CORRUPT_LIST_ADD(void);
+void lkdtm_CORRUPT_LIST_DEL(void);
 
 /* lkdtm_heap.c */
 void lkdtm_OVERWRITE_ALLOCATION(void);
diff --git a/drivers/misc/lkdtm_bugs.c b/drivers/misc/lkdtm_bugs.c
index 182ae1894b32..f336206d4b1f 100644
--- a/drivers/misc/lkdtm_bugs.c
+++ b/drivers/misc/lkdtm_bugs.c
@@ -5,8 +5,13 @@
  * test source files.
  */
 #include "lkdtm.h"
+#include <linux/list.h>
 #include <linux/sched.h>
 
+struct lkdtm_list {
+	struct list_head node;
+};
+
 /*
  * Make sure our attempts to over run the kernel stack doesn't trigger
  * a compiler warning when CONFIG_FRAME_WARN is set. Then make sure we
@@ -146,3 +151,66 @@ void lkdtm_ATOMIC_OVERFLOW(void)
 	pr_info("attempting bad atomic overflow\n");
 	atomic_inc(&over);
 }
+
+void lkdtm_CORRUPT_LIST_ADD(void)
+{
+	/*
+	 * Initially, an empty list via LIST_HEAD:
+	 *	test_head.next = &test_head
+	 *	test_head.prev = &test_head
+	 */
+	LIST_HEAD(test_head);
+	struct lkdtm_list good, bad;
+	void *target[2] = { };
+	void *redirection = &target;
+
+	pr_info("attempting good list addition\n");
+
+	/*
+	 * Adding to the list performs these actions:
+	 *	test_head.next->prev = &good.node
+	 *	good.node.next = test_head.next
+	 *	good.node.prev = test_head
+	 *	test_head.next = good.node
+	 */
+	list_add(&good.node, &test_head);
+
+	pr_info("attempting corrupted list addition\n");
+	/*
+	 * In simulating this "write what where" primitive, the "what" is
+	 * the address of &bad.node, and the "where" is the address held
+	 * by "redirection".
+	 */
+	test_head.next = redirection;
+	list_add(&bad.node, &test_head);
+
+	if (target[0] == NULL && target[1] == NULL)
+		pr_err("Overwrite did not happen, but no BUG?!\n");
+	else
+		pr_err("list_add() corruption not detected!\n");
+}
+
+void lkdtm_CORRUPT_LIST_DEL(void)
+{
+	LIST_HEAD(test_head);
+	struct lkdtm_list item;
+	void *target[2] = { };
+	void *redirection = &target;
+
+	list_add(&item.node, &test_head);
+
+	pr_info("attempting good list removal\n");
+	list_del(&item.node);
+
+	pr_info("attempting corrupted list removal\n");
+	list_add(&item.node, &test_head);
+
+	/* As with the list_add() test above, this corrupts "next". */
+	item.node.next = redirection;
+	list_del(&item.node);
+
+	if (target[0] == NULL && target[1] == NULL)
+		pr_err("Overwrite did not happen, but no BUG?!\n");
+	else
+		pr_err("list_del() corruption not detected!\n");
+}
diff --git a/drivers/misc/lkdtm_core.c b/drivers/misc/lkdtm_core.c
index f9154b8d67f6..7eeb71a75549 100644
--- a/drivers/misc/lkdtm_core.c
+++ b/drivers/misc/lkdtm_core.c
@@ -197,6 +197,8 @@ struct crashtype crashtypes[] = {
 	CRASHTYPE(EXCEPTION),
 	CRASHTYPE(LOOP),
 	CRASHTYPE(OVERFLOW),
+	CRASHTYPE(CORRUPT_LIST_ADD),
+	CRASHTYPE(CORRUPT_LIST_DEL),
 	CRASHTYPE(CORRUPT_STACK),
 	CRASHTYPE(UNALIGNED_LOAD_STORE_WRITE),
 	CRASHTYPE(OVERWRITE_ALLOCATION),
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/5] bug: Provide toggle for BUG on data corruption
  2016-08-16 21:11 ` [PATCH 4/5] bug: Provide toggle for BUG on data corruption Kees Cook
@ 2016-08-16 21:26   ` Paul E. McKenney
  2016-08-16 21:42     ` Kees Cook
  2016-08-16 21:50   ` Laura Abbott
  1 sibling, 1 reply; 18+ messages in thread
From: Paul E. McKenney @ 2016-08-16 21:26 UTC (permalink / raw)
  To: Kees Cook
  Cc: Stephen Boyd, Daniel Micay, Arnd Bergmann, Greg Kroah-Hartman,
	Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	Peter Zijlstra, Ingo Molnar, Tejun Heo, Michael Ellerman,
	Aneesh Kumar K.V, Kirill A. Shutemov, Andrew Morton,
	Dan Williams, Jan Kara, Josef Bacik, Thomas Gleixner,
	Andrey Ryabinin, Nikolay Aleksandrov, Dmitry Vyukov,
	linux-kernel, kernel-hardening

On Tue, Aug 16, 2016 at 02:11:04PM -0700, Kees Cook wrote:
> The kernel checks for several cases of data structure corruption under
> either normal runtime, or under various CONFIG_DEBUG_* settings. When
> corruption is detected, some systems may want to BUG() immediately instead
> of letting the corruption continue. Many of these manipulation primitives
> can be used by security flaws to gain arbitrary memory write control. This
> provides CONFIG_BUG_ON_CORRUPTION to control newly added BUG() locations.
> 
> This is inspired by similar hardening in PaX and Grsecurity, and by
> Stephen Boyd in MSM kernels.
> 
> Signed-off-by: Kees Cook <keescook@chromium.org>

OK, I will bite...  Why both the WARN() and the BUG_ON()?

								Thanx, Paul

> ---
>  include/linux/bug.h             |  7 +++++++
>  kernel/locking/spinlock_debug.c |  1 +
>  kernel/workqueue.c              |  2 ++
>  lib/Kconfig.debug               | 10 ++++++++++
>  lib/list_debug.c                |  7 +++++++
>  5 files changed, 27 insertions(+)
> 
> diff --git a/include/linux/bug.h b/include/linux/bug.h
> index e51b0709e78d..7e69758dd798 100644
> --- a/include/linux/bug.h
> +++ b/include/linux/bug.h
> @@ -118,4 +118,11 @@ static inline enum bug_trap_type report_bug(unsigned long bug_addr,
>  }
> 
>  #endif	/* CONFIG_GENERIC_BUG */
> +
> +#ifdef CONFIG_BUG_ON_CORRUPTION
> +# define CORRUPTED_DATA_STRUCTURE	true
> +#else
> +# define CORRUPTED_DATA_STRUCTURE	false
> +#endif
> +
>  #endif	/* _LINUX_BUG_H */
> diff --git a/kernel/locking/spinlock_debug.c b/kernel/locking/spinlock_debug.c
> index 0374a596cffa..d5f833769feb 100644
> --- a/kernel/locking/spinlock_debug.c
> +++ b/kernel/locking/spinlock_debug.c
> @@ -64,6 +64,7 @@ static void spin_dump(raw_spinlock_t *lock, const char *msg)
>  		owner ? owner->comm : "<none>",
>  		owner ? task_pid_nr(owner) : -1,
>  		lock->owner_cpu);
> +	BUG_ON(CORRUPTED_DATA_STRUCTURE);
>  	dump_stack();
>  }
> 
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index ef071ca73fc3..ea0132b55eca 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -48,6 +48,7 @@
>  #include <linux/nodemask.h>
>  #include <linux/moduleparam.h>
>  #include <linux/uaccess.h>
> +#include <linux/bug.h>
> 
>  #include "workqueue_internal.h"
> 
> @@ -2108,6 +2109,7 @@ __acquires(&pool->lock)
>  		       current->comm, preempt_count(), task_pid_nr(current),
>  		       worker->current_func);
>  		debug_show_held_locks(current);
> +		BUG_ON(CORRUPTED_DATA_STRUCTURE);
>  		dump_stack();
>  	}
> 
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 2307d7c89dac..d64bd6b6fd45 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -1987,6 +1987,16 @@ config TEST_STATIC_KEYS
> 
>  	  If unsure, say N.
> 
> +config BUG_ON_CORRUPTION
> +	bool "Trigger a BUG when data corruption is detected"
> +	help
> +	  Select this option if the kernel should BUG when it encounters
> +	  data corruption in various kernel memory structures during checks
> +	  added by options like CONFIG_DEBUG_LIST, CONFIG_DEBUG_SPINLOCK,
> +	  etc.
> +
> +	  If unsure, say N.
> +
>  source "samples/Kconfig"
> 
>  source "lib/Kconfig.kgdb"
> diff --git a/lib/list_debug.c b/lib/list_debug.c
> index 80e2e40a6a4e..161c7e7d3478 100644
> --- a/lib/list_debug.c
> +++ b/lib/list_debug.c
> @@ -26,16 +26,19 @@ bool __list_add_debug(struct list_head *new,
>  	if (unlikely(next->prev != prev)) {
>  		WARN(1, "list_add corruption. next->prev should be prev (%p), but was %p. (next=%p).\n",
>  			prev, next->prev, next);
> +		BUG_ON(CORRUPTED_DATA_STRUCTURE);
>  		return false;
>  	}
>  	if (unlikely(prev->next != next)) {
>  		WARN(1, "list_add corruption. prev->next should be next (%p), but was %p. (prev=%p).\n",
>  			next, prev->next, prev);
> +		BUG_ON(CORRUPTED_DATA_STRUCTURE);
>  		return false;
>  	}
>  	if (unlikely(new == prev || new == next)) {
>  		WARN(1, "list_add double add: new=%p, prev=%p, next=%p.\n",
>  			new, prev, next);
> +		BUG_ON(CORRUPTED_DATA_STRUCTURE);
>  		return false;
>  	}
>  	return true;
> @@ -52,21 +55,25 @@ bool __list_del_entry_debug(struct list_head *entry)
>  	if (unlikely(next == LIST_POISON1)) {
>  		WARN(1, "list_del corruption, %p->next is LIST_POISON1 (%p)\n",
>  			entry, LIST_POISON1);
> +		BUG_ON(CORRUPTED_DATA_STRUCTURE);
>  		return false;
>  	}
>  	if (unlikely(prev == LIST_POISON2)) {
>  		WARN(1, "list_del corruption, %p->prev is LIST_POISON2 (%p)\n",
>  			entry, LIST_POISON2);
> +		BUG_ON(CORRUPTED_DATA_STRUCTURE);
>  		return false;
>  	}
>  	if (unlikely(prev->next != entry)) {
>  		WARN(1, "list_del corruption. prev->next should be %p, but was %p\n",
>  			entry, prev->next);
> +		BUG_ON(CORRUPTED_DATA_STRUCTURE);
>  		return false;
>  	}
>  	if (unlikely(next->prev != entry)) {
>  		WARN(1, "list_del corruption. next->prev should be %p, but was %p\n",
>  			entry, next->prev);
> +		BUG_ON(CORRUPTED_DATA_STRUCTURE);
>  		return false;
>  	}
>  	return true;
> -- 
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/5] bug: Provide toggle for BUG on data corruption
  2016-08-16 21:26   ` Paul E. McKenney
@ 2016-08-16 21:42     ` Kees Cook
  2016-08-16 21:53       ` Steven Rostedt
  2016-08-17  0:01       ` Paul E. McKenney
  0 siblings, 2 replies; 18+ messages in thread
From: Kees Cook @ 2016-08-16 21:42 UTC (permalink / raw)
  To: Paul McKenney
  Cc: Stephen Boyd, Daniel Micay, Arnd Bergmann, Greg Kroah-Hartman,
	Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	Peter Zijlstra, Ingo Molnar, Tejun Heo, Michael Ellerman,
	Aneesh Kumar K.V, Kirill A. Shutemov, Andrew Morton,
	Dan Williams, Jan Kara, Josef Bacik, Thomas Gleixner,
	Andrey Ryabinin, Nikolay Aleksandrov, Dmitry Vyukov, LKML,
	kernel-hardening, Joe Perches

On Tue, Aug 16, 2016 at 2:26 PM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> On Tue, Aug 16, 2016 at 02:11:04PM -0700, Kees Cook wrote:
>> The kernel checks for several cases of data structure corruption under
>> either normal runtime, or under various CONFIG_DEBUG_* settings. When
>> corruption is detected, some systems may want to BUG() immediately instead
>> of letting the corruption continue. Many of these manipulation primitives
>> can be used by security flaws to gain arbitrary memory write control. This
>> provides CONFIG_BUG_ON_CORRUPTION to control newly added BUG() locations.
>>
>> This is inspired by similar hardening in PaX and Grsecurity, and by
>> Stephen Boyd in MSM kernels.
>>
>> Signed-off-by: Kees Cook <keescook@chromium.org>
>
> OK, I will bite...  Why both the WARN() and the BUG_ON()?

Mostly because not every case of BUG(CORRUPTED_DATA_STRUCTURE) is
cleanly paired with a WARN (see the workqueue addition that wants to
dump locks too). I could rearrange things a bit, though, and create
something like:

#ifdef CONFIG_BUG_ON_CORRUPTION
# define CORRUPTED(format...) { \
    printk(KERN_ERR format); \
    BUG(); \
}
#else
# define CORRUPTED(format...) WARN(1, format)
#endif

What do you think?

-Kees

>
>                                                                 Thanx, Paul
>
>> ---
>>  include/linux/bug.h             |  7 +++++++
>>  kernel/locking/spinlock_debug.c |  1 +
>>  kernel/workqueue.c              |  2 ++
>>  lib/Kconfig.debug               | 10 ++++++++++
>>  lib/list_debug.c                |  7 +++++++
>>  5 files changed, 27 insertions(+)
>>
>> diff --git a/include/linux/bug.h b/include/linux/bug.h
>> index e51b0709e78d..7e69758dd798 100644
>> --- a/include/linux/bug.h
>> +++ b/include/linux/bug.h
>> @@ -118,4 +118,11 @@ static inline enum bug_trap_type report_bug(unsigned long bug_addr,
>>  }
>>
>>  #endif       /* CONFIG_GENERIC_BUG */
>> +
>> +#ifdef CONFIG_BUG_ON_CORRUPTION
>> +# define CORRUPTED_DATA_STRUCTURE    true
>> +#else
>> +# define CORRUPTED_DATA_STRUCTURE    false
>> +#endif
>> +
>>  #endif       /* _LINUX_BUG_H */
>> diff --git a/kernel/locking/spinlock_debug.c b/kernel/locking/spinlock_debug.c
>> index 0374a596cffa..d5f833769feb 100644
>> --- a/kernel/locking/spinlock_debug.c
>> +++ b/kernel/locking/spinlock_debug.c
>> @@ -64,6 +64,7 @@ static void spin_dump(raw_spinlock_t *lock, const char *msg)
>>               owner ? owner->comm : "<none>",
>>               owner ? task_pid_nr(owner) : -1,
>>               lock->owner_cpu);
>> +     BUG_ON(CORRUPTED_DATA_STRUCTURE);
>>       dump_stack();
>>  }
>>
>> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
>> index ef071ca73fc3..ea0132b55eca 100644
>> --- a/kernel/workqueue.c
>> +++ b/kernel/workqueue.c
>> @@ -48,6 +48,7 @@
>>  #include <linux/nodemask.h>
>>  #include <linux/moduleparam.h>
>>  #include <linux/uaccess.h>
>> +#include <linux/bug.h>
>>
>>  #include "workqueue_internal.h"
>>
>> @@ -2108,6 +2109,7 @@ __acquires(&pool->lock)
>>                      current->comm, preempt_count(), task_pid_nr(current),
>>                      worker->current_func);
>>               debug_show_held_locks(current);
>> +             BUG_ON(CORRUPTED_DATA_STRUCTURE);
>>               dump_stack();
>>       }
>>
>> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
>> index 2307d7c89dac..d64bd6b6fd45 100644
>> --- a/lib/Kconfig.debug
>> +++ b/lib/Kconfig.debug
>> @@ -1987,6 +1987,16 @@ config TEST_STATIC_KEYS
>>
>>         If unsure, say N.
>>
>> +config BUG_ON_CORRUPTION
>> +     bool "Trigger a BUG when data corruption is detected"
>> +     help
>> +       Select this option if the kernel should BUG when it encounters
>> +       data corruption in various kernel memory structures during checks
>> +       added by options like CONFIG_DEBUG_LIST, CONFIG_DEBUG_SPINLOCK,
>> +       etc.
>> +
>> +       If unsure, say N.
>> +
>>  source "samples/Kconfig"
>>
>>  source "lib/Kconfig.kgdb"
>> diff --git a/lib/list_debug.c b/lib/list_debug.c
>> index 80e2e40a6a4e..161c7e7d3478 100644
>> --- a/lib/list_debug.c
>> +++ b/lib/list_debug.c
>> @@ -26,16 +26,19 @@ bool __list_add_debug(struct list_head *new,
>>       if (unlikely(next->prev != prev)) {
>>               WARN(1, "list_add corruption. next->prev should be prev (%p), but was %p. (next=%p).\n",
>>                       prev, next->prev, next);
>> +             BUG_ON(CORRUPTED_DATA_STRUCTURE);
>>               return false;
>>       }
>>       if (unlikely(prev->next != next)) {
>>               WARN(1, "list_add corruption. prev->next should be next (%p), but was %p. (prev=%p).\n",
>>                       next, prev->next, prev);
>> +             BUG_ON(CORRUPTED_DATA_STRUCTURE);
>>               return false;
>>       }
>>       if (unlikely(new == prev || new == next)) {
>>               WARN(1, "list_add double add: new=%p, prev=%p, next=%p.\n",
>>                       new, prev, next);
>> +             BUG_ON(CORRUPTED_DATA_STRUCTURE);
>>               return false;
>>       }
>>       return true;
>> @@ -52,21 +55,25 @@ bool __list_del_entry_debug(struct list_head *entry)
>>       if (unlikely(next == LIST_POISON1)) {
>>               WARN(1, "list_del corruption, %p->next is LIST_POISON1 (%p)\n",
>>                       entry, LIST_POISON1);
>> +             BUG_ON(CORRUPTED_DATA_STRUCTURE);
>>               return false;
>>       }
>>       if (unlikely(prev == LIST_POISON2)) {
>>               WARN(1, "list_del corruption, %p->prev is LIST_POISON2 (%p)\n",
>>                       entry, LIST_POISON2);
>> +             BUG_ON(CORRUPTED_DATA_STRUCTURE);
>>               return false;
>>       }
>>       if (unlikely(prev->next != entry)) {
>>               WARN(1, "list_del corruption. prev->next should be %p, but was %p\n",
>>                       entry, prev->next);
>> +             BUG_ON(CORRUPTED_DATA_STRUCTURE);
>>               return false;
>>       }
>>       if (unlikely(next->prev != entry)) {
>>               WARN(1, "list_del corruption. next->prev should be %p, but was %p\n",
>>                       entry, next->prev);
>> +             BUG_ON(CORRUPTED_DATA_STRUCTURE);
>>               return false;
>>       }
>>       return true;
>> --
>> 2.7.4
>>
>



-- 
Kees Cook
Nexus Security

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/5] bug: Provide toggle for BUG on data corruption
  2016-08-16 21:11 ` [PATCH 4/5] bug: Provide toggle for BUG on data corruption Kees Cook
  2016-08-16 21:26   ` Paul E. McKenney
@ 2016-08-16 21:50   ` Laura Abbott
  2016-08-16 23:11     ` Kees Cook
  1 sibling, 1 reply; 18+ messages in thread
From: Laura Abbott @ 2016-08-16 21:50 UTC (permalink / raw)
  To: Kees Cook, Paul E . McKenney
  Cc: Stephen Boyd, Daniel Micay, Arnd Bergmann, Greg Kroah-Hartman,
	Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	Peter Zijlstra, Ingo Molnar, Tejun Heo, Michael Ellerman,
	Aneesh Kumar K.V, Kirill A. Shutemov, Andrew Morton,
	Dan Williams, Jan Kara, Josef Bacik, Thomas Gleixner,
	Andrey Ryabinin, Nikolay Aleksandrov, Dmitry Vyukov,
	linux-kernel, kernel-hardening

On 08/16/2016 02:11 PM, Kees Cook wrote:
> The kernel checks for several cases of data structure corruption under
> either normal runtime, or under various CONFIG_DEBUG_* settings. When
> corruption is detected, some systems may want to BUG() immediately instead
> of letting the corruption continue. Many of these manipulation primitives
> can be used by security flaws to gain arbitrary memory write control. This
> provides CONFIG_BUG_ON_CORRUPTION to control newly added BUG() locations.
>
> This is inspired by similar hardening in PaX and Grsecurity, and by
> Stephen Boyd in MSM kernels.
>

This was never my favorite patch in the MSM tree, mostly because it seemed
to proliferate in random places. Some of these like the list were legit
corruption but others like the spinlock and workqueue were indication of
more hardware induced corruption and less kernel or software bugs.
I'd rather see the detection added in structures specifically identified to
be vulnerable. spinlocks and workqueues don't seem to be high on the
list to me. If they are, I think this could use some explanation of why
these in particular deserve checks vs. any other place where we might
detect corruption.

> Signed-off-by: Kees Cook <keescook@chromium.org>
> ---
>  include/linux/bug.h             |  7 +++++++
>  kernel/locking/spinlock_debug.c |  1 +
>  kernel/workqueue.c              |  2 ++
>  lib/Kconfig.debug               | 10 ++++++++++
>  lib/list_debug.c                |  7 +++++++
>  5 files changed, 27 insertions(+)
>
> diff --git a/include/linux/bug.h b/include/linux/bug.h
> index e51b0709e78d..7e69758dd798 100644
> --- a/include/linux/bug.h
> +++ b/include/linux/bug.h
> @@ -118,4 +118,11 @@ static inline enum bug_trap_type report_bug(unsigned long bug_addr,
>  }
>
>  #endif	/* CONFIG_GENERIC_BUG */
> +
> +#ifdef CONFIG_BUG_ON_CORRUPTION
> +# define CORRUPTED_DATA_STRUCTURE	true
> +#else
> +# define CORRUPTED_DATA_STRUCTURE	false
> +#endif
> +
>  #endif	/* _LINUX_BUG_H */
> diff --git a/kernel/locking/spinlock_debug.c b/kernel/locking/spinlock_debug.c
> index 0374a596cffa..d5f833769feb 100644
> --- a/kernel/locking/spinlock_debug.c
> +++ b/kernel/locking/spinlock_debug.c
> @@ -64,6 +64,7 @@ static void spin_dump(raw_spinlock_t *lock, const char *msg)
>  		owner ? owner->comm : "<none>",
>  		owner ? task_pid_nr(owner) : -1,
>  		lock->owner_cpu);
> +	BUG_ON(CORRUPTED_DATA_STRUCTURE);
>  	dump_stack();
>  }
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index ef071ca73fc3..ea0132b55eca 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -48,6 +48,7 @@
>  #include <linux/nodemask.h>
>  #include <linux/moduleparam.h>
>  #include <linux/uaccess.h>
> +#include <linux/bug.h>
>
>  #include "workqueue_internal.h"
>
> @@ -2108,6 +2109,7 @@ __acquires(&pool->lock)
>  		       current->comm, preempt_count(), task_pid_nr(current),
>  		       worker->current_func);
>  		debug_show_held_locks(current);
> +		BUG_ON(CORRUPTED_DATA_STRUCTURE);
>  		dump_stack();
>  	}
>
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 2307d7c89dac..d64bd6b6fd45 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -1987,6 +1987,16 @@ config TEST_STATIC_KEYS
>
>  	  If unsure, say N.
>
> +config BUG_ON_CORRUPTION
> +	bool "Trigger a BUG when data corruption is detected"
> +	help
> +	  Select this option if the kernel should BUG when it encounters
> +	  data corruption in various kernel memory structures during checks
> +	  added by options like CONFIG_DEBUG_LIST, CONFIG_DEBUG_SPINLOCK,
> +	  etc.
> +
> +	  If unsure, say N.
> +
>  source "samples/Kconfig"
>
>  source "lib/Kconfig.kgdb"
> diff --git a/lib/list_debug.c b/lib/list_debug.c
> index 80e2e40a6a4e..161c7e7d3478 100644
> --- a/lib/list_debug.c
> +++ b/lib/list_debug.c
> @@ -26,16 +26,19 @@ bool __list_add_debug(struct list_head *new,
>  	if (unlikely(next->prev != prev)) {
>  		WARN(1, "list_add corruption. next->prev should be prev (%p), but was %p. (next=%p).\n",
>  			prev, next->prev, next);
> +		BUG_ON(CORRUPTED_DATA_STRUCTURE);
>  		return false;
>  	}
>  	if (unlikely(prev->next != next)) {
>  		WARN(1, "list_add corruption. prev->next should be next (%p), but was %p. (prev=%p).\n",
>  			next, prev->next, prev);
> +		BUG_ON(CORRUPTED_DATA_STRUCTURE);
>  		return false;
>  	}
>  	if (unlikely(new == prev || new == next)) {
>  		WARN(1, "list_add double add: new=%p, prev=%p, next=%p.\n",
>  			new, prev, next);
> +		BUG_ON(CORRUPTED_DATA_STRUCTURE);
>  		return false;
>  	}
>  	return true;
> @@ -52,21 +55,25 @@ bool __list_del_entry_debug(struct list_head *entry)
>  	if (unlikely(next == LIST_POISON1)) {
>  		WARN(1, "list_del corruption, %p->next is LIST_POISON1 (%p)\n",
>  			entry, LIST_POISON1);
> +		BUG_ON(CORRUPTED_DATA_STRUCTURE);
>  		return false;
>  	}
>  	if (unlikely(prev == LIST_POISON2)) {
>  		WARN(1, "list_del corruption, %p->prev is LIST_POISON2 (%p)\n",
>  			entry, LIST_POISON2);
> +		BUG_ON(CORRUPTED_DATA_STRUCTURE);
>  		return false;
>  	}
>  	if (unlikely(prev->next != entry)) {
>  		WARN(1, "list_del corruption. prev->next should be %p, but was %p\n",
>  			entry, prev->next);
> +		BUG_ON(CORRUPTED_DATA_STRUCTURE);
>  		return false;
>  	}
>  	if (unlikely(next->prev != entry)) {
>  		WARN(1, "list_del corruption. next->prev should be %p, but was %p\n",
>  			entry, next->prev);
> +		BUG_ON(CORRUPTED_DATA_STRUCTURE);
>  		return false;
>  	}
>  	return true;
>

The git history shows 924d9addb9b1 ("list debugging: use
WARN() instead of BUG()") deliberately switched this from BUG to WARN.
Do we still need the WARN at all or can we just switch to BUG permanently?

Thanks,
Laura

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/5] bug: Provide toggle for BUG on data corruption
  2016-08-16 21:42     ` Kees Cook
@ 2016-08-16 21:53       ` Steven Rostedt
  2016-08-16 21:57         ` Steven Rostedt
  2016-08-17  0:01       ` Paul E. McKenney
  1 sibling, 1 reply; 18+ messages in thread
From: Steven Rostedt @ 2016-08-16 21:53 UTC (permalink / raw)
  To: Kees Cook
  Cc: Paul McKenney, Stephen Boyd, Daniel Micay, Arnd Bergmann,
	Greg Kroah-Hartman, Josh Triplett, Mathieu Desnoyers,
	Lai Jiangshan, Peter Zijlstra, Ingo Molnar, Tejun Heo,
	Michael Ellerman, Aneesh Kumar K.V, Kirill A. Shutemov,
	Andrew Morton, Dan Williams, Jan Kara, Josef Bacik,
	Thomas Gleixner, Andrey Ryabinin, Nikolay Aleksandrov,
	Dmitry Vyukov, LKML, kernel-hardening, Joe Perches

On Tue, 16 Aug 2016 14:42:28 -0700
Kees Cook <keescook@chromium.org> wrote:


> > OK, I will bite...  Why both the WARN() and the BUG_ON()?  
> 
> Mostly because not every case of BUG(CORRUPTED_DATA_STRUCTURE) is
> cleanly paired with a WARN (see the workqueue addition that wants to
> dump locks too). I could rearrange things a bit, though, and create
> something like:
> 
> #ifdef CONFIG_BUG_ON_CORRUPTION
> # define CORRUPTED(format...) { \
>     printk(KERN_ERR format); \
>     BUG(); \
> }
> #else
> # define CORRUPTED(format...) WARN(1, format)
> #endif
> 
> What do you think?

I prefer what you currently have.

             WARN(1, "list_del corruption. next->prev should be %p, but was %p\n",
                     entry, next->prev);
             BUG_ON(CORRUPTED_DATA_STRUCTURE);

Will always warn (as stated by "1") and and the BUG_ON() will bug if
CORRUPTED_DATA_STRUCTURE is set. Although, I don't like that name. Can
we have a:

 BUG_ON(BUG_ON_CORRUPTED_DATA_STRUCTURES);

Or maybe have that as a macro:

#ifdef CONFIG_BUG_ON_CORRUPTION
# define BUG_ON_CORRUPTED_DATA_STRUCTURE() BUG_ON(1)
#else
# define BUG_ON_CORRUPTED_DATA_STRUCTURE() do {} while (0)
#endif

Then we can have:

             WARN(1, "list_del corruption. next->prev should be %p, but was %p\n",
                     entry, next->prev);
             BUG_ON_CORRUPTED_DATA_STRUCTURE();

??

-- Steve


> 
> -Kees
> 
> >
> >                                                                 Thanx, Paul
> >  
> >> ---
> >>  include/linux/bug.h             |  7 +++++++
> >>  kernel/locking/spinlock_debug.c |  1 +
> >>  kernel/workqueue.c              |  2 ++
> >>  lib/Kconfig.debug               | 10 ++++++++++
> >>  lib/list_debug.c                |  7 +++++++
> >>  5 files changed, 27 insertions(+)
> >>
> >> diff --git a/include/linux/bug.h b/include/linux/bug.h
> >> index e51b0709e78d..7e69758dd798 100644
> >> --- a/include/linux/bug.h
> >> +++ b/include/linux/bug.h
> >> @@ -118,4 +118,11 @@ static inline enum bug_trap_type report_bug(unsigned long bug_addr,
> >>  }
> >>
> >>  #endif       /* CONFIG_GENERIC_BUG */
> >> +
> >> +#ifdef CONFIG_BUG_ON_CORRUPTION
> >> +# define CORRUPTED_DATA_STRUCTURE    true
> >> +#else
> >> +# define CORRUPTED_DATA_STRUCTURE    false
> >> +#endif
> >> +
> >>  #endif       /* _LINUX_BUG_H */
> >> diff --git a/kernel/locking/spinlock_debug.c b/kernel/locking/spinlock_debug.c
> >> index 0374a596cffa..d5f833769feb 100644
> >> --- a/kernel/locking/spinlock_debug.c
> >> +++ b/kernel/locking/spinlock_debug.c
> >> @@ -64,6 +64,7 @@ static void spin_dump(raw_spinlock_t *lock, const char *msg)
> >>               owner ? owner->comm : "<none>",
> >>               owner ? task_pid_nr(owner) : -1,
> >>               lock->owner_cpu);
> >> +     BUG_ON(CORRUPTED_DATA_STRUCTURE);
> >>       dump_stack();
> >>  }
> >>
> >> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> >> index ef071ca73fc3..ea0132b55eca 100644
> >> --- a/kernel/workqueue.c
> >> +++ b/kernel/workqueue.c
> >> @@ -48,6 +48,7 @@
> >>  #include <linux/nodemask.h>
> >>  #include <linux/moduleparam.h>
> >>  #include <linux/uaccess.h>
> >> +#include <linux/bug.h>
> >>
> >>  #include "workqueue_internal.h"
> >>
> >> @@ -2108,6 +2109,7 @@ __acquires(&pool->lock)
> >>                      current->comm, preempt_count(), task_pid_nr(current),
> >>                      worker->current_func);
> >>               debug_show_held_locks(current);
> >> +             BUG_ON(CORRUPTED_DATA_STRUCTURE);
> >>               dump_stack();
> >>       }
> >>
> >> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> >> index 2307d7c89dac..d64bd6b6fd45 100644
> >> --- a/lib/Kconfig.debug
> >> +++ b/lib/Kconfig.debug
> >> @@ -1987,6 +1987,16 @@ config TEST_STATIC_KEYS
> >>
> >>         If unsure, say N.
> >>
> >> +config BUG_ON_CORRUPTION
> >> +     bool "Trigger a BUG when data corruption is detected"
> >> +     help
> >> +       Select this option if the kernel should BUG when it encounters
> >> +       data corruption in various kernel memory structures during checks
> >> +       added by options like CONFIG_DEBUG_LIST, CONFIG_DEBUG_SPINLOCK,
> >> +       etc.
> >> +
> >> +       If unsure, say N.
> >> +
> >>  source "samples/Kconfig"
> >>
> >>  source "lib/Kconfig.kgdb"
> >> diff --git a/lib/list_debug.c b/lib/list_debug.c
> >> index 80e2e40a6a4e..161c7e7d3478 100644
> >> --- a/lib/list_debug.c
> >> +++ b/lib/list_debug.c
> >> @@ -26,16 +26,19 @@ bool __list_add_debug(struct list_head *new,
> >>       if (unlikely(next->prev != prev)) {
> >>               WARN(1, "list_add corruption. next->prev should be prev (%p), but was %p. (next=%p).\n",
> >>                       prev, next->prev, next);
> >> +             BUG_ON(CORRUPTED_DATA_STRUCTURE);
> >>               return false;
> >>       }
> >>       if (unlikely(prev->next != next)) {
> >>               WARN(1, "list_add corruption. prev->next should be next (%p), but was %p. (prev=%p).\n",
> >>                       next, prev->next, prev);
> >> +             BUG_ON(CORRUPTED_DATA_STRUCTURE);
> >>               return false;
> >>       }
> >>       if (unlikely(new == prev || new == next)) {
> >>               WARN(1, "list_add double add: new=%p, prev=%p, next=%p.\n",
> >>                       new, prev, next);
> >> +             BUG_ON(CORRUPTED_DATA_STRUCTURE);
> >>               return false;
> >>       }
> >>       return true;
> >> @@ -52,21 +55,25 @@ bool __list_del_entry_debug(struct list_head *entry)
> >>       if (unlikely(next == LIST_POISON1)) {
> >>               WARN(1, "list_del corruption, %p->next is LIST_POISON1 (%p)\n",
> >>                       entry, LIST_POISON1);
> >> +             BUG_ON(CORRUPTED_DATA_STRUCTURE);
> >>               return false;
> >>       }
> >>       if (unlikely(prev == LIST_POISON2)) {
> >>               WARN(1, "list_del corruption, %p->prev is LIST_POISON2 (%p)\n",
> >>                       entry, LIST_POISON2);
> >> +             BUG_ON(CORRUPTED_DATA_STRUCTURE);
> >>               return false;
> >>       }
> >>       if (unlikely(prev->next != entry)) {
> >>               WARN(1, "list_del corruption. prev->next should be %p, but was %p\n",
> >>                       entry, prev->next);
> >> +             BUG_ON(CORRUPTED_DATA_STRUCTURE);
> >>               return false;
> >>       }
> >>       if (unlikely(next->prev != entry)) {
> >>               WARN(1, "list_del corruption. next->prev should be %p, but was %p\n",
> >>                       entry, next->prev);
> >> +             BUG_ON(CORRUPTED_DATA_STRUCTURE);
> >>               return false;
> >>       }
> >>       return true;
> >> --
> >> 2.7.4
> >>  
> >  
> 
> 
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/5] bug: Provide toggle for BUG on data corruption
  2016-08-16 21:53       ` Steven Rostedt
@ 2016-08-16 21:57         ` Steven Rostedt
  2016-08-16 23:14           ` Kees Cook
  0 siblings, 1 reply; 18+ messages in thread
From: Steven Rostedt @ 2016-08-16 21:57 UTC (permalink / raw)
  To: Kees Cook
  Cc: Paul McKenney, Stephen Boyd, Daniel Micay, Arnd Bergmann,
	Greg Kroah-Hartman, Josh Triplett, Mathieu Desnoyers,
	Lai Jiangshan, Peter Zijlstra, Ingo Molnar, Tejun Heo,
	Michael Ellerman, Aneesh Kumar K.V, Kirill A. Shutemov,
	Andrew Morton, Dan Williams, Jan Kara, Josef Bacik,
	Thomas Gleixner, Andrey Ryabinin, Nikolay Aleksandrov,
	Dmitry Vyukov, LKML, kernel-hardening, Joe Perches

On Tue, 16 Aug 2016 17:53:54 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:


>              WARN(1, "list_del corruption. next->prev should be %p, but was %p\n",
>                      entry, next->prev);
>              BUG_ON(CORRUPTED_DATA_STRUCTURE);
> 
> Will always warn (as stated by "1") and and the BUG_ON() will bug if
> CORRUPTED_DATA_STRUCTURE is set. Although, I don't like that name. Can
> we have a:
> 
>  BUG_ON(BUG_ON_CORRUPTED_DATA_STRUCTURES);
> 
> Or maybe have that as a macro:
> 
> #ifdef CONFIG_BUG_ON_CORRUPTION
> # define BUG_ON_CORRUPTED_DATA_STRUCTURE() BUG_ON(1)
> #else
> # define BUG_ON_CORRUPTED_DATA_STRUCTURE() do {} while (0)
> #endif
> 
> Then we can have:
> 
>              WARN(1, "list_del corruption. next->prev should be %p, but was %p\n",
>                      entry, next->prev);
>              BUG_ON_CORRUPTED_DATA_STRUCTURE();
> 
> ??
> 

Hmm, maybe better yet, just have it called "CORRUPTED_DATA_STRUCTURE()"
because it wont bug if the config is not set, and having "BUG_ON" in
the name, it might be somewhat confusing.

-- Steve

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/5] bug: Provide toggle for BUG on data corruption
  2016-08-16 21:50   ` Laura Abbott
@ 2016-08-16 23:11     ` Kees Cook
  0 siblings, 0 replies; 18+ messages in thread
From: Kees Cook @ 2016-08-16 23:11 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Paul E . McKenney, Stephen Boyd, Daniel Micay, Arnd Bergmann,
	Greg Kroah-Hartman, Josh Triplett, Steven Rostedt,
	Mathieu Desnoyers, Lai Jiangshan, Peter Zijlstra, Ingo Molnar,
	Tejun Heo, Michael Ellerman, Aneesh Kumar K.V,
	Kirill A. Shutemov, Andrew Morton, Dan Williams, Jan Kara,
	Josef Bacik, Thomas Gleixner, Andrey Ryabinin,
	Nikolay Aleksandrov, Dmitry Vyukov, LKML, kernel-hardening

On Tue, Aug 16, 2016 at 2:50 PM, Laura Abbott <labbott@redhat.com> wrote:
> On 08/16/2016 02:11 PM, Kees Cook wrote:
>>
>> The kernel checks for several cases of data structure corruption under
>> either normal runtime, or under various CONFIG_DEBUG_* settings. When
>> corruption is detected, some systems may want to BUG() immediately instead
>> of letting the corruption continue. Many of these manipulation primitives
>> can be used by security flaws to gain arbitrary memory write control. This
>> provides CONFIG_BUG_ON_CORRUPTION to control newly added BUG() locations.
>>
>> This is inspired by similar hardening in PaX and Grsecurity, and by
>> Stephen Boyd in MSM kernels.
>>
>
> This was never my favorite patch in the MSM tree, mostly because it seemed
> to proliferate in random places. Some of these like the list were legit
> corruption but others like the spinlock and workqueue were indication of
> more hardware induced corruption and less kernel or software bugs.
> I'd rather see the detection added in structures specifically identified to
> be vulnerable. spinlocks and workqueues don't seem to be high on the
> list to me. If they are, I think this could use some explanation of why
> these in particular deserve checks vs. any other place where we might
> detect corruption.

Ah, interesting. I was wondering about why those two were added,
especially the workqueue one which seemed to be a counter issue
(hopefully we'll get the counter protection in soon too).

Unless Stephen speaks up, I'll just drop the spinlock and workqueue chunks.

>> Signed-off-by: Kees Cook <keescook@chromium.org>
>> ---
>>  include/linux/bug.h             |  7 +++++++
>>  kernel/locking/spinlock_debug.c |  1 +
>>  kernel/workqueue.c              |  2 ++
>>  lib/Kconfig.debug               | 10 ++++++++++
>>  lib/list_debug.c                |  7 +++++++
>>  5 files changed, 27 insertions(+)
>>
>> diff --git a/include/linux/bug.h b/include/linux/bug.h
>> index e51b0709e78d..7e69758dd798 100644
>> --- a/include/linux/bug.h
>> +++ b/include/linux/bug.h
>> @@ -118,4 +118,11 @@ static inline enum bug_trap_type report_bug(unsigned
>> long bug_addr,
>>  }
>>
>>  #endif /* CONFIG_GENERIC_BUG */
>> +
>> +#ifdef CONFIG_BUG_ON_CORRUPTION
>> +# define CORRUPTED_DATA_STRUCTURE      true
>> +#else
>> +# define CORRUPTED_DATA_STRUCTURE      false
>> +#endif
>> +
>>  #endif /* _LINUX_BUG_H */
>> diff --git a/kernel/locking/spinlock_debug.c
>> b/kernel/locking/spinlock_debug.c
>> index 0374a596cffa..d5f833769feb 100644
>> --- a/kernel/locking/spinlock_debug.c
>> +++ b/kernel/locking/spinlock_debug.c
>> @@ -64,6 +64,7 @@ static void spin_dump(raw_spinlock_t *lock, const char
>> *msg)
>>                 owner ? owner->comm : "<none>",
>>                 owner ? task_pid_nr(owner) : -1,
>>                 lock->owner_cpu);
>> +       BUG_ON(CORRUPTED_DATA_STRUCTURE);
>>         dump_stack();
>>  }
>>
>> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
>> index ef071ca73fc3..ea0132b55eca 100644
>> --- a/kernel/workqueue.c
>> +++ b/kernel/workqueue.c
>> @@ -48,6 +48,7 @@
>>  #include <linux/nodemask.h>
>>  #include <linux/moduleparam.h>
>>  #include <linux/uaccess.h>
>> +#include <linux/bug.h>
>>
>>  #include "workqueue_internal.h"
>>
>> @@ -2108,6 +2109,7 @@ __acquires(&pool->lock)
>>                        current->comm, preempt_count(),
>> task_pid_nr(current),
>>                        worker->current_func);
>>                 debug_show_held_locks(current);
>> +               BUG_ON(CORRUPTED_DATA_STRUCTURE);
>>                 dump_stack();
>>         }
>>
>> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
>> index 2307d7c89dac..d64bd6b6fd45 100644
>> --- a/lib/Kconfig.debug
>> +++ b/lib/Kconfig.debug
>> @@ -1987,6 +1987,16 @@ config TEST_STATIC_KEYS
>>
>>           If unsure, say N.
>>
>> +config BUG_ON_CORRUPTION
>> +       bool "Trigger a BUG when data corruption is detected"
>> +       help
>> +         Select this option if the kernel should BUG when it encounters
>> +         data corruption in various kernel memory structures during
>> checks
>> +         added by options like CONFIG_DEBUG_LIST, CONFIG_DEBUG_SPINLOCK,
>> +         etc.
>> +
>> +         If unsure, say N.
>> +
>>  source "samples/Kconfig"
>>
>>  source "lib/Kconfig.kgdb"
>> diff --git a/lib/list_debug.c b/lib/list_debug.c
>> index 80e2e40a6a4e..161c7e7d3478 100644
>> --- a/lib/list_debug.c
>> +++ b/lib/list_debug.c
>> @@ -26,16 +26,19 @@ bool __list_add_debug(struct list_head *new,
>>         if (unlikely(next->prev != prev)) {
>>                 WARN(1, "list_add corruption. next->prev should be prev
>> (%p), but was %p. (next=%p).\n",
>>                         prev, next->prev, next);
>> +               BUG_ON(CORRUPTED_DATA_STRUCTURE);
>>                 return false;
>>         }
>>         if (unlikely(prev->next != next)) {
>>                 WARN(1, "list_add corruption. prev->next should be next
>> (%p), but was %p. (prev=%p).\n",
>>                         next, prev->next, prev);
>> +               BUG_ON(CORRUPTED_DATA_STRUCTURE);
>>                 return false;
>>         }
>>         if (unlikely(new == prev || new == next)) {
>>                 WARN(1, "list_add double add: new=%p, prev=%p,
>> next=%p.\n",
>>                         new, prev, next);
>> +               BUG_ON(CORRUPTED_DATA_STRUCTURE);
>>                 return false;
>>         }
>>         return true;
>> @@ -52,21 +55,25 @@ bool __list_del_entry_debug(struct list_head *entry)
>>         if (unlikely(next == LIST_POISON1)) {
>>                 WARN(1, "list_del corruption, %p->next is LIST_POISON1
>> (%p)\n",
>>                         entry, LIST_POISON1);
>> +               BUG_ON(CORRUPTED_DATA_STRUCTURE);
>>                 return false;
>>         }
>>         if (unlikely(prev == LIST_POISON2)) {
>>                 WARN(1, "list_del corruption, %p->prev is LIST_POISON2
>> (%p)\n",
>>                         entry, LIST_POISON2);
>> +               BUG_ON(CORRUPTED_DATA_STRUCTURE);
>>                 return false;
>>         }
>>         if (unlikely(prev->next != entry)) {
>>                 WARN(1, "list_del corruption. prev->next should be %p, but
>> was %p\n",
>>                         entry, prev->next);
>> +               BUG_ON(CORRUPTED_DATA_STRUCTURE);
>>                 return false;
>>         }
>>         if (unlikely(next->prev != entry)) {
>>                 WARN(1, "list_del corruption. next->prev should be %p, but
>> was %p\n",
>>                         entry, next->prev);
>> +               BUG_ON(CORRUPTED_DATA_STRUCTURE);
>>                 return false;
>>         }
>>         return true;
>>
>
> The git history shows 924d9addb9b1 ("list debugging: use
> WARN() instead of BUG()") deliberately switched this from BUG to WARN.
> Do we still need the WARN at all or can we just switch to BUG permanently?

Hah. Yeah, okay, so that explains the history of this a little more.
Looks like when this was converted to WARN, the "stop execution flow"
part of that was not retained (which is what PaX/Grsecurity added
back). Regardless, it certainly supports having a CONFIG for "should
this WARN and abort, or BUG?"

-Kees

-- 
Kees Cook
Nexus Security

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/5] bug: Provide toggle for BUG on data corruption
  2016-08-16 21:57         ` Steven Rostedt
@ 2016-08-16 23:14           ` Kees Cook
  0 siblings, 0 replies; 18+ messages in thread
From: Kees Cook @ 2016-08-16 23:14 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Paul McKenney, Stephen Boyd, Daniel Micay, Arnd Bergmann,
	Greg Kroah-Hartman, Josh Triplett, Mathieu Desnoyers,
	Lai Jiangshan, Peter Zijlstra, Ingo Molnar, Tejun Heo,
	Michael Ellerman, Aneesh Kumar K.V, Kirill A. Shutemov,
	Andrew Morton, Dan Williams, Jan Kara, Josef Bacik,
	Thomas Gleixner, Andrey Ryabinin, Nikolay Aleksandrov,
	Dmitry Vyukov, LKML, kernel-hardening, Joe Perches

On Tue, Aug 16, 2016 at 2:57 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Tue, 16 Aug 2016 17:53:54 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
>
>
>>              WARN(1, "list_del corruption. next->prev should be %p, but was %p\n",
>>                      entry, next->prev);
>>              BUG_ON(CORRUPTED_DATA_STRUCTURE);
>>
>> Will always warn (as stated by "1") and and the BUG_ON() will bug if
>> CORRUPTED_DATA_STRUCTURE is set. Although, I don't like that name. Can
>> we have a:
>>
>>  BUG_ON(BUG_ON_CORRUPTED_DATA_STRUCTURES);
>>
>> Or maybe have that as a macro:
>>
>> #ifdef CONFIG_BUG_ON_CORRUPTION
>> # define BUG_ON_CORRUPTED_DATA_STRUCTURE() BUG_ON(1)
>> #else
>> # define BUG_ON_CORRUPTED_DATA_STRUCTURE() do {} while (0)
>> #endif
>>
>> Then we can have:
>>
>>              WARN(1, "list_del corruption. next->prev should be %p, but was %p\n",
>>                      entry, next->prev);
>>              BUG_ON_CORRUPTED_DATA_STRUCTURE();
>>
>> ??
>>
>
> Hmm, maybe better yet, just have it called "CORRUPTED_DATA_STRUCTURE()"
> because it wont bug if the config is not set, and having "BUG_ON" in
> the name, it might be somewhat confusing.

Yeah, I'm trying to redesign this now, since one thing I think is
important to build into the new macro is the concept of _stopping_
execution. i.e. even if you don't want to BUG, you really don't want
to operate on the busted data structure. This protection was precisely
what went missing with commit 924d9addb9b1.

-Kees

-- 
Kees Cook
Nexus Security

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/5] bug: Provide toggle for BUG on data corruption
  2016-08-16 21:42     ` Kees Cook
  2016-08-16 21:53       ` Steven Rostedt
@ 2016-08-17  0:01       ` Paul E. McKenney
  2016-08-17  0:09         ` Kees Cook
  1 sibling, 1 reply; 18+ messages in thread
From: Paul E. McKenney @ 2016-08-17  0:01 UTC (permalink / raw)
  To: Kees Cook
  Cc: Stephen Boyd, Daniel Micay, Arnd Bergmann, Greg Kroah-Hartman,
	Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	Peter Zijlstra, Ingo Molnar, Tejun Heo, Michael Ellerman,
	Aneesh Kumar K.V, Kirill A. Shutemov, Andrew Morton,
	Dan Williams, Jan Kara, Josef Bacik, Thomas Gleixner,
	Andrey Ryabinin, Nikolay Aleksandrov, Dmitry Vyukov, LKML,
	kernel-hardening, Joe Perches

On Tue, Aug 16, 2016 at 02:42:28PM -0700, Kees Cook wrote:
> On Tue, Aug 16, 2016 at 2:26 PM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > On Tue, Aug 16, 2016 at 02:11:04PM -0700, Kees Cook wrote:
> >> The kernel checks for several cases of data structure corruption under
> >> either normal runtime, or under various CONFIG_DEBUG_* settings. When
> >> corruption is detected, some systems may want to BUG() immediately instead
> >> of letting the corruption continue. Many of these manipulation primitives
> >> can be used by security flaws to gain arbitrary memory write control. This
> >> provides CONFIG_BUG_ON_CORRUPTION to control newly added BUG() locations.
> >>
> >> This is inspired by similar hardening in PaX and Grsecurity, and by
> >> Stephen Boyd in MSM kernels.
> >>
> >> Signed-off-by: Kees Cook <keescook@chromium.org>
> >
> > OK, I will bite...  Why both the WARN() and the BUG_ON()?
> 
> Mostly because not every case of BUG(CORRUPTED_DATA_STRUCTURE) is
> cleanly paired with a WARN (see the workqueue addition that wants to
> dump locks too). I could rearrange things a bit, though, and create
> something like:
> 
> #ifdef CONFIG_BUG_ON_CORRUPTION
> # define CORRUPTED(format...) { \
>     printk(KERN_ERR format); \
>     BUG(); \
> }
> #else
> # define CORRUPTED(format...) WARN(1, format)
> #endif
> 
> What do you think?

First let me see if I understand the rationale...  The idea is to allow
those in security-irrelevant environments, such as test systems, to
have the old "complain but soldier on" semantics, while security-conscious
systems just panic, thereby hopefully converting a more dangerous form
of attack into a denial-of-service attack.

An alternative approach would be to make WARN() panic on systems built
with CONFIG_BUG_ON_CORRUPTION, but we might instead want to choose which
warnings are fatal on security-conscious systems.

Or am I missing the point?

At a more detailed level, one could argue for something like this:

#define CORRUPTED(format...) \
do { \
	if (IS_ENABLED(CONFIG_BUG_ON_CORRUPTION)) { \
		printk(KERN_ERR format); \
		BUG(); \
	} else { \
		WARN(1, format); \
	} \
} while (0)

Some might prefer #ifdef to IS_ENABLED(), but the bare {} needs to
be do-while in any case.

							Thanx, Paul

> -Kees
> 
> >
> >                                                                 Thanx, Paul
> >
> >> ---
> >>  include/linux/bug.h             |  7 +++++++
> >>  kernel/locking/spinlock_debug.c |  1 +
> >>  kernel/workqueue.c              |  2 ++
> >>  lib/Kconfig.debug               | 10 ++++++++++
> >>  lib/list_debug.c                |  7 +++++++
> >>  5 files changed, 27 insertions(+)
> >>
> >> diff --git a/include/linux/bug.h b/include/linux/bug.h
> >> index e51b0709e78d..7e69758dd798 100644
> >> --- a/include/linux/bug.h
> >> +++ b/include/linux/bug.h
> >> @@ -118,4 +118,11 @@ static inline enum bug_trap_type report_bug(unsigned long bug_addr,
> >>  }
> >>
> >>  #endif       /* CONFIG_GENERIC_BUG */
> >> +
> >> +#ifdef CONFIG_BUG_ON_CORRUPTION
> >> +# define CORRUPTED_DATA_STRUCTURE    true
> >> +#else
> >> +# define CORRUPTED_DATA_STRUCTURE    false
> >> +#endif
> >> +
> >>  #endif       /* _LINUX_BUG_H */
> >> diff --git a/kernel/locking/spinlock_debug.c b/kernel/locking/spinlock_debug.c
> >> index 0374a596cffa..d5f833769feb 100644
> >> --- a/kernel/locking/spinlock_debug.c
> >> +++ b/kernel/locking/spinlock_debug.c
> >> @@ -64,6 +64,7 @@ static void spin_dump(raw_spinlock_t *lock, const char *msg)
> >>               owner ? owner->comm : "<none>",
> >>               owner ? task_pid_nr(owner) : -1,
> >>               lock->owner_cpu);
> >> +     BUG_ON(CORRUPTED_DATA_STRUCTURE);
> >>       dump_stack();
> >>  }
> >>
> >> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> >> index ef071ca73fc3..ea0132b55eca 100644
> >> --- a/kernel/workqueue.c
> >> +++ b/kernel/workqueue.c
> >> @@ -48,6 +48,7 @@
> >>  #include <linux/nodemask.h>
> >>  #include <linux/moduleparam.h>
> >>  #include <linux/uaccess.h>
> >> +#include <linux/bug.h>
> >>
> >>  #include "workqueue_internal.h"
> >>
> >> @@ -2108,6 +2109,7 @@ __acquires(&pool->lock)
> >>                      current->comm, preempt_count(), task_pid_nr(current),
> >>                      worker->current_func);
> >>               debug_show_held_locks(current);
> >> +             BUG_ON(CORRUPTED_DATA_STRUCTURE);
> >>               dump_stack();
> >>       }
> >>
> >> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> >> index 2307d7c89dac..d64bd6b6fd45 100644
> >> --- a/lib/Kconfig.debug
> >> +++ b/lib/Kconfig.debug
> >> @@ -1987,6 +1987,16 @@ config TEST_STATIC_KEYS
> >>
> >>         If unsure, say N.
> >>
> >> +config BUG_ON_CORRUPTION
> >> +     bool "Trigger a BUG when data corruption is detected"
> >> +     help
> >> +       Select this option if the kernel should BUG when it encounters
> >> +       data corruption in various kernel memory structures during checks
> >> +       added by options like CONFIG_DEBUG_LIST, CONFIG_DEBUG_SPINLOCK,
> >> +       etc.
> >> +
> >> +       If unsure, say N.
> >> +
> >>  source "samples/Kconfig"
> >>
> >>  source "lib/Kconfig.kgdb"
> >> diff --git a/lib/list_debug.c b/lib/list_debug.c
> >> index 80e2e40a6a4e..161c7e7d3478 100644
> >> --- a/lib/list_debug.c
> >> +++ b/lib/list_debug.c
> >> @@ -26,16 +26,19 @@ bool __list_add_debug(struct list_head *new,
> >>       if (unlikely(next->prev != prev)) {
> >>               WARN(1, "list_add corruption. next->prev should be prev (%p), but was %p. (next=%p).\n",
> >>                       prev, next->prev, next);
> >> +             BUG_ON(CORRUPTED_DATA_STRUCTURE);
> >>               return false;
> >>       }
> >>       if (unlikely(prev->next != next)) {
> >>               WARN(1, "list_add corruption. prev->next should be next (%p), but was %p. (prev=%p).\n",
> >>                       next, prev->next, prev);
> >> +             BUG_ON(CORRUPTED_DATA_STRUCTURE);
> >>               return false;
> >>       }
> >>       if (unlikely(new == prev || new == next)) {
> >>               WARN(1, "list_add double add: new=%p, prev=%p, next=%p.\n",
> >>                       new, prev, next);
> >> +             BUG_ON(CORRUPTED_DATA_STRUCTURE);
> >>               return false;
> >>       }
> >>       return true;
> >> @@ -52,21 +55,25 @@ bool __list_del_entry_debug(struct list_head *entry)
> >>       if (unlikely(next == LIST_POISON1)) {
> >>               WARN(1, "list_del corruption, %p->next is LIST_POISON1 (%p)\n",
> >>                       entry, LIST_POISON1);
> >> +             BUG_ON(CORRUPTED_DATA_STRUCTURE);
> >>               return false;
> >>       }
> >>       if (unlikely(prev == LIST_POISON2)) {
> >>               WARN(1, "list_del corruption, %p->prev is LIST_POISON2 (%p)\n",
> >>                       entry, LIST_POISON2);
> >> +             BUG_ON(CORRUPTED_DATA_STRUCTURE);
> >>               return false;
> >>       }
> >>       if (unlikely(prev->next != entry)) {
> >>               WARN(1, "list_del corruption. prev->next should be %p, but was %p\n",
> >>                       entry, prev->next);
> >> +             BUG_ON(CORRUPTED_DATA_STRUCTURE);
> >>               return false;
> >>       }
> >>       if (unlikely(next->prev != entry)) {
> >>               WARN(1, "list_del corruption. next->prev should be %p, but was %p\n",
> >>                       entry, next->prev);
> >> +             BUG_ON(CORRUPTED_DATA_STRUCTURE);
> >>               return false;
> >>       }
> >>       return true;
> >> --
> >> 2.7.4
> >>
> >
> 
> 
> 
> -- 
> Kees Cook
> Nexus Security
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/5] bug: Provide toggle for BUG on data corruption
  2016-08-17  0:01       ` Paul E. McKenney
@ 2016-08-17  0:09         ` Kees Cook
  2016-08-17 16:09           ` Paul E. McKenney
  0 siblings, 1 reply; 18+ messages in thread
From: Kees Cook @ 2016-08-17  0:09 UTC (permalink / raw)
  To: Paul McKenney
  Cc: Stephen Boyd, Daniel Micay, Arnd Bergmann, Greg Kroah-Hartman,
	Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	Peter Zijlstra, Ingo Molnar, Tejun Heo, Michael Ellerman,
	Aneesh Kumar K.V, Kirill A. Shutemov, Andrew Morton,
	Dan Williams, Jan Kara, Josef Bacik, Thomas Gleixner,
	Andrey Ryabinin, Nikolay Aleksandrov, Dmitry Vyukov, LKML,
	kernel-hardening, Joe Perches

On Tue, Aug 16, 2016 at 5:01 PM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> On Tue, Aug 16, 2016 at 02:42:28PM -0700, Kees Cook wrote:
>> On Tue, Aug 16, 2016 at 2:26 PM, Paul E. McKenney
>> <paulmck@linux.vnet.ibm.com> wrote:
>> > On Tue, Aug 16, 2016 at 02:11:04PM -0700, Kees Cook wrote:
>> >> The kernel checks for several cases of data structure corruption under
>> >> either normal runtime, or under various CONFIG_DEBUG_* settings. When
>> >> corruption is detected, some systems may want to BUG() immediately instead
>> >> of letting the corruption continue. Many of these manipulation primitives
>> >> can be used by security flaws to gain arbitrary memory write control. This
>> >> provides CONFIG_BUG_ON_CORRUPTION to control newly added BUG() locations.
>> >>
>> >> This is inspired by similar hardening in PaX and Grsecurity, and by
>> >> Stephen Boyd in MSM kernels.
>> >>
>> >> Signed-off-by: Kees Cook <keescook@chromium.org>
>> >
>> > OK, I will bite...  Why both the WARN() and the BUG_ON()?
>>
>> Mostly because not every case of BUG(CORRUPTED_DATA_STRUCTURE) is
>> cleanly paired with a WARN (see the workqueue addition that wants to
>> dump locks too). I could rearrange things a bit, though, and create
>> something like:
>>
>> #ifdef CONFIG_BUG_ON_CORRUPTION
>> # define CORRUPTED(format...) { \
>>     printk(KERN_ERR format); \
>>     BUG(); \
>> }
>> #else
>> # define CORRUPTED(format...) WARN(1, format)
>> #endif
>>
>> What do you think?
>
> First let me see if I understand the rationale...  The idea is to allow
> those in security-irrelevant environments, such as test systems, to
> have the old "complain but soldier on" semantics, while security-conscious
> systems just panic, thereby hopefully converting a more dangerous form
> of attack into a denial-of-service attack.

Right, we don't want to wholesale upgrade all WARNs to BUGs. Just any
security-sensitive conditionals. And based on Laura's feedback, this
is really just about CONFIG_DEBUG_LIST now. We'll likely find some
more to add this to, but for the moment, we'll focus on list.

Here are the rationales as I see it:

- if you've enabled CONFIG_DEBUG_LIST
  - you want to get a report of the corruption
  - you want the kernel to _not operate on the structure_ (this went
missing when s/BUG/WARN/)
- if you've enabled CONFIG_BUG_ON_DATA_CORRUPTION
  - everything from CONFIG_DEBUG_LIST
  - you want the offending process to go away (i.e. BUG instead of WARN)
  - you may want the entire system to dump if you've set the
panic_on_oops sysctl (i.e. BUG)

> An alternative approach would be to make WARN() panic on systems built
> with CONFIG_BUG_ON_CORRUPTION, but we might instead want to choose which
> warnings are fatal on security-conscious systems.
>
> Or am I missing the point?
>
> At a more detailed level, one could argue for something like this:
>
> #define CORRUPTED(format...) \
> do { \
>         if (IS_ENABLED(CONFIG_BUG_ON_CORRUPTION)) { \
>                 printk(KERN_ERR format); \
>                 BUG(); \
>         } else { \
>                 WARN(1, format); \
>         } \
> } while (0)
>
> Some might prefer #ifdef to IS_ENABLED(), but the bare {} needs to
> be do-while in any case.

Yup, this is almost exactly what I've got in the v2. I wanted to
enforce a control-flow side-effect, though, so I've included a
non-optional "return false" as well.

-Kees

-- 
Kees Cook
Nexus Security

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/5] bug: Provide toggle for BUG on data corruption
  2016-08-17  0:09         ` Kees Cook
@ 2016-08-17 16:09           ` Paul E. McKenney
  2016-08-17 16:14             ` Kees Cook
  0 siblings, 1 reply; 18+ messages in thread
From: Paul E. McKenney @ 2016-08-17 16:09 UTC (permalink / raw)
  To: Kees Cook
  Cc: Stephen Boyd, Daniel Micay, Arnd Bergmann, Greg Kroah-Hartman,
	Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	Peter Zijlstra, Ingo Molnar, Tejun Heo, Michael Ellerman,
	Aneesh Kumar K.V, Kirill A. Shutemov, Andrew Morton,
	Dan Williams, Jan Kara, Josef Bacik, Thomas Gleixner,
	Andrey Ryabinin, Nikolay Aleksandrov, Dmitry Vyukov, LKML,
	kernel-hardening, Joe Perches

On Tue, Aug 16, 2016 at 05:09:53PM -0700, Kees Cook wrote:
> On Tue, Aug 16, 2016 at 5:01 PM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > On Tue, Aug 16, 2016 at 02:42:28PM -0700, Kees Cook wrote:
> >> On Tue, Aug 16, 2016 at 2:26 PM, Paul E. McKenney
> >> <paulmck@linux.vnet.ibm.com> wrote:
> >> > On Tue, Aug 16, 2016 at 02:11:04PM -0700, Kees Cook wrote:
> >> >> The kernel checks for several cases of data structure corruption under
> >> >> either normal runtime, or under various CONFIG_DEBUG_* settings. When
> >> >> corruption is detected, some systems may want to BUG() immediately instead
> >> >> of letting the corruption continue. Many of these manipulation primitives
> >> >> can be used by security flaws to gain arbitrary memory write control. This
> >> >> provides CONFIG_BUG_ON_CORRUPTION to control newly added BUG() locations.
> >> >>
> >> >> This is inspired by similar hardening in PaX and Grsecurity, and by
> >> >> Stephen Boyd in MSM kernels.
> >> >>
> >> >> Signed-off-by: Kees Cook <keescook@chromium.org>
> >> >
> >> > OK, I will bite...  Why both the WARN() and the BUG_ON()?
> >>
> >> Mostly because not every case of BUG(CORRUPTED_DATA_STRUCTURE) is
> >> cleanly paired with a WARN (see the workqueue addition that wants to
> >> dump locks too). I could rearrange things a bit, though, and create
> >> something like:
> >>
> >> #ifdef CONFIG_BUG_ON_CORRUPTION
> >> # define CORRUPTED(format...) { \
> >>     printk(KERN_ERR format); \
> >>     BUG(); \
> >> }
> >> #else
> >> # define CORRUPTED(format...) WARN(1, format)
> >> #endif
> >>
> >> What do you think?
> >
> > First let me see if I understand the rationale...  The idea is to allow
> > those in security-irrelevant environments, such as test systems, to
> > have the old "complain but soldier on" semantics, while security-conscious
> > systems just panic, thereby hopefully converting a more dangerous form
> > of attack into a denial-of-service attack.
> 
> Right, we don't want to wholesale upgrade all WARNs to BUGs. Just any
> security-sensitive conditionals. And based on Laura's feedback, this
> is really just about CONFIG_DEBUG_LIST now. We'll likely find some
> more to add this to, but for the moment, we'll focus on list.
> 
> Here are the rationales as I see it:
> 
> - if you've enabled CONFIG_DEBUG_LIST
>   - you want to get a report of the corruption
>   - you want the kernel to _not operate on the structure_ (this went
> missing when s/BUG/WARN/)
> - if you've enabled CONFIG_BUG_ON_DATA_CORRUPTION
>   - everything from CONFIG_DEBUG_LIST
>   - you want the offending process to go away (i.e. BUG instead of WARN)
>   - you may want the entire system to dump if you've set the
> panic_on_oops sysctl (i.e. BUG)

OK, this looks good to me.

Just to be clear, if you've enabled neither CONFIG_DEBUG_LIST nor
CONFIG_BUG_ON_DATA_CORRUPTION, then you get better performance, but are
taking responsibility for using some other means of shielding your system
from attack, correct?  (I believe that we do need to give the user this
choice, just checking your intent.)

> > An alternative approach would be to make WARN() panic on systems built
> > with CONFIG_BUG_ON_CORRUPTION, but we might instead want to choose which
> > warnings are fatal on security-conscious systems.
> >
> > Or am I missing the point?
> >
> > At a more detailed level, one could argue for something like this:
> >
> > #define CORRUPTED(format...) \
> > do { \
> >         if (IS_ENABLED(CONFIG_BUG_ON_CORRUPTION)) { \
> >                 printk(KERN_ERR format); \
> >                 BUG(); \
> >         } else { \
> >                 WARN(1, format); \
> >         } \
> > } while (0)
> >
> > Some might prefer #ifdef to IS_ENABLED(), but the bare {} needs to
> > be do-while in any case.
> 
> Yup, this is almost exactly what I've got in the v2. I wanted to
> enforce a control-flow side-effect, though, so I've included a
> non-optional "return false" as well.

Sounds good!  And yes, pulling the condition in makes a lot of sense
to me as well.  Looking forward to seeing v3.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/5] bug: Provide toggle for BUG on data corruption
  2016-08-17 16:09           ` Paul E. McKenney
@ 2016-08-17 16:14             ` Kees Cook
  2016-08-17 16:32               ` Paul E. McKenney
  0 siblings, 1 reply; 18+ messages in thread
From: Kees Cook @ 2016-08-17 16:14 UTC (permalink / raw)
  To: Paul McKenney
  Cc: Stephen Boyd, Daniel Micay, Arnd Bergmann, Greg Kroah-Hartman,
	Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	Peter Zijlstra, Ingo Molnar, Tejun Heo, Michael Ellerman,
	Aneesh Kumar K.V, Kirill A. Shutemov, Andrew Morton,
	Dan Williams, Jan Kara, Josef Bacik, Thomas Gleixner,
	Andrey Ryabinin, Nikolay Aleksandrov, Dmitry Vyukov, LKML,
	kernel-hardening, Joe Perches

On Wed, Aug 17, 2016 at 9:09 AM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> On Tue, Aug 16, 2016 at 05:09:53PM -0700, Kees Cook wrote:
>> On Tue, Aug 16, 2016 at 5:01 PM, Paul E. McKenney
>> <paulmck@linux.vnet.ibm.com> wrote:
>> > On Tue, Aug 16, 2016 at 02:42:28PM -0700, Kees Cook wrote:
>> >> On Tue, Aug 16, 2016 at 2:26 PM, Paul E. McKenney
>> >> <paulmck@linux.vnet.ibm.com> wrote:
>> >> > On Tue, Aug 16, 2016 at 02:11:04PM -0700, Kees Cook wrote:
>> >> >> The kernel checks for several cases of data structure corruption under
>> >> >> either normal runtime, or under various CONFIG_DEBUG_* settings. When
>> >> >> corruption is detected, some systems may want to BUG() immediately instead
>> >> >> of letting the corruption continue. Many of these manipulation primitives
>> >> >> can be used by security flaws to gain arbitrary memory write control. This
>> >> >> provides CONFIG_BUG_ON_CORRUPTION to control newly added BUG() locations.
>> >> >>
>> >> >> This is inspired by similar hardening in PaX and Grsecurity, and by
>> >> >> Stephen Boyd in MSM kernels.
>> >> >>
>> >> >> Signed-off-by: Kees Cook <keescook@chromium.org>
>> >> >
>> >> > OK, I will bite...  Why both the WARN() and the BUG_ON()?
>> >>
>> >> Mostly because not every case of BUG(CORRUPTED_DATA_STRUCTURE) is
>> >> cleanly paired with a WARN (see the workqueue addition that wants to
>> >> dump locks too). I could rearrange things a bit, though, and create
>> >> something like:
>> >>
>> >> #ifdef CONFIG_BUG_ON_CORRUPTION
>> >> # define CORRUPTED(format...) { \
>> >>     printk(KERN_ERR format); \
>> >>     BUG(); \
>> >> }
>> >> #else
>> >> # define CORRUPTED(format...) WARN(1, format)
>> >> #endif
>> >>
>> >> What do you think?
>> >
>> > First let me see if I understand the rationale...  The idea is to allow
>> > those in security-irrelevant environments, such as test systems, to
>> > have the old "complain but soldier on" semantics, while security-conscious
>> > systems just panic, thereby hopefully converting a more dangerous form
>> > of attack into a denial-of-service attack.
>>
>> Right, we don't want to wholesale upgrade all WARNs to BUGs. Just any
>> security-sensitive conditionals. And based on Laura's feedback, this
>> is really just about CONFIG_DEBUG_LIST now. We'll likely find some
>> more to add this to, but for the moment, we'll focus on list.
>>
>> Here are the rationales as I see it:
>>
>> - if you've enabled CONFIG_DEBUG_LIST
>>   - you want to get a report of the corruption
>>   - you want the kernel to _not operate on the structure_ (this went
>> missing when s/BUG/WARN/)
>> - if you've enabled CONFIG_BUG_ON_DATA_CORRUPTION
>>   - everything from CONFIG_DEBUG_LIST
>>   - you want the offending process to go away (i.e. BUG instead of WARN)
>>   - you may want the entire system to dump if you've set the
>> panic_on_oops sysctl (i.e. BUG)
>
> OK, this looks good to me.
>
> Just to be clear, if you've enabled neither CONFIG_DEBUG_LIST nor
> CONFIG_BUG_ON_DATA_CORRUPTION, then you get better performance, but are
> taking responsibility for using some other means of shielding your system
> from attack, correct?  (I believe that we do need to give the user this
> choice, just checking your intent.)

That's correct. (And in the future I intend to prove that the
performance impact is comically small and that DEBUG_LIST should be
yes-by-default, but that'll be a whole separate issue.)

>> > An alternative approach would be to make WARN() panic on systems built
>> > with CONFIG_BUG_ON_CORRUPTION, but we might instead want to choose which
>> > warnings are fatal on security-conscious systems.
>> >
>> > Or am I missing the point?
>> >
>> > At a more detailed level, one could argue for something like this:
>> >
>> > #define CORRUPTED(format...) \
>> > do { \
>> >         if (IS_ENABLED(CONFIG_BUG_ON_CORRUPTION)) { \
>> >                 printk(KERN_ERR format); \
>> >                 BUG(); \
>> >         } else { \
>> >                 WARN(1, format); \
>> >         } \
>> > } while (0)
>> >
>> > Some might prefer #ifdef to IS_ENABLED(), but the bare {} needs to
>> > be do-while in any case.
>>
>> Yup, this is almost exactly what I've got in the v2. I wanted to
>> enforce a control-flow side-effect, though, so I've included a
>> non-optional "return false" as well.
>
> Sounds good!  And yes, pulling the condition in makes a lot of sense
> to me as well.  Looking forward to seeing v3.

Great, thanks!

-Kees

-- 
Kees Cook
Nexus Security

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/5] bug: Provide toggle for BUG on data corruption
  2016-08-17 16:14             ` Kees Cook
@ 2016-08-17 16:32               ` Paul E. McKenney
  0 siblings, 0 replies; 18+ messages in thread
From: Paul E. McKenney @ 2016-08-17 16:32 UTC (permalink / raw)
  To: Kees Cook
  Cc: Stephen Boyd, Daniel Micay, Arnd Bergmann, Greg Kroah-Hartman,
	Josh Triplett, Steven Rostedt, Mathieu Desnoyers, Lai Jiangshan,
	Peter Zijlstra, Ingo Molnar, Tejun Heo, Michael Ellerman,
	Aneesh Kumar K.V, Kirill A. Shutemov, Andrew Morton,
	Dan Williams, Jan Kara, Josef Bacik, Thomas Gleixner,
	Andrey Ryabinin, Nikolay Aleksandrov, Dmitry Vyukov, LKML,
	kernel-hardening, Joe Perches

On Wed, Aug 17, 2016 at 09:14:59AM -0700, Kees Cook wrote:
> On Wed, Aug 17, 2016 at 9:09 AM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > On Tue, Aug 16, 2016 at 05:09:53PM -0700, Kees Cook wrote:
> >> On Tue, Aug 16, 2016 at 5:01 PM, Paul E. McKenney
> >> <paulmck@linux.vnet.ibm.com> wrote:
> >> > On Tue, Aug 16, 2016 at 02:42:28PM -0700, Kees Cook wrote:
> >> >> On Tue, Aug 16, 2016 at 2:26 PM, Paul E. McKenney
> >> >> <paulmck@linux.vnet.ibm.com> wrote:
> >> >> > On Tue, Aug 16, 2016 at 02:11:04PM -0700, Kees Cook wrote:
> >> >> >> The kernel checks for several cases of data structure corruption under
> >> >> >> either normal runtime, or under various CONFIG_DEBUG_* settings. When
> >> >> >> corruption is detected, some systems may want to BUG() immediately instead
> >> >> >> of letting the corruption continue. Many of these manipulation primitives
> >> >> >> can be used by security flaws to gain arbitrary memory write control. This
> >> >> >> provides CONFIG_BUG_ON_CORRUPTION to control newly added BUG() locations.
> >> >> >>
> >> >> >> This is inspired by similar hardening in PaX and Grsecurity, and by
> >> >> >> Stephen Boyd in MSM kernels.
> >> >> >>
> >> >> >> Signed-off-by: Kees Cook <keescook@chromium.org>
> >> >> >
> >> >> > OK, I will bite...  Why both the WARN() and the BUG_ON()?
> >> >>
> >> >> Mostly because not every case of BUG(CORRUPTED_DATA_STRUCTURE) is
> >> >> cleanly paired with a WARN (see the workqueue addition that wants to
> >> >> dump locks too). I could rearrange things a bit, though, and create
> >> >> something like:
> >> >>
> >> >> #ifdef CONFIG_BUG_ON_CORRUPTION
> >> >> # define CORRUPTED(format...) { \
> >> >>     printk(KERN_ERR format); \
> >> >>     BUG(); \
> >> >> }
> >> >> #else
> >> >> # define CORRUPTED(format...) WARN(1, format)
> >> >> #endif
> >> >>
> >> >> What do you think?
> >> >
> >> > First let me see if I understand the rationale...  The idea is to allow
> >> > those in security-irrelevant environments, such as test systems, to
> >> > have the old "complain but soldier on" semantics, while security-conscious
> >> > systems just panic, thereby hopefully converting a more dangerous form
> >> > of attack into a denial-of-service attack.
> >>
> >> Right, we don't want to wholesale upgrade all WARNs to BUGs. Just any
> >> security-sensitive conditionals. And based on Laura's feedback, this
> >> is really just about CONFIG_DEBUG_LIST now. We'll likely find some
> >> more to add this to, but for the moment, we'll focus on list.
> >>
> >> Here are the rationales as I see it:
> >>
> >> - if you've enabled CONFIG_DEBUG_LIST
> >>   - you want to get a report of the corruption
> >>   - you want the kernel to _not operate on the structure_ (this went
> >> missing when s/BUG/WARN/)
> >> - if you've enabled CONFIG_BUG_ON_DATA_CORRUPTION
> >>   - everything from CONFIG_DEBUG_LIST
> >>   - you want the offending process to go away (i.e. BUG instead of WARN)
> >>   - you may want the entire system to dump if you've set the
> >> panic_on_oops sysctl (i.e. BUG)
> >
> > OK, this looks good to me.
> >
> > Just to be clear, if you've enabled neither CONFIG_DEBUG_LIST nor
> > CONFIG_BUG_ON_DATA_CORRUPTION, then you get better performance, but are
> > taking responsibility for using some other means of shielding your system
> > from attack, correct?  (I believe that we do need to give the user this
> > choice, just checking your intent.)
> 
> That's correct. (And in the future I intend to prove that the
> performance impact is comically small and that DEBUG_LIST should be
> yes-by-default, but that'll be a whole separate issue.)

And I am sure that will prove to be the case.  But the people looking to
squeeze every last drop of performance from their system will cheerfully
ignore such results on the grounds that eliminating several thousand
such insignificant checks -might- have a useful overall benefit.  And
who knows, they might be right.

							Thanx, Paul

> >> > An alternative approach would be to make WARN() panic on systems built
> >> > with CONFIG_BUG_ON_CORRUPTION, but we might instead want to choose which
> >> > warnings are fatal on security-conscious systems.
> >> >
> >> > Or am I missing the point?
> >> >
> >> > At a more detailed level, one could argue for something like this:
> >> >
> >> > #define CORRUPTED(format...) \
> >> > do { \
> >> >         if (IS_ENABLED(CONFIG_BUG_ON_CORRUPTION)) { \
> >> >                 printk(KERN_ERR format); \
> >> >                 BUG(); \
> >> >         } else { \
> >> >                 WARN(1, format); \
> >> >         } \
> >> > } while (0)
> >> >
> >> > Some might prefer #ifdef to IS_ENABLED(), but the bare {} needs to
> >> > be do-while in any case.
> >>
> >> Yup, this is almost exactly what I've got in the v2. I wanted to
> >> enforce a control-flow side-effect, though, so I've included a
> >> non-optional "return false" as well.
> >
> > Sounds good!  And yes, pulling the condition in makes a lot of sense
> > to me as well.  Looking forward to seeing v3.
> 
> Great, thanks!
> 
> -Kees
> 
> -- 
> Kees Cook
> Nexus Security
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2016-08-17 16:32 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-08-16 21:11 [PATCH 0/5] bug: Provide toggle for BUG on data corruption Kees Cook
2016-08-16 21:11 ` [PATCH 1/5] list: Split list_add() debug checking into separate function Kees Cook
2016-08-16 21:11 ` [PATCH 2/5] rculist: Consolidate DEBUG_LIST for list_add_rcu() Kees Cook
2016-08-16 21:11 ` [PATCH 3/5] list: Split list_del() debug checking into separate function Kees Cook
2016-08-16 21:11 ` [PATCH 4/5] bug: Provide toggle for BUG on data corruption Kees Cook
2016-08-16 21:26   ` Paul E. McKenney
2016-08-16 21:42     ` Kees Cook
2016-08-16 21:53       ` Steven Rostedt
2016-08-16 21:57         ` Steven Rostedt
2016-08-16 23:14           ` Kees Cook
2016-08-17  0:01       ` Paul E. McKenney
2016-08-17  0:09         ` Kees Cook
2016-08-17 16:09           ` Paul E. McKenney
2016-08-17 16:14             ` Kees Cook
2016-08-17 16:32               ` Paul E. McKenney
2016-08-16 21:50   ` Laura Abbott
2016-08-16 23:11     ` Kees Cook
2016-08-16 21:11 ` [PATCH 5/5] lkdtm: Add tests for struct list corruption Kees Cook

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).