linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 00/21] Improve list integrity checking
@ 2020-03-24 15:36 Will Deacon
  2020-03-24 15:36 ` [RFC PATCH 01/21] list: Remove hlist_unhashed_lockless() Will Deacon
                   ` (20 more replies)
  0 siblings, 21 replies; 58+ messages in thread
From: Will Deacon @ 2020-03-24 15:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: Will Deacon, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, kernel-hardening

Hi all,

Linked lists appear extensively across the kernel tree in some form or
another because they are simple to use and offer efficient addition and
removal routines. Unfortunately, they are also an attractive target for
people exploiting memory corruption bugs since not only are linked lists
susceptible to use-after-free (UAF) issues when used with non-counted
references to objects [1], but the APIs can also be abused as powerful
stepping stones towards obtaining arbitrary read/write of kernel memory.
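To see why deletion is such a powerful primitive, the classic "unlink" technique can be sketched in userspace. This is a textbook illustration, not code from the series: if an attacker controls a node's 'prev' and 'next' pointers, unlinking that node performs two attacker-directed writes.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Plain unlink, shaped like the kernel's __list_del() (simplified). */
static void __list_del(struct list_head *prev, struct list_head *next)
{
	next->prev = prev;
	prev->next = next;
}

/*
 * With corrupted 'prev'/'next' pointers, this writes a chosen value
 * to a chosen location twice over: the essence of turning a UAF on a
 * list node into an arbitrary write.
 */
static void unlink_node(struct list_head *node)
{
	__list_del(node->prev, node->next);
}
```

The sanity checks discussed below exist precisely to catch the inconsistent neighbour pointers this relies on before the writes happen.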

A concrete example with an excellent write-up is the exploitation of
CVE-2019-2215 ("Bad Binder"):

https://googleprojectzero.blogspot.com/2019/11/bad-binder-android-in-wild-exploit.html

Some other examples are CVE-2015-3636 ("pingpongroot") and a variant of
CVE-2019-2025 ("waterdrop").

Before you go and throw away all your computers: fear not! It is common for
distributions to enable CONFIG_DEBUG_LIST, which prevents many of these
attacks with very little runtime overhead by sanity checking node pointers
during add() and del() operations.
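The flavour of check CONFIG_DEBUG_LIST performs can be sketched as follows. This is modelled loosely on the kernel's __list_add_valid(); the function name, messages, and return convention here are illustrative rather than the real implementation:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

struct list_head {
	struct list_head *next, *prev;
};

/*
 * Before inserting 'new' between 'prev' and 'next', verify that the
 * neighbours still point at each other and that 'new' isn't one of
 * them (a double add). A corrupted node fails these invariants.
 */
static bool list_add_valid(struct list_head *new,
			   struct list_head *prev,
			   struct list_head *next)
{
	if (next->prev != prev || prev->next != next) {
		fprintf(stderr, "list_add corruption: broken neighbour links\n");
		return false;
	}
	if (new == prev || new == next) {
		fprintf(stderr, "list_add double add\n");
		return false;
	}
	return true;
}
```

Because every field touched here is also read by the insertion itself, the check adds very little overhead on top of the operation it guards.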

Anyway, that's some background. Initially, I just set out to rename
CONFIG_DEBUG_LIST to something less "debuggy", a bit like we did for
CONFIG_DEBUG_{RODATA,MODULE_RONX} in 0f5bf6d0afe4. However, I quickly ran
into some other issues:


  1. The KCSAN work has led to some unusual WRITE_ONCE() additions to
     the list code. These make it difficult to reason about which list
     functions can run concurrently with one another, which in turn makes
     it difficult to write sanity checks without locks.

     Patches 1-9 try to sort this out.

  2. 'hlist' is missing checks, despite being the structure targeted
     by the "pingpongroot" attack mentioned earlier on. The same goes
     for 'hlist_nulls'.

     Patches 11-13 add these checks.

  3. 'list_bl' performs some partial inline checking.

     Patches 16, 18 and 19 extend this checking and move it out-of-line
     to be consistent with the other list checking operations.

  4. The LKDTM list tests are fairly limited.

     Patch 21 adds tests for all of the new checks.
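For reference, the deletion-time invariants that hlist checks need to enforce can be sketched in userspace. The real checks added by this series live out-of-line in lib/list_debug.c and differ in detail; the helper below is only meant to show what "integrity" means for the 'pprev' scheme:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct hlist_node {
	struct hlist_node *next, **pprev;
};

struct hlist_head {
	struct hlist_node *first;
};

static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
	n->next = h->first;
	if (h->first)
		h->first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/*
 * Before unlinking 'n', both neighbours must still point back at it:
 * the predecessor slot (*pprev) must hold 'n', and the successor's
 * pprev must point at 'n->next'. Illustrative only.
 */
static bool hlist_del_valid(struct hlist_node *n)
{
	if (*n->pprev != n)
		return false;		/* predecessor lost track of us */
	if (n->next && n->next->pprev != &n->next)
		return false;		/* successor lost track of us */
	return true;
}
```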


The other patches do the rename and generally tidy things up to keep the
different list implementations consistent with each other. Due to issue (1),
this series is based on tip/locking/kcsan [2].

Jann previously made a comment that renaming the CONFIG options could have
unexpected results for people using an old config. In fact, building the
'oldconfig' target will prompt you about the new options, so I don't think
this is a huge concern. I did try to find a way to leave the old options as
dummy symbols, but it just cluttered the Kconfig (a bit like the approach
taken by NETFILTER_XT_MATCH_CONNMARK).

Cheers,

Will

[1] https://lore.kernel.org/lkml/?q=%28%22KASAN%3A+use-after-free+in+__list_add_valid%22+OR+%22KASAN%3A+use-after-free+in+__list_del_entry_valid%22%29+-s%3A%22Re%3A%22+-s%3A%22%5BPATCH%22
[2] https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/log/?h=locking/kcsan

Cc: Eric Dumazet <edumazet@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jann Horn <jannh@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Maddie Stone <maddiestone@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel-team@android.com
Cc: kernel-hardening@lists.openwall.com 

--->8

Will Deacon (21):
  list: Remove hlist_unhashed_lockless()
  list: Remove hlist_nulls_unhashed_lockless()
  list: Annotate lockless list primitives with data_race()
  timers: Use hlist_unhashed() instead of open-coding in timer_pending()
  list: Comment missing WRITE_ONCE() in __list_del()
  list: Remove superfluous WRITE_ONCE() from hlist_nulls implementation
  Revert "list: Use WRITE_ONCE() when adding to lists and hlists"
  Revert "list: Use WRITE_ONCE() when initializing list_head structures"
  list: Remove unnecessary WRITE_ONCE() from hlist_bl_add_before()
  kernel-hacking: Make DEBUG_{LIST,PLIST,SG,NOTIFIERS} non-debug options
  list: Add integrity checking to hlist implementation
  list: Poison ->next pointer for non-RCU deletion of 'hlist_nulls_node'
  list: Add integrity checking to hlist_nulls implementation
  plist: Use CHECK_DATA_CORRUPTION instead of explicit {BUG,WARN}_ON()
  list_bl: Use CHECK_DATA_CORRUPTION instead of custom BUG_ON() wrapper
  list_bl: Extend integrity checking in deletion routines
  linux/bit_spinlock.h: Include linux/processor.h
  list_bl: Move integrity checking out of line
  list_bl: Extend integrity checking to cover the same cases as 'hlist'
  list: Format CHECK_DATA_CORRUPTION error messages consistently
  lkdtm: Extend list corruption checks

 arch/arm/configs/tegra_defconfig              |   2 +-
 arch/mips/configs/bigsur_defconfig            |   2 +-
 arch/powerpc/configs/ppc6xx_defconfig         |   4 +-
 arch/powerpc/configs/ps3_defconfig            |   2 +-
 arch/powerpc/configs/skiroot_defconfig        |   4 +-
 arch/riscv/configs/defconfig                  |   6 +-
 arch/riscv/configs/rv32_defconfig             |   6 +-
 arch/s390/configs/debug_defconfig             |   4 +-
 arch/sh/configs/polaris_defconfig             |   2 +-
 arch/sh/configs/rsk7203_defconfig             |   4 +-
 arch/sh/configs/se7206_defconfig              |   2 +-
 drivers/gpu/drm/radeon/mkregtable.c           |   2 +-
 drivers/misc/lkdtm/Makefile                   |   1 +
 drivers/misc/lkdtm/bugs.c                     |  68 ---
 drivers/misc/lkdtm/core.c                     |  31 +-
 drivers/misc/lkdtm/list.c                     | 489 ++++++++++++++++++
 drivers/misc/lkdtm/lkdtm.h                    |  33 +-
 drivers/staging/wfx/fwio.c                    |   2 +-
 drivers/staging/wfx/hwio.c                    |   2 +-
 include/linux/bit_spinlock.h                  |   1 +
 include/linux/list.h                          |  95 ++--
 include/linux/list_bl.h                       |  52 +-
 include/linux/list_nulls.h                    |  50 +-
 include/linux/llist.h                         |   2 +-
 include/linux/plist.h                         |   4 +-
 include/linux/rculist.h                       |  80 +--
 include/linux/rculist_bl.h                    |  17 +-
 include/linux/scatterlist.h                   |   6 +-
 include/linux/timer.h                         |   5 +-
 kernel/notifier.c                             |   2 +-
 kernel/time/timer.c                           |   1 +
 lib/Kconfig.debug                             |  24 +-
 lib/Makefile                                  |   2 +-
 lib/list_debug.c                              | 200 ++++++-
 lib/plist.c                                   |  67 ++-
 tools/include/linux/list.h                    |   4 +-
 tools/testing/selftests/lkdtm/tests.txt       |  31 +-
 .../selftests/wireguard/qemu/debug.config     |   6 +-
 tools/virtio/linux/scatterlist.h              |   4 +-
 39 files changed, 1034 insertions(+), 285 deletions(-)
 create mode 100644 drivers/misc/lkdtm/list.c

-- 
2.20.1

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [RFC PATCH 01/21] list: Remove hlist_unhashed_lockless()
  2020-03-24 15:36 [RFC PATCH 00/21] Improve list integrity checking Will Deacon
@ 2020-03-24 15:36 ` Will Deacon
  2020-03-24 16:27   ` Greg KH
  2020-03-30 23:05   ` Paul E. McKenney
  2020-03-24 15:36 ` [RFC PATCH 02/21] list: Remove hlist_nulls_unhashed_lockless() Will Deacon
                   ` (19 subsequent siblings)
  20 siblings, 2 replies; 58+ messages in thread
From: Will Deacon @ 2020-03-24 15:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: Will Deacon, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, kernel-hardening

Commit c54a2744497d ("list: Add hlist_unhashed_lockless()") added a
"lockless" variant of hlist_unhashed() that uses READ_ONCE() to access
the 'pprev' pointer of the 'hlist_node', the intention being that this
could be used by 'timer_pending()' to silence a KCSAN warning. As well
as forgetting to add the caller, the patch also sprinkles
{READ,WRITE}_ONCE() invocations over the standard (i.e. non-RCU) hlist
code, which is undesirable for a number of reasons:

  1. It gives the misleading impression that the non-RCU hlist code is
     in some way lock-free (despite the notable absence of any memory
     barriers!) and silences KCSAN in such cases.

  2. It unnecessarily penalises code generation for non-RCU hlist users.

  3. It makes it difficult to introduce list integrity checks because
     of the possibility of concurrent callers.

Retain the {READ,WRITE}_ONCE() invocations for the RCU hlist code, but
remove them from the non-RCU implementation. Remove the unused
'hlist_unhashed_lockless()' function entirely and add the READ_ONCE()
to hlist_unhashed(), as we already do for hlist_empty().

Cc: Eric Dumazet <edumazet@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 include/linux/list.h | 33 ++++++++++-----------------------
 1 file changed, 10 insertions(+), 23 deletions(-)

diff --git a/include/linux/list.h b/include/linux/list.h
index 884216db3246..4fed5a0f9b77 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -777,19 +777,6 @@ static inline void INIT_HLIST_NODE(struct hlist_node *h)
  * node in unhashed state, but hlist_nulls_del() does not.
  */
 static inline int hlist_unhashed(const struct hlist_node *h)
-{
-	return !h->pprev;
-}
-
-/**
- * hlist_unhashed_lockless - Version of hlist_unhashed for lockless use
- * @h: Node to be checked
- *
- * This variant of hlist_unhashed() must be used in lockless contexts
- * to avoid potential load-tearing.  The READ_ONCE() is paired with the
- * various WRITE_ONCE() in hlist helpers that are defined below.
- */
-static inline int hlist_unhashed_lockless(const struct hlist_node *h)
 {
 	return !READ_ONCE(h->pprev);
 }
@@ -852,11 +839,11 @@ static inline void hlist_del_init(struct hlist_node *n)
 static inline void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
 {
 	struct hlist_node *first = h->first;
-	WRITE_ONCE(n->next, first);
+	n->next = first;
 	if (first)
-		WRITE_ONCE(first->pprev, &n->next);
+		first->pprev = &n->next;
 	WRITE_ONCE(h->first, n);
-	WRITE_ONCE(n->pprev, &h->first);
+	n->pprev = &h->first;
 }
 
 /**
@@ -867,9 +854,9 @@ static inline void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
 static inline void hlist_add_before(struct hlist_node *n,
 				    struct hlist_node *next)
 {
-	WRITE_ONCE(n->pprev, next->pprev);
-	WRITE_ONCE(n->next, next);
-	WRITE_ONCE(next->pprev, &n->next);
+	n->pprev = next->pprev;
+	n->next = next;
+	next->pprev = &n->next;
 	WRITE_ONCE(*(n->pprev), n);
 }
 
@@ -881,12 +868,12 @@ static inline void hlist_add_before(struct hlist_node *n,
 static inline void hlist_add_behind(struct hlist_node *n,
 				    struct hlist_node *prev)
 {
-	WRITE_ONCE(n->next, prev->next);
-	WRITE_ONCE(prev->next, n);
-	WRITE_ONCE(n->pprev, &prev->next);
+	n->next = prev->next;
+	prev->next = n;
+	n->pprev = &prev->next;
 
 	if (n->next)
-		WRITE_ONCE(n->next->pprev, &n->next);
+		n->next->pprev = &n->next;
 }
 
 /**
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [RFC PATCH 02/21] list: Remove hlist_nulls_unhashed_lockless()
  2020-03-24 15:36 [RFC PATCH 00/21] Improve list integrity checking Will Deacon
  2020-03-24 15:36 ` [RFC PATCH 01/21] list: Remove hlist_unhashed_lockless() Will Deacon
@ 2020-03-24 15:36 ` Will Deacon
  2020-03-24 16:27   ` Greg KH
  2020-03-30 23:07   ` Paul E. McKenney
  2020-03-24 15:36 ` [RFC PATCH 03/21] list: Annotate lockless list primitives with data_race() Will Deacon
                   ` (18 subsequent siblings)
  20 siblings, 2 replies; 58+ messages in thread
From: Will Deacon @ 2020-03-24 15:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: Will Deacon, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, kernel-hardening

Commit 02b99b38f3d9 ("rcu: Add a hlist_nulls_unhashed_lockless() function")
introduced the (as yet unused) hlist_nulls_unhashed_lockless() function
to avoid KCSAN reports when an RCU reader checks the 'hashed' status
of an 'hlist_nulls' concurrently undergoing modification.

Remove the unused function and add a READ_ONCE() to hlist_nulls_unhashed(),
just like we do already for hlist_nulls_empty().

Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 include/linux/list_nulls.h | 14 --------------
 1 file changed, 14 deletions(-)

diff --git a/include/linux/list_nulls.h b/include/linux/list_nulls.h
index fa6e8471bd22..3a9ff01e9a11 100644
--- a/include/linux/list_nulls.h
+++ b/include/linux/list_nulls.h
@@ -65,20 +65,6 @@ static inline unsigned long get_nulls_value(const struct hlist_nulls_node *ptr)
  * but hlist_nulls_del() does not.
  */
 static inline int hlist_nulls_unhashed(const struct hlist_nulls_node *h)
-{
-	return !h->pprev;
-}
-
-/**
- * hlist_nulls_unhashed_lockless - Has node been removed and reinitialized?
- * @h: Node to be checked
- *
- * Not that not all removal functions will leave a node in unhashed state.
- * For example, hlist_del_init_rcu() leaves the node in unhashed state,
- * but hlist_nulls_del() does not.  Unlike hlist_nulls_unhashed(), this
- * function may be used locklessly.
- */
-static inline int hlist_nulls_unhashed_lockless(const struct hlist_nulls_node *h)
 {
 	return !READ_ONCE(h->pprev);
 }
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [RFC PATCH 03/21] list: Annotate lockless list primitives with data_race()
  2020-03-24 15:36 [RFC PATCH 00/21] Improve list integrity checking Will Deacon
  2020-03-24 15:36 ` [RFC PATCH 01/21] list: Remove hlist_unhashed_lockless() Will Deacon
  2020-03-24 15:36 ` [RFC PATCH 02/21] list: Remove hlist_nulls_unhashed_lockless() Will Deacon
@ 2020-03-24 15:36 ` Will Deacon
  2020-03-24 16:20   ` Jann Horn
                     ` (2 more replies)
  2020-03-24 15:36 ` [RFC PATCH 04/21] timers: Use hlist_unhashed() instead of open-coding in timer_pending() Will Deacon
                   ` (17 subsequent siblings)
  20 siblings, 3 replies; 58+ messages in thread
From: Will Deacon @ 2020-03-24 15:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: Will Deacon, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, kernel-hardening

Some list predicates can be used locklessly even with the non-RCU list
implementations, since they effectively boil down to a test against
NULL. For example, checking whether or not a list is empty is safe even
in the presence of a concurrent, tearing write to the list head pointer.
Similarly, checking whether or not an hlist node has been hashed is safe
as well.

Annotate these lockless list predicates with data_race() and READ_ONCE()
so that KCSAN and the compiler are aware of what's going on. The writer
side can then avoid having to use WRITE_ONCE() in the non-RCU
implementation.
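The shape of these predicates is easy to model in userspace. The stand-in macros below are illustrative only (the kernel's READ_ONCE() and data_race() are more involved): the point is that the whole check is a single loaded pointer compared against a known value, so a concurrent writer can make the answer stale but never unsafe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace stand-ins for the kernel annotations (illustrative only):
 * READ_ONCE() forces a single, non-torn load; data_race() tells KCSAN
 * the race is intentional and compiles to nothing.
 */
#define READ_ONCE(x)	(*(const volatile __typeof__(x) *)&(x))
#define data_race(expr)	(expr)

struct list_head {
	struct list_head *next, *prev;
};

/* One annotated load, one comparison: safe to call locklessly. */
static bool list_empty(const struct list_head *head)
{
	return data_race(READ_ONCE(head->next) == head);
}
```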

Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Marco Elver <elver@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 include/linux/list.h       | 10 +++++-----
 include/linux/list_bl.h    |  5 +++--
 include/linux/list_nulls.h |  6 +++---
 include/linux/llist.h      |  2 +-
 4 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/include/linux/list.h b/include/linux/list.h
index 4fed5a0f9b77..4d9f5f9ed1a8 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -279,7 +279,7 @@ static inline int list_is_last(const struct list_head *list,
  */
 static inline int list_empty(const struct list_head *head)
 {
-	return READ_ONCE(head->next) == head;
+	return data_race(READ_ONCE(head->next) == head);
 }
 
 /**
@@ -524,7 +524,7 @@ static inline void list_splice_tail_init(struct list_head *list,
  */
 #define list_first_entry_or_null(ptr, type, member) ({ \
 	struct list_head *head__ = (ptr); \
-	struct list_head *pos__ = READ_ONCE(head__->next); \
+	struct list_head *pos__ = data_race(READ_ONCE(head__->next)); \
 	pos__ != head__ ? list_entry(pos__, type, member) : NULL; \
 })
 
@@ -772,13 +772,13 @@ static inline void INIT_HLIST_NODE(struct hlist_node *h)
  * hlist_unhashed - Has node been removed from list and reinitialized?
  * @h: Node to be checked
  *
- * Not that not all removal functions will leave a node in unhashed
+ * Note that not all removal functions will leave a node in unhashed
  * state.  For example, hlist_nulls_del_init_rcu() does leave the
  * node in unhashed state, but hlist_nulls_del() does not.
  */
 static inline int hlist_unhashed(const struct hlist_node *h)
 {
-	return !READ_ONCE(h->pprev);
+	return data_race(!READ_ONCE(h->pprev));
 }
 
 /**
@@ -787,7 +787,7 @@ static inline int hlist_unhashed(const struct hlist_node *h)
  */
 static inline int hlist_empty(const struct hlist_head *h)
 {
-	return !READ_ONCE(h->first);
+	return data_race(!READ_ONCE(h->first));
 }
 
 static inline void __hlist_del(struct hlist_node *n)
diff --git a/include/linux/list_bl.h b/include/linux/list_bl.h
index ae1b541446c9..1804fdb84dda 100644
--- a/include/linux/list_bl.h
+++ b/include/linux/list_bl.h
@@ -51,7 +51,7 @@ static inline void INIT_HLIST_BL_NODE(struct hlist_bl_node *h)
 
 static inline bool  hlist_bl_unhashed(const struct hlist_bl_node *h)
 {
-	return !h->pprev;
+	return data_race(!READ_ONCE(h->pprev));
 }
 
 static inline struct hlist_bl_node *hlist_bl_first(struct hlist_bl_head *h)
@@ -71,7 +71,8 @@ static inline void hlist_bl_set_first(struct hlist_bl_head *h,
 
 static inline bool hlist_bl_empty(const struct hlist_bl_head *h)
 {
-	return !((unsigned long)READ_ONCE(h->first) & ~LIST_BL_LOCKMASK);
+	unsigned long first = data_race((unsigned long)READ_ONCE(h->first));
+	return !(first & ~LIST_BL_LOCKMASK);
 }
 
 static inline void hlist_bl_add_head(struct hlist_bl_node *n,
diff --git a/include/linux/list_nulls.h b/include/linux/list_nulls.h
index 3a9ff01e9a11..fa51a801bf32 100644
--- a/include/linux/list_nulls.h
+++ b/include/linux/list_nulls.h
@@ -60,18 +60,18 @@ static inline unsigned long get_nulls_value(const struct hlist_nulls_node *ptr)
  * hlist_nulls_unhashed - Has node been removed and reinitialized?
  * @h: Node to be checked
  *
- * Not that not all removal functions will leave a node in unhashed state.
+ * Note that not all removal functions will leave a node in unhashed state.
  * For example, hlist_del_init_rcu() leaves the node in unhashed state,
  * but hlist_nulls_del() does not.
  */
 static inline int hlist_nulls_unhashed(const struct hlist_nulls_node *h)
 {
-	return !READ_ONCE(h->pprev);
+	return data_race(!READ_ONCE(h->pprev));
 }
 
 static inline int hlist_nulls_empty(const struct hlist_nulls_head *h)
 {
-	return is_a_nulls(READ_ONCE(h->first));
+	return data_race(is_a_nulls(READ_ONCE(h->first)));
 }
 
 static inline void hlist_nulls_add_head(struct hlist_nulls_node *n,
diff --git a/include/linux/llist.h b/include/linux/llist.h
index 2e9c7215882b..c7f042b73899 100644
--- a/include/linux/llist.h
+++ b/include/linux/llist.h
@@ -186,7 +186,7 @@ static inline void init_llist_head(struct llist_head *list)
  */
 static inline bool llist_empty(const struct llist_head *head)
 {
-	return READ_ONCE(head->first) == NULL;
+	return data_race(READ_ONCE(head->first) == NULL);
 }
 
 static inline struct llist_node *llist_next(struct llist_node *node)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [RFC PATCH 04/21] timers: Use hlist_unhashed() instead of open-coding in timer_pending()
  2020-03-24 15:36 [RFC PATCH 00/21] Improve list integrity checking Will Deacon
                   ` (2 preceding siblings ...)
  2020-03-24 15:36 ` [RFC PATCH 03/21] list: Annotate lockless list primitives with data_race() Will Deacon
@ 2020-03-24 15:36 ` Will Deacon
  2020-03-24 16:30   ` Greg KH
  2020-03-24 15:36 ` [RFC PATCH 05/21] list: Comment missing WRITE_ONCE() in __list_del() Will Deacon
                   ` (16 subsequent siblings)
  20 siblings, 1 reply; 58+ messages in thread
From: Will Deacon @ 2020-03-24 15:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: Will Deacon, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, kernel-hardening

timer_pending() open-codes a version of hlist_unhashed() to check
whether or not the 'timer' parameter has been queued in the timer wheel.
KCSAN detects this as a racy operation and explodes at us:

  | BUG: KCSAN: data-race in del_timer / detach_if_pending
  |
  | write to 0xffff88808697d870 of 8 bytes by task 10 on cpu 0:
  |  __hlist_del include/linux/list.h:764 [inline]
  |  detach_timer kernel/time/timer.c:815 [inline]
  |  detach_if_pending+0xcd/0x2d0 kernel/time/timer.c:832
  |  try_to_del_timer_sync+0x60/0xb0 kernel/time/timer.c:1226
  |  del_timer_sync+0x6b/0xa0 kernel/time/timer.c:1365
  |  schedule_timeout+0x2d2/0x6e0 kernel/time/timer.c:1896
  |  rcu_gp_fqs_loop+0x37c/0x580 kernel/rcu/tree.c:1639
  |  rcu_gp_kthread+0x143/0x230 kernel/rcu/tree.c:1799
  |  kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
  |  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352
  |
  | read to 0xffff88808697d870 of 8 bytes by task 12060 on cpu 1:
  |  del_timer+0x3b/0xb0 kernel/time/timer.c:1198
  |  sk_stop_timer+0x25/0x60 net/core/sock.c:2845
  |  inet_csk_clear_xmit_timers+0x69/0xa0 net/ipv4/inet_connection_sock.c:523
  |  tcp_clear_xmit_timers include/net/tcp.h:606 [inline]
  |  tcp_v4_destroy_sock+0xa3/0x3f0 net/ipv4/tcp_ipv4.c:2096
  |  inet_csk_destroy_sock+0xf4/0x250 net/ipv4/inet_connection_sock.c:836
  |  tcp_close+0x6f3/0x970 net/ipv4/tcp.c:2497
  |  inet_release+0x86/0x100 net/ipv4/af_inet.c:427
  |  __sock_release+0x85/0x160 net/socket.c:590
  |  sock_close+0x24/0x30 net/socket.c:1268
  |  __fput+0x1e1/0x520 fs/file_table.c:280
  |  ____fput+0x1f/0x30 fs/file_table.c:313
  |  task_work_run+0xf6/0x130 kernel/task_work.c:113
  |  tracehook_notify_resume include/linux/tracehook.h:188 [inline]
  |  exit_to_usermode_loop+0x2b4/0x2c0 arch/x86/entry/common.c:163

Replace the explicit 'pprev' pointer comparison in timer_pending() with
a call to hlist_unhashed() and initialise the 'expires' timer field
explicitly in do_init_timer() so that the compiler doesn't emit bogus
'maybe used uninitialised' warnings now that it cannot reason statically
about the result of timer_pending().
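The equivalence being relied on here can be sketched as follows (simplified userspace model without the READ_ONCE() that the real helper carries after the earlier patches):

```c
#include <assert.h>
#include <stddef.h>

struct hlist_node {
	struct hlist_node *next, **pprev;
};

struct timer_list {
	struct hlist_node entry;
};

/* What timer_pending() used to open-code... */
static int timer_pending_open_coded(const struct timer_list *t)
{
	return t->entry.pprev != NULL;
}

/* ...is exactly the negation of hlist_unhashed(): */
static int hlist_unhashed(const struct hlist_node *h)
{
	return !h->pprev;
}

static int timer_pending(const struct timer_list *t)
{
	return !hlist_unhashed(&t->entry);
}
```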

Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
---
 include/linux/timer.h | 5 +++--
 kernel/time/timer.c   | 1 +
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/linux/timer.h b/include/linux/timer.h
index 1e6650ed066d..e9610d2988ba 100644
--- a/include/linux/timer.h
+++ b/include/linux/timer.h
@@ -158,13 +158,14 @@ static inline void destroy_timer_on_stack(struct timer_list *timer) { }
  *
  * timer_pending will tell whether a given timer is currently pending,
  * or not. Callers must ensure serialization wrt. other operations done
- * to this timer, eg. interrupt contexts, or other CPUs on SMP.
+ * to this timer, eg. interrupt contexts, or other CPUs on SMP if they
+ * cannot tolerate spurious results.
  *
  * return value: 1 if the timer is pending, 0 if not.
  */
 static inline int timer_pending(const struct timer_list * timer)
 {
-	return timer->entry.pprev != NULL;
+	return !hlist_unhashed(&timer->entry);
 }
 
 extern void add_timer_on(struct timer_list *timer, int cpu);
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 4820823515e9..9e1c6fc8433a 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -780,6 +780,7 @@ static void do_init_timer(struct timer_list *timer,
 			  const char *name, struct lock_class_key *key)
 {
 	timer->entry.pprev = NULL;
+	timer->expires = 0; /* Avoid bogus 'maybe used uninitialized' warning */
 	timer->function = func;
 	timer->flags = flags | raw_smp_processor_id();
 	lockdep_init_map(&timer->lockdep_map, name, key, 0);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [RFC PATCH 05/21] list: Comment missing WRITE_ONCE() in __list_del()
  2020-03-24 15:36 [RFC PATCH 00/21] Improve list integrity checking Will Deacon
                   ` (3 preceding siblings ...)
  2020-03-24 15:36 ` [RFC PATCH 04/21] timers: Use hlist_unhashed() instead of open-coding in timer_pending() Will Deacon
@ 2020-03-24 15:36 ` Will Deacon
  2020-03-30 23:14   ` Paul E. McKenney
  2020-03-24 15:36 ` [RFC PATCH 06/21] list: Remove superfluous WRITE_ONCE() from hlist_nulls implementation Will Deacon
                   ` (15 subsequent siblings)
  20 siblings, 1 reply; 58+ messages in thread
From: Will Deacon @ 2020-03-24 15:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: Will Deacon, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, kernel-hardening

Although __list_del() is called from the RCU list implementation, it
omits WRITE_ONCE() when updating the 'prev' pointer for the 'next' node.
This is reasonable because concurrent RCU readers only access the 'next'
pointers.

This is clear enough once you've read the code, but add a comment to make
it explicit.
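A userspace sketch of why the asymmetry is fine (illustrative, not code from the patch): an RCU-style reader walks forwards only, so every pointer it dereferences is a 'next' pointer, never a 'prev'.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/*
 * Forward-only traversal: only 'next' pointers are loaded, which is
 * why __list_del() can update next->prev with a plain store while
 * readers are running.
 */
static int count_forward(const struct list_head *head)
{
	int n = 0;
	const struct list_head *p;

	for (p = head->next; p != head; p = p->next)
		n++;
	return n;
}
```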

Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 include/linux/list.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/list.h b/include/linux/list.h
index 4d9f5f9ed1a8..ec1f780d1449 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -109,6 +109,7 @@ static inline void list_add_tail(struct list_head *new, struct list_head *head)
  */
 static inline void __list_del(struct list_head * prev, struct list_head * next)
 {
+	/* RCU readers don't read the 'prev' pointer */
 	next->prev = prev;
 	WRITE_ONCE(prev->next, next);
 }
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [RFC PATCH 06/21] list: Remove superfluous WRITE_ONCE() from hlist_nulls implementation
  2020-03-24 15:36 [RFC PATCH 00/21] Improve list integrity checking Will Deacon
                   ` (4 preceding siblings ...)
  2020-03-24 15:36 ` [RFC PATCH 05/21] list: Comment missing WRITE_ONCE() in __list_del() Will Deacon
@ 2020-03-24 15:36 ` Will Deacon
  2020-03-30 23:21   ` Paul E. McKenney
  2020-03-24 15:36 ` [RFC PATCH 07/21] Revert "list: Use WRITE_ONCE() when adding to lists and hlists" Will Deacon
                   ` (14 subsequent siblings)
  20 siblings, 1 reply; 58+ messages in thread
From: Will Deacon @ 2020-03-24 15:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: Will Deacon, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, kernel-hardening

Commit 860c8802ace1 ("rcu: Use WRITE_ONCE() for assignments to ->pprev
for hlist_nulls") added WRITE_ONCE() invocations to hlist_nulls_add_head()
and hlist_nulls_del().

Since these functions should not ordinarily run concurrently with other
list accessors, restore the plain C assignments so that KCSAN can yell
if a data race occurs.

Cc: Eric Dumazet <edumazet@google.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 include/linux/list_nulls.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/list_nulls.h b/include/linux/list_nulls.h
index fa51a801bf32..fd154ceb5b0d 100644
--- a/include/linux/list_nulls.h
+++ b/include/linux/list_nulls.h
@@ -80,10 +80,10 @@ static inline void hlist_nulls_add_head(struct hlist_nulls_node *n,
 	struct hlist_nulls_node *first = h->first;
 
 	n->next = first;
-	WRITE_ONCE(n->pprev, &h->first);
+	n->pprev = &h->first;
 	h->first = n;
 	if (!is_a_nulls(first))
-		WRITE_ONCE(first->pprev, &n->next);
+		first->pprev = &n->next;
 }
 
 static inline void __hlist_nulls_del(struct hlist_nulls_node *n)
@@ -99,7 +99,7 @@ static inline void __hlist_nulls_del(struct hlist_nulls_node *n)
 static inline void hlist_nulls_del(struct hlist_nulls_node *n)
 {
 	__hlist_nulls_del(n);
-	WRITE_ONCE(n->pprev, LIST_POISON2);
+	n->pprev = LIST_POISON2;
 }
 
 /**
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [RFC PATCH 07/21] Revert "list: Use WRITE_ONCE() when adding to lists and hlists"
  2020-03-24 15:36 [RFC PATCH 00/21] Improve list integrity checking Will Deacon
                   ` (5 preceding siblings ...)
  2020-03-24 15:36 ` [RFC PATCH 06/21] list: Remove superfluous WRITE_ONCE() from hlist_nulls implementation Will Deacon
@ 2020-03-24 15:36 ` Will Deacon
  2020-03-30 23:19   ` Paul E. McKenney
  2020-03-24 15:36 ` [RFC PATCH 08/21] Revert "list: Use WRITE_ONCE() when initializing list_head structures" Will Deacon
                   ` (13 subsequent siblings)
  20 siblings, 1 reply; 58+ messages in thread
From: Will Deacon @ 2020-03-24 15:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: Will Deacon, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, kernel-hardening

This reverts commit 1c97be677f72b3c338312aecd36d8fff20322f32.

There is no need to use WRITE_ONCE() for the non-RCU list and hlist
implementations.

Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 include/linux/list.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/list.h b/include/linux/list.h
index ec1f780d1449..c7331c259792 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -70,7 +70,7 @@ static inline void __list_add(struct list_head *new,
 	next->prev = new;
 	new->next = next;
 	new->prev = prev;
-	WRITE_ONCE(prev->next, new);
+	prev->next = new;
 }
 
 /**
@@ -843,7 +843,7 @@ static inline void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
 	n->next = first;
 	if (first)
 		first->pprev = &n->next;
-	WRITE_ONCE(h->first, n);
+	h->first = n;
 	n->pprev = &h->first;
 }
 
@@ -858,7 +858,7 @@ static inline void hlist_add_before(struct hlist_node *n,
 	n->pprev = next->pprev;
 	n->next = next;
 	next->pprev = &n->next;
-	WRITE_ONCE(*(n->pprev), n);
+	*(n->pprev) = n;
 }
 
 /**
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [RFC PATCH 08/21] Revert "list: Use WRITE_ONCE() when initializing list_head structures"
  2020-03-24 15:36 [RFC PATCH 00/21] Improve list integrity checking Will Deacon
                   ` (6 preceding siblings ...)
  2020-03-24 15:36 ` [RFC PATCH 07/21] Revert "list: Use WRITE_ONCE() when adding to lists and hlists" Will Deacon
@ 2020-03-24 15:36 ` Will Deacon
  2020-03-30 23:25   ` Paul E. McKenney
  2020-03-24 15:36 ` [RFC PATCH 09/21] list: Remove unnecessary WRITE_ONCE() from hlist_bl_add_before() Will Deacon
                   ` (12 subsequent siblings)
  20 siblings, 1 reply; 58+ messages in thread
From: Will Deacon @ 2020-03-24 15:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: Will Deacon, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, kernel-hardening

This reverts commit 2f073848c3cc8aff2655ab7c46d8c0de90cf4e50.

There is no need to use WRITE_ONCE() to initialise a non-RCU 'list_head'.

Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 include/linux/list.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/list.h b/include/linux/list.h
index c7331c259792..b86a3f9465d4 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -32,7 +32,7 @@
  */
 static inline void INIT_LIST_HEAD(struct list_head *list)
 {
-	WRITE_ONCE(list->next, list);
+	list->next = list;
 	list->prev = list;
 }
 
-- 
2.20.1



* [RFC PATCH 09/21] list: Remove unnecessary WRITE_ONCE() from hlist_bl_add_before()
  2020-03-24 15:36 [RFC PATCH 00/21] Improve list integrity checking Will Deacon
                   ` (7 preceding siblings ...)
  2020-03-24 15:36 ` [RFC PATCH 08/21] Revert "list: Use WRITE_ONCE() when initializing list_head structures" Will Deacon
@ 2020-03-24 15:36 ` Will Deacon
  2020-03-30 23:30   ` Paul E. McKenney
  2020-03-24 15:36 ` [RFC PATCH 10/21] kernel-hacking: Make DEBUG_{LIST,PLIST,SG,NOTIFIERS} non-debug options Will Deacon
                   ` (11 subsequent siblings)
  20 siblings, 1 reply; 58+ messages in thread
From: Will Deacon @ 2020-03-24 15:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: Will Deacon, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, kernel-hardening

There is no need to use WRITE_ONCE() when inserting into a non-RCU
protected list.

Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 include/linux/list_bl.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/list_bl.h b/include/linux/list_bl.h
index 1804fdb84dda..c93e7e3aa8fc 100644
--- a/include/linux/list_bl.h
+++ b/include/linux/list_bl.h
@@ -91,15 +91,15 @@ static inline void hlist_bl_add_before(struct hlist_bl_node *n,
 				       struct hlist_bl_node *next)
 {
 	struct hlist_bl_node **pprev = next->pprev;
+	unsigned long val;
 
 	n->pprev = pprev;
 	n->next = next;
 	next->pprev = &n->next;
 
 	/* pprev may be `first`, so be careful not to lose the lock bit */
-	WRITE_ONCE(*pprev,
-		   (struct hlist_bl_node *)
-			((uintptr_t)n | ((uintptr_t)*pprev & LIST_BL_LOCKMASK)));
+	val = (unsigned long)n | ((unsigned long)*pprev & LIST_BL_LOCKMASK);
+	*pprev = (struct hlist_bl_node *)val;
 }
 
 static inline void hlist_bl_add_behind(struct hlist_bl_node *n,
-- 
2.20.1



* [RFC PATCH 10/21] kernel-hacking: Make DEBUG_{LIST,PLIST,SG,NOTIFIERS} non-debug options
  2020-03-24 15:36 [RFC PATCH 00/21] Improve list integrity checking Will Deacon
                   ` (8 preceding siblings ...)
  2020-03-24 15:36 ` [RFC PATCH 09/21] list: Remove unnecessary WRITE_ONCE() from hlist_bl_add_before() Will Deacon
@ 2020-03-24 15:36 ` Will Deacon
  2020-03-24 16:42   ` Greg KH
  2020-03-24 15:36 ` [RFC PATCH 11/21] list: Add integrity checking to hlist implementation Will Deacon
                   ` (10 subsequent siblings)
  20 siblings, 1 reply; 58+ messages in thread
From: Will Deacon @ 2020-03-24 15:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: Will Deacon, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, kernel-hardening

The CONFIG_DEBUG_{LIST,PLIST,SG,NOTIFIERS} options can provide useful
security hardening properties outside of debug scenarios. For example,
CVE-2019-2215 and CVE-2019-2025 are mitigated with negligible runtime
overhead by enabling CONFIG_DEBUG_LIST, and this option is already
enabled by default on many distributions:

https://googleprojectzero.blogspot.com/2019/11/bad-binder-android-in-wild-exploit.html

Rename these options across the tree so that the naming better reflects
their operation and remove the dependency on DEBUG_KERNEL.

Cc: Maddie Stone <maddiestone@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm/configs/tegra_defconfig              |  2 +-
 arch/mips/configs/bigsur_defconfig            |  2 +-
 arch/powerpc/configs/ppc6xx_defconfig         |  4 ++--
 arch/powerpc/configs/ps3_defconfig            |  2 +-
 arch/powerpc/configs/skiroot_defconfig        |  4 ++--
 arch/riscv/configs/defconfig                  |  6 ++---
 arch/riscv/configs/rv32_defconfig             |  6 ++---
 arch/s390/configs/debug_defconfig             |  4 ++--
 arch/sh/configs/polaris_defconfig             |  2 +-
 arch/sh/configs/rsk7203_defconfig             |  4 ++--
 arch/sh/configs/se7206_defconfig              |  2 +-
 drivers/gpu/drm/radeon/mkregtable.c           |  2 +-
 drivers/staging/wfx/fwio.c                    |  2 +-
 drivers/staging/wfx/hwio.c                    |  2 +-
 include/linux/list.h                          |  2 +-
 include/linux/list_bl.h                       |  2 +-
 include/linux/plist.h                         |  4 ++--
 include/linux/scatterlist.h                   |  6 ++---
 kernel/notifier.c                             |  2 +-
 lib/Kconfig.debug                             | 24 ++++++++-----------
 lib/Makefile                                  |  2 +-
 lib/list_debug.c                              |  2 +-
 lib/plist.c                                   |  4 ++--
 tools/include/linux/list.h                    |  4 ++--
 .../selftests/wireguard/qemu/debug.config     |  6 ++---
 tools/virtio/linux/scatterlist.h              |  4 ++--
 26 files changed, 51 insertions(+), 55 deletions(-)

diff --git a/arch/arm/configs/tegra_defconfig b/arch/arm/configs/tegra_defconfig
index a27592d3b1fa..a7cadc9407e0 100644
--- a/arch/arm/configs/tegra_defconfig
+++ b/arch/arm/configs/tegra_defconfig
@@ -299,6 +299,6 @@ CONFIG_DETECT_HUNG_TASK=y
 CONFIG_SCHEDSTATS=y
 # CONFIG_DEBUG_PREEMPT is not set
 CONFIG_DEBUG_MUTEXES=y
-CONFIG_DEBUG_SG=y
+CONFIG_CHECK_INTEGRITY_SG=y
 CONFIG_DEBUG_LL=y
 CONFIG_EARLY_PRINTK=y
diff --git a/arch/mips/configs/bigsur_defconfig b/arch/mips/configs/bigsur_defconfig
index f14ad0538f4e..61fff4050486 100644
--- a/arch/mips/configs/bigsur_defconfig
+++ b/arch/mips/configs/bigsur_defconfig
@@ -259,4 +259,4 @@ CONFIG_CRC7=m
 CONFIG_MAGIC_SYSRQ=y
 CONFIG_DEBUG_MEMORY_INIT=y
 CONFIG_DETECT_HUNG_TASK=y
-CONFIG_DEBUG_LIST=y
+CONFIG_CHECK_INTEGRITY_LIST=y
diff --git a/arch/powerpc/configs/ppc6xx_defconfig b/arch/powerpc/configs/ppc6xx_defconfig
index 3e2f44f38ac5..6d46ff7dc1bc 100644
--- a/arch/powerpc/configs/ppc6xx_defconfig
+++ b/arch/powerpc/configs/ppc6xx_defconfig
@@ -1133,8 +1133,8 @@ CONFIG_DEBUG_SHIRQ=y
 CONFIG_DEBUG_RT_MUTEXES=y
 CONFIG_DEBUG_SPINLOCK=y
 CONFIG_DEBUG_MUTEXES=y
-CONFIG_DEBUG_LIST=y
-CONFIG_DEBUG_SG=y
+CONFIG_CHECK_INTEGRITY_LIST=y
+CONFIG_CHECK_INTEGRITY_SG=y
 CONFIG_FAULT_INJECTION=y
 CONFIG_FAILSLAB=y
 CONFIG_FAIL_PAGE_ALLOC=y
diff --git a/arch/powerpc/configs/ps3_defconfig b/arch/powerpc/configs/ps3_defconfig
index 4db51719342a..12d0c10914ed 100644
--- a/arch/powerpc/configs/ps3_defconfig
+++ b/arch/powerpc/configs/ps3_defconfig
@@ -160,7 +160,7 @@ CONFIG_DEBUG_STACKOVERFLOW=y
 CONFIG_DETECT_HUNG_TASK=y
 CONFIG_PROVE_LOCKING=y
 CONFIG_DEBUG_LOCKDEP=y
-CONFIG_DEBUG_LIST=y
+CONFIG_CHECK_INTEGRITY_LIST=y
 CONFIG_RCU_CPU_STALL_TIMEOUT=60
 # CONFIG_FTRACE is not set
 CONFIG_CRYPTO_PCBC=m
diff --git a/arch/powerpc/configs/skiroot_defconfig b/arch/powerpc/configs/skiroot_defconfig
index 1b6bdad36b13..8ab17a33a1af 100644
--- a/arch/powerpc/configs/skiroot_defconfig
+++ b/arch/powerpc/configs/skiroot_defconfig
@@ -306,8 +306,8 @@ CONFIG_HARDLOCKUP_DETECTOR=y
 CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
 CONFIG_WQ_WATCHDOG=y
 # CONFIG_SCHED_DEBUG is not set
-CONFIG_DEBUG_SG=y
-CONFIG_DEBUG_NOTIFIERS=y
+CONFIG_CHECK_INTEGRITY_SG=y
+CONFIG_CHECK_INTEGRITY_NOTIFIERS=y
 CONFIG_BUG_ON_DATA_CORRUPTION=y
 CONFIG_DEBUG_CREDENTIALS=y
 # CONFIG_FTRACE is not set
diff --git a/arch/riscv/configs/defconfig b/arch/riscv/configs/defconfig
index c8f084203067..dfa157c59822 100644
--- a/arch/riscv/configs/defconfig
+++ b/arch/riscv/configs/defconfig
@@ -105,9 +105,9 @@ CONFIG_DEBUG_MUTEXES=y
 CONFIG_DEBUG_RWSEMS=y
 CONFIG_DEBUG_ATOMIC_SLEEP=y
 CONFIG_STACKTRACE=y
-CONFIG_DEBUG_LIST=y
-CONFIG_DEBUG_PLIST=y
-CONFIG_DEBUG_SG=y
+CONFIG_CHECK_INTEGRITY_LIST=y
+CONFIG_CHECK_INTEGRITY_PLIST=y
+CONFIG_CHECK_INTEGRITY_SG=y
 # CONFIG_RCU_TRACE is not set
 CONFIG_RCU_EQS_DEBUG=y
 CONFIG_DEBUG_BLOCK_EXT_DEVT=y
diff --git a/arch/riscv/configs/rv32_defconfig b/arch/riscv/configs/rv32_defconfig
index a844920a261f..1893bfaf023e 100644
--- a/arch/riscv/configs/rv32_defconfig
+++ b/arch/riscv/configs/rv32_defconfig
@@ -101,9 +101,9 @@ CONFIG_DEBUG_MUTEXES=y
 CONFIG_DEBUG_RWSEMS=y
 CONFIG_DEBUG_ATOMIC_SLEEP=y
 CONFIG_STACKTRACE=y
-CONFIG_DEBUG_LIST=y
-CONFIG_DEBUG_PLIST=y
-CONFIG_DEBUG_SG=y
+CONFIG_CHECK_INTEGRITY_LIST=y
+CONFIG_CHECK_INTEGRITY_PLIST=y
+CONFIG_CHECK_INTEGRITY_SG=y
 # CONFIG_RCU_TRACE is not set
 CONFIG_RCU_EQS_DEBUG=y
 CONFIG_DEBUG_BLOCK_EXT_DEVT=y
diff --git a/arch/s390/configs/debug_defconfig b/arch/s390/configs/debug_defconfig
index 0c86ba19fa2b..438eb0ce4d64 100644
--- a/arch/s390/configs/debug_defconfig
+++ b/arch/s390/configs/debug_defconfig
@@ -779,8 +779,8 @@ CONFIG_LOCK_STAT=y
 CONFIG_DEBUG_LOCKDEP=y
 CONFIG_DEBUG_ATOMIC_SLEEP=y
 CONFIG_DEBUG_LOCKING_API_SELFTESTS=y
-CONFIG_DEBUG_SG=y
-CONFIG_DEBUG_NOTIFIERS=y
+CONFIG_CHECK_INTEGRITY_SG=y
+CONFIG_CHECK_INTEGRITY_NOTIFIERS=y
 CONFIG_BUG_ON_DATA_CORRUPTION=y
 CONFIG_DEBUG_CREDENTIALS=y
 CONFIG_RCU_TORTURE_TEST=m
diff --git a/arch/sh/configs/polaris_defconfig b/arch/sh/configs/polaris_defconfig
index e3a1d3d2694a..699610cb4333 100644
--- a/arch/sh/configs/polaris_defconfig
+++ b/arch/sh/configs/polaris_defconfig
@@ -81,4 +81,4 @@ CONFIG_DEBUG_RT_MUTEXES=y
 CONFIG_DEBUG_LOCK_ALLOC=y
 CONFIG_DEBUG_SPINLOCK_SLEEP=y
 CONFIG_DEBUG_INFO=y
-CONFIG_DEBUG_SG=y
+CONFIG_CHECK_INTEGRITY_SG=y
diff --git a/arch/sh/configs/rsk7203_defconfig b/arch/sh/configs/rsk7203_defconfig
index 10a32bd4cf66..74df7ab6a3e2 100644
--- a/arch/sh/configs/rsk7203_defconfig
+++ b/arch/sh/configs/rsk7203_defconfig
@@ -118,7 +118,7 @@ CONFIG_DEBUG_MUTEXES=y
 CONFIG_DEBUG_SPINLOCK_SLEEP=y
 CONFIG_DEBUG_INFO=y
 CONFIG_DEBUG_VM=y
-CONFIG_DEBUG_LIST=y
-CONFIG_DEBUG_SG=y
+CONFIG_CHECK_INTEGRITY_LIST=y
+CONFIG_CHECK_INTEGRITY_SG=y
 CONFIG_FRAME_POINTER=y
 CONFIG_DEBUG_STACK_USAGE=y
diff --git a/arch/sh/configs/se7206_defconfig b/arch/sh/configs/se7206_defconfig
index a93402b3a319..612cba5e1fe3 100644
--- a/arch/sh/configs/se7206_defconfig
+++ b/arch/sh/configs/se7206_defconfig
@@ -99,7 +99,7 @@ CONFIG_DEBUG_KERNEL=y
 CONFIG_DETECT_HUNG_TASK=y
 CONFIG_DEBUG_SPINLOCK_SLEEP=y
 CONFIG_DEBUG_VM=y
-CONFIG_DEBUG_LIST=y
+CONFIG_CHECK_INTEGRITY_LIST=y
 CONFIG_FRAME_POINTER=y
 CONFIG_DEBUG_STACK_USAGE=y
 CONFIG_CRYPTO_DEFLATE=y
diff --git a/drivers/gpu/drm/radeon/mkregtable.c b/drivers/gpu/drm/radeon/mkregtable.c
index 52a7246fed9e..2765c9649056 100644
--- a/drivers/gpu/drm/radeon/mkregtable.c
+++ b/drivers/gpu/drm/radeon/mkregtable.c
@@ -56,7 +56,7 @@ static inline void INIT_LIST_HEAD(struct list_head *list)
  * This is only for internal list manipulation where we know
  * the prev/next entries already!
  */
-#ifndef CONFIG_DEBUG_LIST
+#ifndef CONFIG_CHECK_INTEGRITY_LIST
 static inline void __list_add(struct list_head *new,
 			      struct list_head *prev, struct list_head *next)
 {
diff --git a/drivers/staging/wfx/fwio.c b/drivers/staging/wfx/fwio.c
index 9d61082c1e6c..e10026ae7b13 100644
--- a/drivers/staging/wfx/fwio.c
+++ b/drivers/staging/wfx/fwio.c
@@ -74,7 +74,7 @@ static const char * const fwio_errors[] = {
  * underlying hardware that use DMA. Function below detect this case and
  * allocate a bounce buffer if necessary.
  *
- * Notice that, in doubt, you can enable CONFIG_DEBUG_SG to ask kernel to
+ * Notice that, in doubt, you can enable CONFIG_CHECK_INTEGRITY_SG to ask kernel to
  * detect this problem at runtime  (else, kernel silently fail).
  *
  * NOTE: it may also be possible to use 'pages' from struct firmware and avoid
diff --git a/drivers/staging/wfx/hwio.c b/drivers/staging/wfx/hwio.c
index 47e04c59ed93..3524b1b38d8e 100644
--- a/drivers/staging/wfx/hwio.c
+++ b/drivers/staging/wfx/hwio.c
@@ -22,7 +22,7 @@
  * allocated data. Functions below that work with registers (aka functions
  * ending with "32") automatically reallocate buffers with kmalloc. However,
  * functions that work with arbitrary length buffers let's caller to handle
- * memory location. In doubt, enable CONFIG_DEBUG_SG to detect badly located
+ * memory location. In doubt, enable CONFIG_CHECK_INTEGRITY_SG to detect badly located
  * buffer.
  */
 
diff --git a/include/linux/list.h b/include/linux/list.h
index b86a3f9465d4..2bef081afa69 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -36,7 +36,7 @@ static inline void INIT_LIST_HEAD(struct list_head *list)
 	list->prev = list;
 }
 
-#ifdef CONFIG_DEBUG_LIST
+#ifdef CONFIG_CHECK_INTEGRITY_LIST
 extern bool __list_add_valid(struct list_head *new,
 			      struct list_head *prev,
 			      struct list_head *next);
diff --git a/include/linux/list_bl.h b/include/linux/list_bl.h
index c93e7e3aa8fc..9f8e29142324 100644
--- a/include/linux/list_bl.h
+++ b/include/linux/list_bl.h
@@ -24,7 +24,7 @@
 #define LIST_BL_LOCKMASK	0UL
 #endif
 
-#ifdef CONFIG_DEBUG_LIST
+#ifdef CONFIG_CHECK_INTEGRITY_LIST
 #define LIST_BL_BUG_ON(x) BUG_ON(x)
 #else
 #define LIST_BL_BUG_ON(x)
diff --git a/include/linux/plist.h b/include/linux/plist.h
index 66bab1bca35c..da85b42c9ebf 100644
--- a/include/linux/plist.h
+++ b/include/linux/plist.h
@@ -229,7 +229,7 @@ static inline int plist_node_empty(const struct plist_node *node)
  * @type:	the type of the struct this is embedded in
  * @member:	the name of the list_head within the struct
  */
-#ifdef CONFIG_DEBUG_PLIST
+#ifdef CONFIG_CHECK_INTEGRITY_PLIST
 # define plist_first_entry(head, type, member)	\
 ({ \
 	WARN_ON(plist_head_empty(head)); \
@@ -246,7 +246,7 @@ static inline int plist_node_empty(const struct plist_node *node)
  * @type:	the type of the struct this is embedded in
  * @member:	the name of the list_head within the struct
  */
-#ifdef CONFIG_DEBUG_PLIST
+#ifdef CONFIG_CHECK_INTEGRITY_PLIST
 # define plist_last_entry(head, type, member)	\
 ({ \
 	WARN_ON(plist_head_empty(head)); \
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index 6eec50fb36c8..f9dc7e730a68 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -93,7 +93,7 @@ static inline void sg_assign_page(struct scatterlist *sg, struct page *page)
 	 * must be aligned at a 32-bit boundary as a minimum.
 	 */
 	BUG_ON((unsigned long) page & (SG_CHAIN | SG_END));
-#ifdef CONFIG_DEBUG_SG
+#ifdef CONFIG_CHECK_INTEGRITY_SG
 	BUG_ON(sg_is_chain(sg));
 #endif
 	sg->page_link = page_link | (unsigned long) page;
@@ -123,7 +123,7 @@ static inline void sg_set_page(struct scatterlist *sg, struct page *page,
 
 static inline struct page *sg_page(struct scatterlist *sg)
 {
-#ifdef CONFIG_DEBUG_SG
+#ifdef CONFIG_CHECK_INTEGRITY_SG
 	BUG_ON(sg_is_chain(sg));
 #endif
 	return (struct page *)((sg)->page_link & ~(SG_CHAIN | SG_END));
@@ -139,7 +139,7 @@ static inline struct page *sg_page(struct scatterlist *sg)
 static inline void sg_set_buf(struct scatterlist *sg, const void *buf,
 			      unsigned int buflen)
 {
-#ifdef CONFIG_DEBUG_SG
+#ifdef CONFIG_CHECK_INTEGRITY_SG
 	BUG_ON(!virt_addr_valid(buf));
 #endif
 	sg_set_page(sg, virt_to_page(buf), buflen, offset_in_page(buf));
diff --git a/kernel/notifier.c b/kernel/notifier.c
index 63d7501ac638..b4c799b80227 100644
--- a/kernel/notifier.c
+++ b/kernel/notifier.c
@@ -73,7 +73,7 @@ static int notifier_call_chain(struct notifier_block **nl,
 	while (nb && nr_to_call) {
 		next_nb = rcu_dereference_raw(nb->next);
 
-#ifdef CONFIG_DEBUG_NOTIFIERS
+#ifdef CONFIG_CHECK_INTEGRITY_NOTIFIERS
 		if (unlikely(!func_ptr_is_kernel_text(nb->notifier_call))) {
 			WARN(1, "Invalid notifier called!");
 			nb = next_nb;
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 1458505192cd..f05ea01b30a7 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1312,20 +1312,18 @@ config DEBUG_KOBJECT_RELEASE
 config HAVE_DEBUG_BUGVERBOSE
 	bool
 
-menu "Debug kernel data structures"
+menu "Kernel data structure integrity"
 
-config DEBUG_LIST
-	bool "Debug linked list manipulation"
-	depends on DEBUG_KERNEL || BUG_ON_DATA_CORRUPTION
+config CHECK_INTEGRITY_LIST
+	bool "Check integrity of linked list manipulation"
 	help
 	  Enable this to turn on extended checks in the linked-list
 	  walking routines.
 
 	  If unsure, say N.
 
-config DEBUG_PLIST
-	bool "Debug priority linked list manipulation"
-	depends on DEBUG_KERNEL
+config CHECK_INTEGRITY_PLIST
+	bool "Check integrity of priority linked list manipulation"
 	help
 	  Enable this to turn on extended checks in the priority-ordered
 	  linked-list (plist) walking routines.  This checks the entire
@@ -1333,9 +1331,8 @@ config DEBUG_PLIST
 
 	  If unsure, say N.
 
-config DEBUG_SG
-	bool "Debug SG table operations"
-	depends on DEBUG_KERNEL
+config CHECK_INTEGRITY_SG
+	bool "Check integrity of SG table operations"
 	help
 	  Enable this to turn on checks on scatter-gather tables. This can
 	  help find problems with drivers that do not properly initialize
@@ -1343,9 +1340,8 @@ config DEBUG_SG
 
 	  If unsure, say N.
 
-config DEBUG_NOTIFIERS
-	bool "Debug notifier call chains"
-	depends on DEBUG_KERNEL
+config CHECK_INTEGRITY_NOTIFIERS
+	bool "Check integrity of notifier call chains"
 	help
 	  Enable this to turn on sanity checking for notifier call chains.
 	  This is most useful for kernel developers to make sure that
@@ -1355,7 +1351,7 @@ config DEBUG_NOTIFIERS
 
 config BUG_ON_DATA_CORRUPTION
 	bool "Trigger a BUG when data corruption is detected"
-	select DEBUG_LIST
+	select CHECK_INTEGRITY_LIST
 	help
 	  Select this option if the kernel should BUG when it encounters
 	  data corruption in kernel memory structures when they get checked
diff --git a/lib/Makefile b/lib/Makefile
index f19b85c87fda..6a3888dac634 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -121,7 +121,7 @@ obj-$(CONFIG_BTREE) += btree.o
 obj-$(CONFIG_INTERVAL_TREE) += interval_tree.o
 obj-$(CONFIG_ASSOCIATIVE_ARRAY) += assoc_array.o
 obj-$(CONFIG_DEBUG_PREEMPT) += smp_processor_id.o
-obj-$(CONFIG_DEBUG_LIST) += list_debug.o
+obj-$(CONFIG_CHECK_INTEGRITY_LIST) += list_debug.o
 obj-$(CONFIG_DEBUG_OBJECTS) += debugobjects.o
 
 obj-$(CONFIG_BITREVERSE) += bitrev.o
diff --git a/lib/list_debug.c b/lib/list_debug.c
index 5d5424b51b74..57bf685af2ef 100644
--- a/lib/list_debug.c
+++ b/lib/list_debug.c
@@ -2,7 +2,7 @@
  * Copyright 2006, Red Hat, Inc., Dave Jones
  * Released under the General Public License (GPL).
  *
- * This file contains the linked list validation for DEBUG_LIST.
+ * This file contains the linked list validation for CHECK_INTEGRITY_LIST.
  */
 
 #include <linux/export.h>
diff --git a/lib/plist.c b/lib/plist.c
index 0d86ed7a76ac..c06e98e78259 100644
--- a/lib/plist.c
+++ b/lib/plist.c
@@ -25,7 +25,7 @@
 #include <linux/bug.h>
 #include <linux/plist.h>
 
-#ifdef CONFIG_DEBUG_PLIST
+#ifdef CONFIG_CHECK_INTEGRITY_PLIST
 
 static struct plist_head test_head;
 
@@ -172,7 +172,7 @@ void plist_requeue(struct plist_node *node, struct plist_head *head)
 	plist_check_head(head);
 }
 
-#ifdef CONFIG_DEBUG_PLIST
+#ifdef CONFIG_CHECK_INTEGRITY_PLIST
 #include <linux/sched.h>
 #include <linux/sched/clock.h>
 #include <linux/module.h>
diff --git a/tools/include/linux/list.h b/tools/include/linux/list.h
index b2fc48d5478c..7a7a7f1cf380 100644
--- a/tools/include/linux/list.h
+++ b/tools/include/linux/list.h
@@ -34,7 +34,7 @@ static inline void INIT_LIST_HEAD(struct list_head *list)
  * This is only for internal list manipulation where we know
  * the prev/next entries already!
  */
-#ifndef CONFIG_DEBUG_LIST
+#ifndef CONFIG_CHECK_INTEGRITY_LIST
 static inline void __list_add(struct list_head *new,
 			      struct list_head *prev,
 			      struct list_head *next)
@@ -96,7 +96,7 @@ static inline void __list_del(struct list_head * prev, struct list_head * next)
  * Note: list_empty() on entry does not return true after this, the entry is
  * in an undefined state.
  */
-#ifndef CONFIG_DEBUG_LIST
+#ifndef CONFIG_CHECK_INTEGRITY_LIST
 static inline void __list_del_entry(struct list_head *entry)
 {
 	__list_del(entry->prev, entry->next);
diff --git a/tools/testing/selftests/wireguard/qemu/debug.config b/tools/testing/selftests/wireguard/qemu/debug.config
index 5909e7ef2a5c..04f8ed0600e1 100644
--- a/tools/testing/selftests/wireguard/qemu/debug.config
+++ b/tools/testing/selftests/wireguard/qemu/debug.config
@@ -48,7 +48,7 @@ CONFIG_LOCKDEP=y
 CONFIG_DEBUG_ATOMIC_SLEEP=y
 CONFIG_TRACE_IRQFLAGS=y
 CONFIG_DEBUG_BUGVERBOSE=y
-CONFIG_DEBUG_LIST=y
+CONFIG_CHECK_INTEGRITY_LIST=y
 CONFIG_DEBUG_PI_LIST=y
 CONFIG_PROVE_RCU=y
 CONFIG_SPARSE_RCU_POINTER=y
@@ -56,8 +56,8 @@ CONFIG_RCU_CPU_STALL_TIMEOUT=21
 CONFIG_RCU_TRACE=y
 CONFIG_RCU_EQS_DEBUG=y
 CONFIG_USER_STACKTRACE_SUPPORT=y
-CONFIG_DEBUG_SG=y
-CONFIG_DEBUG_NOTIFIERS=y
+CONFIG_CHECK_INTEGRITY_SG=y
+CONFIG_CHECK_INTEGRITY_NOTIFIERS=y
 CONFIG_DOUBLEFAULT=y
 CONFIG_X86_DEBUG_FPU=y
 CONFIG_DEBUG_SECTION_MISMATCH=y
diff --git a/tools/virtio/linux/scatterlist.h b/tools/virtio/linux/scatterlist.h
index 369ee308b668..9fe5341b5a4a 100644
--- a/tools/virtio/linux/scatterlist.h
+++ b/tools/virtio/linux/scatterlist.h
@@ -35,7 +35,7 @@ static inline void sg_assign_page(struct scatterlist *sg, struct page *page)
 	 * must be aligned at a 32-bit boundary as a minimum.
 	 */
 	BUG_ON((unsigned long) page & 0x03);
-#ifdef CONFIG_DEBUG_SG
+#ifdef CONFIG_CHECK_INTEGRITY_SG
 	BUG_ON(sg_is_chain(sg));
 #endif
 	sg->page_link = page_link | (unsigned long) page;
@@ -65,7 +65,7 @@ static inline void sg_set_page(struct scatterlist *sg, struct page *page,
 
 static inline struct page *sg_page(struct scatterlist *sg)
 {
-#ifdef CONFIG_DEBUG_SG
+#ifdef CONFIG_CHECK_INTEGRITY_SG
 	BUG_ON(sg_is_chain(sg));
 #endif
 	return (struct page *)((sg)->page_link & ~0x3);
-- 
2.20.1



* [RFC PATCH 11/21] list: Add integrity checking to hlist implementation
  2020-03-24 15:36 [RFC PATCH 00/21] Improve list integrity checking Will Deacon
                   ` (9 preceding siblings ...)
  2020-03-24 15:36 ` [RFC PATCH 10/21] kernel-hacking: Make DEBUG_{LIST,PLIST,SG,NOTIFIERS} non-debug options Will Deacon
@ 2020-03-24 15:36 ` Will Deacon
  2020-03-24 15:36 ` [RFC PATCH 12/21] list: Poison ->next pointer for non-RCU deletion of 'hlist_nulls_node' Will Deacon
                   ` (9 subsequent siblings)
  20 siblings, 0 replies; 58+ messages in thread
From: Will Deacon @ 2020-03-24 15:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: Will Deacon, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, kernel-hardening

Extend the 'hlist' implementation so that it can optionally perform
integrity checking in a similar fashion to the standard 'list' code
when CONFIG_CHECK_INTEGRITY_LIST=y.

Cc: Kees Cook <keescook@chromium.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 include/linux/list.h    | 41 ++++++++++++++++++++-
 include/linux/rculist.h | 80 ++++++++++++++++++++++-------------------
 lib/list_debug.c        | 79 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 162 insertions(+), 38 deletions(-)

diff --git a/include/linux/list.h b/include/linux/list.h
index 2bef081afa69..96ede36a5614 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -41,6 +41,13 @@ extern bool __list_add_valid(struct list_head *new,
 			      struct list_head *prev,
 			      struct list_head *next);
 extern bool __list_del_entry_valid(struct list_head *entry);
+extern bool __hlist_add_before_valid(struct hlist_node *new,
+				     struct hlist_node *next);
+extern bool __hlist_add_behind_valid(struct hlist_node *new,
+				     struct hlist_node *prev);
+extern bool __hlist_add_head_valid(struct hlist_node *new,
+				   struct hlist_head *head);
+extern bool __hlist_del_valid(struct hlist_node *node);
 #else
 static inline bool __list_add_valid(struct list_head *new,
 				struct list_head *prev,
@@ -52,6 +59,25 @@ static inline bool __list_del_entry_valid(struct list_head *entry)
 {
 	return true;
 }
+static inline bool __hlist_add_before_valid(struct hlist_node *new,
+					    struct hlist_node *next)
+{
+	return true;
+}
+static inline bool __hlist_add_behind_valid(struct hlist_node *new,
+					    struct hlist_node *prev)
+{
+	return true;
+}
+static inline bool __hlist_add_head_valid(struct hlist_node *new,
+					  struct hlist_head *head)
+{
+	return true;
+}
+static inline bool __hlist_del_valid(struct hlist_node *node)
+{
+	return true;
+}
 #endif
 
 /*
@@ -796,6 +822,9 @@ static inline void __hlist_del(struct hlist_node *n)
 	struct hlist_node *next = n->next;
 	struct hlist_node **pprev = n->pprev;
 
+	if (!__hlist_del_valid(n))
+		return;
+
 	WRITE_ONCE(*pprev, next);
 	if (next)
 		WRITE_ONCE(next->pprev, pprev);
@@ -840,6 +869,10 @@ static inline void hlist_del_init(struct hlist_node *n)
 static inline void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
 {
 	struct hlist_node *first = h->first;
+
+	if (!__hlist_add_head_valid(n, h))
+		return;
+
 	n->next = first;
 	if (first)
 		first->pprev = &n->next;
@@ -855,6 +888,9 @@ static inline void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
 static inline void hlist_add_before(struct hlist_node *n,
 				    struct hlist_node *next)
 {
+	if (!__hlist_add_before_valid(n, next))
+		return;
+
 	n->pprev = next->pprev;
 	n->next = next;
 	next->pprev = &n->next;
@@ -862,13 +898,16 @@ static inline void hlist_add_before(struct hlist_node *n,
 }
 
 /**
- * hlist_add_behing - add a new entry after the one specified
+ * hlist_add_behind - add a new entry after the one specified
  * @n: new entry to be added
  * @prev: hlist node to add it after, which must be non-NULL
  */
 static inline void hlist_add_behind(struct hlist_node *n,
 				    struct hlist_node *prev)
 {
+	if (!__hlist_add_behind_valid(n, prev))
+		return;
+
 	n->next = prev->next;
 	prev->next = n;
 	n->pprev = &prev->next;
diff --git a/include/linux/rculist.h b/include/linux/rculist.h
index 9f313e4999fe..6f3eb7758fd8 100644
--- a/include/linux/rculist.h
+++ b/include/linux/rculist.h
@@ -537,6 +537,9 @@ static inline void hlist_add_head_rcu(struct hlist_node *n,
 {
 	struct hlist_node *first = h->first;
 
+	if (!__hlist_add_head_valid(n, h))
+		return;
+
 	n->next = first;
 	WRITE_ONCE(n->pprev, &h->first);
 	rcu_assign_pointer(hlist_first_rcu(h), n);
@@ -544,43 +547,6 @@ static inline void hlist_add_head_rcu(struct hlist_node *n,
 		WRITE_ONCE(first->pprev, &n->next);
 }
 
-/**
- * hlist_add_tail_rcu
- * @n: the element to add to the hash list.
- * @h: the list to add to.
- *
- * Description:
- * Adds the specified element to the specified hlist,
- * while permitting racing traversals.
- *
- * The caller must take whatever precautions are necessary
- * (such as holding appropriate locks) to avoid racing
- * with another list-mutation primitive, such as hlist_add_head_rcu()
- * or hlist_del_rcu(), running on this same list.
- * However, it is perfectly legal to run concurrently with
- * the _rcu list-traversal primitives, such as
- * hlist_for_each_entry_rcu(), used to prevent memory-consistency
- * problems on Alpha CPUs.  Regardless of the type of CPU, the
- * list-traversal primitive must be guarded by rcu_read_lock().
- */
-static inline void hlist_add_tail_rcu(struct hlist_node *n,
-				      struct hlist_head *h)
-{
-	struct hlist_node *i, *last = NULL;
-
-	/* Note: write side code, so rcu accessors are not needed. */
-	for (i = h->first; i; i = i->next)
-		last = i;
-
-	if (last) {
-		n->next = last->next;
-		WRITE_ONCE(n->pprev, &last->next);
-		rcu_assign_pointer(hlist_next_rcu(last), n);
-	} else {
-		hlist_add_head_rcu(n, h);
-	}
-}
-
 /**
  * hlist_add_before_rcu
  * @n: the new element to add to the hash list.
@@ -602,6 +568,9 @@ static inline void hlist_add_tail_rcu(struct hlist_node *n,
 static inline void hlist_add_before_rcu(struct hlist_node *n,
 					struct hlist_node *next)
 {
+	if (!__hlist_add_before_valid(n, next))
+		return;
+
 	WRITE_ONCE(n->pprev, next->pprev);
 	n->next = next;
 	rcu_assign_pointer(hlist_pprev_rcu(n), n);
@@ -629,6 +598,9 @@ static inline void hlist_add_before_rcu(struct hlist_node *n,
 static inline void hlist_add_behind_rcu(struct hlist_node *n,
 					struct hlist_node *prev)
 {
+	if (!__hlist_add_behind_valid(n, prev))
+		return;
+
 	n->next = prev->next;
 	WRITE_ONCE(n->pprev, &prev->next);
 	rcu_assign_pointer(hlist_next_rcu(prev), n);
@@ -636,6 +608,40 @@ static inline void hlist_add_behind_rcu(struct hlist_node *n,
 		WRITE_ONCE(n->next->pprev, &n->next);
 }
 
+/**
+ * hlist_add_tail_rcu
+ * @n: the element to add to the hash list.
+ * @h: the list to add to.
+ *
+ * Description:
+ * Adds the specified element to the specified hlist,
+ * while permitting racing traversals.
+ *
+ * The caller must take whatever precautions are necessary
+ * (such as holding appropriate locks) to avoid racing
+ * with another list-mutation primitive, such as hlist_add_head_rcu()
+ * or hlist_del_rcu(), running on this same list.
+ * However, it is perfectly legal to run concurrently with
+ * the _rcu list-traversal primitives, such as
+ * hlist_for_each_entry_rcu(), used to prevent memory-consistency
+ * problems on Alpha CPUs.  Regardless of the type of CPU, the
+ * list-traversal primitive must be guarded by rcu_read_lock().
+ */
+static inline void hlist_add_tail_rcu(struct hlist_node *n,
+				      struct hlist_head *h)
+{
+	struct hlist_node *i, *last = NULL;
+
+	/* Note: write side code, so rcu accessors are not needed. */
+	for (i = h->first; i; i = i->next)
+		last = i;
+
+	if (last)
+		hlist_add_behind_rcu(n, last);
+	else
+		hlist_add_head_rcu(n, h);
+}
+
 #define __hlist_for_each_rcu(pos, head)				\
 	for (pos = rcu_dereference(hlist_first_rcu(head));	\
 	     pos;						\
diff --git a/lib/list_debug.c b/lib/list_debug.c
index 57bf685af2ef..03234ebd18c9 100644
--- a/lib/list_debug.c
+++ b/lib/list_debug.c
@@ -60,3 +60,82 @@ bool __list_del_entry_valid(struct list_head *entry)
 
 }
 EXPORT_SYMBOL(__list_del_entry_valid);
+
+static bool __hlist_add_valid(struct hlist_node *new, struct hlist_node *prev,
+			      struct hlist_node *next)
+{
+	if (CHECK_DATA_CORRUPTION(next && next->pprev != &prev->next,
+			"hlist_add corruption: next->pprev should be &prev->next (%px), but was %px (next=%px)\n",
+			&prev->next, next->pprev, next) ||
+	    CHECK_DATA_CORRUPTION(prev->next != next,
+			"hlist_add corruption: prev->next should be next (%px), but was %px (prev=%px)\n",
+			next, prev->next, prev) ||
+	    CHECK_DATA_CORRUPTION(new == prev || new == next,
+			"hlist_add double add: new=%px, prev=%px, next=%px\n",
+			new, prev, next))
+		return false;
+
+	return true;
+}
+
+bool __hlist_add_before_valid(struct hlist_node *new, struct hlist_node *next)
+{
+	struct hlist_node *prev;
+
+	prev = container_of(next->pprev, struct hlist_node, next);
+	return __hlist_add_valid(new, prev, next);
+}
+EXPORT_SYMBOL(__hlist_add_before_valid);
+
+bool __hlist_add_behind_valid(struct hlist_node *new, struct hlist_node *prev)
+{
+	return __hlist_add_valid(new, prev, prev->next);
+}
+EXPORT_SYMBOL(__hlist_add_behind_valid);
+
+bool __hlist_add_head_valid(struct hlist_node *new, struct hlist_head *head)
+{
+	struct hlist_node *first = head->first;
+
+	if (CHECK_DATA_CORRUPTION(first && first->pprev != &head->first,
+			"hlist_add_head corruption: first->pprev should be &head->first (%px), but was %px (first=%px)",
+			&head->first, first->pprev, first) ||
+	    CHECK_DATA_CORRUPTION(new == first,
+			"hlist_add_head double add: new (%px) == first (%px)",
+			new, first))
+		return false;
+
+	return true;
+}
+EXPORT_SYMBOL(__hlist_add_head_valid);
+
+bool __hlist_del_valid(struct hlist_node *node)
+{
+	struct hlist_node *prev, *next = node->next;
+
+	if (CHECK_DATA_CORRUPTION(next == LIST_POISON1,
+			"hlist_del corruption: %px->next is LIST_POISON1 (%px)\n",
+			node, LIST_POISON1) ||
+	    CHECK_DATA_CORRUPTION(node->pprev == LIST_POISON2,
+			"hlist_del corruption: %px->pprev is LIST_POISON2 (%px)\n",
+			node, LIST_POISON2))
+		return false;
+
+	/*
+	 * If we want to validate the previous node's forward linkage,
+	 * then we must be able to treat the head like a normal node.
+	 */
+	BUILD_BUG_ON(offsetof(struct hlist_node, next) !=
+		     offsetof(struct hlist_head, first));
+	prev = container_of(node->pprev, struct hlist_node, next);
+	if (CHECK_DATA_CORRUPTION(prev->next != node,
+			"hlist_del corruption: prev->next should be %px, but was %px\n",
+			node, prev->next) ||
+	    CHECK_DATA_CORRUPTION(next && next->pprev != &node->next,
+			"hlist_del corruption: next->pprev should be %px, but was %px\n",
+			&node->next, next->pprev))
+		return false;
+
+	return true;
+}
+EXPORT_SYMBOL(__hlist_del_valid);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [RFC PATCH 12/21] list: Poison ->next pointer for non-RCU deletion of 'hlist_nulls_node'
  2020-03-24 15:36 [RFC PATCH 00/21] Improve list integrity checking Will Deacon
                   ` (10 preceding siblings ...)
  2020-03-24 15:36 ` [RFC PATCH 11/21] list: Add integrity checking to hlist implementation Will Deacon
@ 2020-03-24 15:36 ` Will Deacon
  2020-03-30 23:32   ` Paul E. McKenney
  2020-03-24 15:36 ` [RFC PATCH 13/21] list: Add integrity checking to hlist_nulls implementation Will Deacon
                   ` (8 subsequent siblings)
  20 siblings, 1 reply; 58+ messages in thread
From: Will Deacon @ 2020-03-24 15:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: Will Deacon, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, kernel-hardening

hlist_nulls_del() is used for non-RCU deletion of an 'hlist_nulls_node'
and so we can safely poison the ->next pointer without having to worry
about concurrent readers, just like we do for other non-RCU list
deletion primitives.
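
The effect can be sketched in plain userspace C. This is a pared-down model of the 'hlist_nulls' layout, with hypothetical poison constants standing in for LIST_POISON1/2, not the kernel's <linux/list_nulls.h>:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Pared-down userspace model of the kernel's 'hlist_nulls' types; the
 * poison values are hypothetical stand-ins for LIST_POISON1/2, chosen
 * only so the assertions can spot them.
 */
struct hlist_nulls_node {
	struct hlist_nulls_node *next, **pprev;
};

struct hlist_nulls_head {
	struct hlist_nulls_node *first;
};

#define LIST_POISON1	((struct hlist_nulls_node *)0x100)
#define LIST_POISON2	((struct hlist_nulls_node **)0x122)

/* A set low bit marks a 'nulls' end-of-list value, not a real node. */
static int is_a_nulls(const struct hlist_nulls_node *ptr)
{
	return (unsigned long)ptr & 1UL;
}

static void __hlist_nulls_del(struct hlist_nulls_node *n)
{
	struct hlist_nulls_node *next = n->next;
	struct hlist_nulls_node **pprev = n->pprev;

	*pprev = next;
	if (!is_a_nulls(next))
		next->pprev = pprev;
}

/*
 * Non-RCU deletion: no concurrent readers can follow ->next, so both
 * pointers may be poisoned, which is what this patch adds.
 */
static void hlist_nulls_del(struct hlist_nulls_node *n)
{
	__hlist_nulls_del(n);
	n->next = LIST_POISON1;
	n->pprev = LIST_POISON2;
}
```

A later erroneous use of the deleted node then dereferences the poison and faults, rather than silently unlinking through stale pointers.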

Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 include/linux/list_nulls.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/list_nulls.h b/include/linux/list_nulls.h
index fd154ceb5b0d..48f33ae16381 100644
--- a/include/linux/list_nulls.h
+++ b/include/linux/list_nulls.h
@@ -99,6 +99,7 @@ static inline void __hlist_nulls_del(struct hlist_nulls_node *n)
 static inline void hlist_nulls_del(struct hlist_nulls_node *n)
 {
 	__hlist_nulls_del(n);
+	n->next = LIST_POISON1;
 	n->pprev = LIST_POISON2;
 }
 
-- 
2.20.1



* [RFC PATCH 13/21] list: Add integrity checking to hlist_nulls implementation
  2020-03-24 15:36 [RFC PATCH 00/21] Improve list integrity checking Will Deacon
                   ` (11 preceding siblings ...)
  2020-03-24 15:36 ` [RFC PATCH 12/21] list: Poison ->next pointer for non-RCU deletion of 'hlist_nulls_node' Will Deacon
@ 2020-03-24 15:36 ` Will Deacon
  2020-03-24 15:36 ` [RFC PATCH 14/21] plist: Use CHECK_DATA_CORRUPTION instead of explicit {BUG,WARN}_ON() Will Deacon
                   ` (7 subsequent siblings)
  20 siblings, 0 replies; 58+ messages in thread
From: Will Deacon @ 2020-03-24 15:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: Will Deacon, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, kernel-hardening

Extend the 'hlist_nulls' implementation so that it can optionally
perform integrity checking in a similar fashion to the standard 'list'
code when CONFIG_CHECK_INTEGRITY_LIST=y.

On architectures without a trap value for ILLEGAL_POINTER_VALUE (i.e.
all 32-bit architectures), explicit pointer/poison checks can help to
mitigate UAF vulnerabilities such as the one exploited by "pingpongroot"
(CVE-2015-3636).

Cc: Kees Cook <keescook@chromium.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 include/linux/list_nulls.h | 23 +++++++++++++++++++
 lib/list_debug.c           | 47 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 70 insertions(+)

diff --git a/include/linux/list_nulls.h b/include/linux/list_nulls.h
index 48f33ae16381..379e097e92b0 100644
--- a/include/linux/list_nulls.h
+++ b/include/linux/list_nulls.h
@@ -35,6 +35,23 @@ struct hlist_nulls_node {
 	({ typeof(ptr) ____ptr = (ptr); \
 	   !is_a_nulls(____ptr) ? hlist_nulls_entry(____ptr, type, member) : NULL; \
 	})
+
+#ifdef CONFIG_CHECK_INTEGRITY_LIST
+extern bool __hlist_nulls_add_head_valid(struct hlist_nulls_node *new,
+					 struct hlist_nulls_head *head);
+extern bool __hlist_nulls_del_valid(struct hlist_nulls_node *node);
+#else
+static inline bool __hlist_nulls_add_head_valid(struct hlist_nulls_node *new,
+						struct hlist_nulls_head *head)
+{
+	return true;
+}
+static inline bool __hlist_nulls_del_valid(struct hlist_nulls_node *node)
+{
+	return true;
+}
+#endif
+
 /**
  * ptr_is_a_nulls - Test if a ptr is a nulls
  * @ptr: ptr to be tested
@@ -79,6 +96,9 @@ static inline void hlist_nulls_add_head(struct hlist_nulls_node *n,
 {
 	struct hlist_nulls_node *first = h->first;
 
+	if (!__hlist_nulls_add_head_valid(n, h))
+		return;
+
 	n->next = first;
 	n->pprev = &h->first;
 	h->first = n;
@@ -91,6 +111,9 @@ static inline void __hlist_nulls_del(struct hlist_nulls_node *n)
 	struct hlist_nulls_node *next = n->next;
 	struct hlist_nulls_node **pprev = n->pprev;
 
+	if (!__hlist_nulls_del_valid(n))
+		return;
+
 	WRITE_ONCE(*pprev, next);
 	if (!is_a_nulls(next))
 		WRITE_ONCE(next->pprev, pprev);
diff --git a/lib/list_debug.c b/lib/list_debug.c
index 03234ebd18c9..b3560de4accc 100644
--- a/lib/list_debug.c
+++ b/lib/list_debug.c
@@ -10,6 +10,7 @@
 #include <linux/bug.h>
 #include <linux/kernel.h>
 #include <linux/rculist.h>
+#include <linux/rculist_nulls.h>
 
 /*
  * Check that the data structures for the list manipulations are reasonably
@@ -139,3 +140,49 @@ bool __hlist_del_valid(struct hlist_node *node)
 	return true;
 }
 EXPORT_SYMBOL(__hlist_del_valid);
+
+bool __hlist_nulls_add_head_valid(struct hlist_nulls_node *new,
+				  struct hlist_nulls_head *head)
+{
+	struct hlist_nulls_node *first = head->first;
+
+	if (CHECK_DATA_CORRUPTION(!is_a_nulls(first) &&
+				  first->pprev != &head->first,
+			"hlist_nulls_add_head corruption: first->pprev should be &head->first (%px), but was %px (first=%px)",
+			&head->first, first->pprev, first) ||
+	    CHECK_DATA_CORRUPTION(new == first,
+			"hlist_nulls_add_head double add: new (%px) == first (%px)",
+			new, first))
+		return false;
+
+	return true;
+}
+EXPORT_SYMBOL(__hlist_nulls_add_head_valid);
+
+bool __hlist_nulls_del_valid(struct hlist_nulls_node *node)
+{
+	struct hlist_nulls_node *prev, *next = node->next;
+
+	if (CHECK_DATA_CORRUPTION(next == LIST_POISON1,
+			"hlist_nulls_del corruption: %px->next is LIST_POISON1 (%px)\n",
+			node, LIST_POISON1) ||
+	    CHECK_DATA_CORRUPTION(node->pprev == LIST_POISON2,
+			"hlist_nulls_del corruption: %px->pprev is LIST_POISON2 (%px)\n",
+			node, LIST_POISON2))
+		return false;
+
+	BUILD_BUG_ON(offsetof(struct hlist_nulls_node, next) !=
+		     offsetof(struct hlist_nulls_head, first));
+	prev = container_of(node->pprev, struct hlist_nulls_node, next);
+	if (CHECK_DATA_CORRUPTION(prev->next != node,
+			"hlist_nulls_del corruption: prev->next should be %px, but was %px\n",
+			node, prev->next) ||
+	    CHECK_DATA_CORRUPTION(!is_a_nulls(next) &&
+				  next->pprev != &node->next,
+			"hlist_nulls_del corruption: next->pprev should be %px, but was %px\n",
+			&node->next, next->pprev))
+		return false;
+
+	return true;
+}
+EXPORT_SYMBOL(__hlist_nulls_del_valid);
-- 
2.20.1



* [RFC PATCH 14/21] plist: Use CHECK_DATA_CORRUPTION instead of explicit {BUG,WARN}_ON()
  2020-03-24 15:36 [RFC PATCH 00/21] Improve list integrity checking Will Deacon
                   ` (12 preceding siblings ...)
  2020-03-24 15:36 ` [RFC PATCH 13/21] list: Add integrity checking to hlist_nulls implementation Will Deacon
@ 2020-03-24 15:36 ` Will Deacon
  2020-03-24 16:42   ` Greg KH
  2020-03-24 15:36 ` [RFC PATCH 15/21] list_bl: Use CHECK_DATA_CORRUPTION instead of custom BUG_ON() wrapper Will Deacon
                   ` (6 subsequent siblings)
  20 siblings, 1 reply; 58+ messages in thread
From: Will Deacon @ 2020-03-24 15:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: Will Deacon, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, kernel-hardening

CHECK_DATA_CORRUPTION() allows detected data corruption to result
consistently in either a BUG() or a WARN() depending on
CONFIG_BUG_ON_DATA_CORRUPTION.

Use CHECK_DATA_CORRUPTION() to report plist integrity checking failures,
rather than explicit use of BUG_ON() and WARN_ON().
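
As a rough userspace analogue of the semantics relied on here (GNU C statement expressions, as the kernel uses; this models only the report-and-return path, whereas the real macro BUG()s when CONFIG_BUG_ON_DATA_CORRUPTION=y):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Userspace sketch of CHECK_DATA_CORRUPTION(): evaluate the condition,
 * report it when it fires, and hand the result back so the caller can
 * bail out. Not the kernel's implementation; a simplified model of its
 * observable behaviour.
 */
#define CHECK_DATA_CORRUPTION(condition, fmt, ...)			\
	({								\
		bool corruption = !!(condition);			\
		if (corruption)						\
			fprintf(stderr, "data corruption: " fmt,	\
				##__VA_ARGS__);				\
		corruption;						\
	})
```

Because the macro yields the condition's value, call sites can chain several checks with || and return early on the first failure, which is the pattern used throughout this series.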

Cc: Kees Cook <keescook@chromium.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 lib/plist.c | 63 ++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 40 insertions(+), 23 deletions(-)

diff --git a/lib/plist.c b/lib/plist.c
index c06e98e78259..eb127eaab235 100644
--- a/lib/plist.c
+++ b/lib/plist.c
@@ -29,39 +29,46 @@
 
 static struct plist_head test_head;
 
-static void plist_check_prev_next(struct list_head *t, struct list_head *p,
-				  struct list_head *n)
+static bool plist_prev_next_invalid(struct list_head *t, struct list_head *p,
+				    struct list_head *n)
 {
-	WARN(n->prev != p || p->next != n,
-			"top: %p, n: %p, p: %p\n"
-			"prev: %p, n: %p, p: %p\n"
-			"next: %p, n: %p, p: %p\n",
+	return CHECK_DATA_CORRUPTION(n->prev != p || p->next != n,
+			"plist corruption:\n"
+			"\ttop: %p, n: %p, p: %p\n"
+			"\tprev: %p, n: %p, p: %p\n"
+			"\tnext: %p, n: %p, p: %p\n",
 			 t, t->next, t->prev,
 			p, p->next, p->prev,
 			n, n->next, n->prev);
 }
 
-static void plist_check_list(struct list_head *top)
+static bool plist_list_invalid(struct list_head *top)
 {
 	struct list_head *prev = top, *next = top->next;
+	bool corruption;
 
-	plist_check_prev_next(top, prev, next);
+	corruption = plist_prev_next_invalid(top, prev, next);
 	while (next != top) {
 		prev = next;
 		next = prev->next;
-		plist_check_prev_next(top, prev, next);
+		corruption |= plist_prev_next_invalid(top, prev, next);
 	}
+
+	return corruption;
 }
 
-static void plist_check_head(struct plist_head *head)
+static bool plist_head_valid(struct plist_head *head)
 {
+	bool corruption = false;
+
 	if (!plist_head_empty(head))
-		plist_check_list(&plist_first(head)->prio_list);
-	plist_check_list(&head->node_list);
+		corruption |= plist_list_invalid(&plist_first(head)->prio_list);
+	corruption |= plist_list_invalid(&head->node_list);
+	return !corruption;
 }
 
 #else
-# define plist_check_head(h)	do { } while (0)
+# define plist_head_valid(h)	(true)
 #endif
 
 /**
@@ -75,9 +82,12 @@ void plist_add(struct plist_node *node, struct plist_head *head)
 	struct plist_node *first, *iter, *prev = NULL;
 	struct list_head *node_next = &head->node_list;
 
-	plist_check_head(head);
-	WARN_ON(!plist_node_empty(node));
-	WARN_ON(!list_empty(&node->prio_list));
+	if (!plist_head_valid(head) ||
+	    CHECK_DATA_CORRUPTION(!plist_node_empty(node),
+			"plist_add corruption. node list is not empty.\n") ||
+	    CHECK_DATA_CORRUPTION(!list_empty(&node->prio_list),
+			"plist_add corruption. node prio list is not empty.\n"))
+		return;
 
 	if (plist_head_empty(head))
 		goto ins_node;
@@ -100,7 +110,8 @@ void plist_add(struct plist_node *node, struct plist_head *head)
 ins_node:
 	list_add_tail(&node->node_list, node_next);
 
-	plist_check_head(head);
+	if (!plist_head_valid(head))
+		return;
 }
 
 /**
@@ -111,7 +122,8 @@ void plist_add(struct plist_node *node, struct plist_head *head)
  */
 void plist_del(struct plist_node *node, struct plist_head *head)
 {
-	plist_check_head(head);
+	if (!plist_head_valid(head))
+		return;
 
 	if (!list_empty(&node->prio_list)) {
 		if (node->node_list.next != &head->node_list) {
@@ -129,7 +141,8 @@ void plist_del(struct plist_node *node, struct plist_head *head)
 
 	list_del_init(&node->node_list);
 
-	plist_check_head(head);
+	if (!plist_head_valid(head))
+		return;
 }
 
 /**
@@ -147,9 +160,12 @@ void plist_requeue(struct plist_node *node, struct plist_head *head)
 	struct plist_node *iter;
 	struct list_head *node_next = &head->node_list;
 
-	plist_check_head(head);
-	BUG_ON(plist_head_empty(head));
-	BUG_ON(plist_node_empty(node));
+	if (!plist_head_valid(head) ||
+	    CHECK_DATA_CORRUPTION(plist_head_empty(head),
+			"plist_requeue corruption. head list is empty.\n") ||
+	    CHECK_DATA_CORRUPTION(plist_node_empty(node),
+			"plist_requeue corruption. node list is empty.\n"))
+		return;
 
 	if (node == plist_last(head))
 		return;
@@ -169,7 +185,8 @@ void plist_requeue(struct plist_node *node, struct plist_head *head)
 	}
 	list_add_tail(&node->node_list, node_next);
 
-	plist_check_head(head);
+	if (!plist_head_valid(head))
+		return;
 }
 
 #ifdef CONFIG_CHECK_INTEGRITY_PLIST
-- 
2.20.1



* [RFC PATCH 15/21] list_bl: Use CHECK_DATA_CORRUPTION instead of custom BUG_ON() wrapper
  2020-03-24 15:36 [RFC PATCH 00/21] Improve list integrity checking Will Deacon
                   ` (13 preceding siblings ...)
  2020-03-24 15:36 ` [RFC PATCH 14/21] plist: Use CHECK_DATA_CORRUPTION instead of explicit {BUG,WARN}_ON() Will Deacon
@ 2020-03-24 15:36 ` Will Deacon
  2020-03-24 15:36 ` [RFC PATCH 16/21] list_bl: Extend integrity checking in deletion routines Will Deacon
                   ` (5 subsequent siblings)
  20 siblings, 0 replies; 58+ messages in thread
From: Will Deacon @ 2020-03-24 15:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: Will Deacon, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, kernel-hardening

CHECK_DATA_CORRUPTION() allows detected data corruption to result
consistently in either a BUG() or a WARN() depending on
CONFIG_BUG_ON_DATA_CORRUPTION.

Use CHECK_DATA_CORRUPTION() to report list_bl integrity checking failures,
rather than a custom wrapper around BUG_ON().

Cc: Kees Cook <keescook@chromium.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 include/linux/list_bl.h    | 55 +++++++++++++++++++++++++-------------
 include/linux/rculist_bl.h | 17 ++++--------
 2 files changed, 42 insertions(+), 30 deletions(-)

diff --git a/include/linux/list_bl.h b/include/linux/list_bl.h
index 9f8e29142324..f48d8acb15b4 100644
--- a/include/linux/list_bl.h
+++ b/include/linux/list_bl.h
@@ -24,13 +24,6 @@
 #define LIST_BL_LOCKMASK	0UL
 #endif
 
-#ifdef CONFIG_CHECK_INTEGRITY_LIST
-#define LIST_BL_BUG_ON(x) BUG_ON(x)
-#else
-#define LIST_BL_BUG_ON(x)
-#endif
-
-
 struct hlist_bl_head {
 	struct hlist_bl_node *first;
 };
@@ -38,6 +31,37 @@ struct hlist_bl_head {
 struct hlist_bl_node {
 	struct hlist_bl_node *next, **pprev;
 };
+
+#ifdef CONFIG_CHECK_INTEGRITY_LIST
+static inline bool __hlist_bl_add_head_valid(struct hlist_bl_head *h,
+					     struct hlist_bl_node *n)
+{
+	unsigned long hlock = (unsigned long)h->first & LIST_BL_LOCKMASK;
+	unsigned long nlock = (unsigned long)n & LIST_BL_LOCKMASK;
+
+	return !(CHECK_DATA_CORRUPTION(nlock,
+			"hlist_bl_add_head: node is locked\n") ||
+		 CHECK_DATA_CORRUPTION(hlock != LIST_BL_LOCKMASK,
+			"hlist_bl_add_head: head is unlocked\n"));
+}
+
+static inline bool __hlist_bl_del_valid(struct hlist_bl_node *n)
+{
+	unsigned long nlock = (unsigned long)n & LIST_BL_LOCKMASK;
+	return !CHECK_DATA_CORRUPTION(nlock, "hlist_bl_del_valid: node locked");
+}
+#else
+static inline bool __hlist_bl_add_head_valid(struct hlist_bl_head *h,
+					     struct hlist_bl_node *n)
+{
+	return true;
+}
+static inline bool __hlist_bl_del_valid(struct hlist_bl_node *n)
+{
+	return true;
+}
+#endif
+
 #define INIT_HLIST_BL_HEAD(ptr) \
 	((ptr)->first = NULL)
 
@@ -60,15 +84,6 @@ static inline struct hlist_bl_node *hlist_bl_first(struct hlist_bl_head *h)
 		((unsigned long)h->first & ~LIST_BL_LOCKMASK);
 }
 
-static inline void hlist_bl_set_first(struct hlist_bl_head *h,
-					struct hlist_bl_node *n)
-{
-	LIST_BL_BUG_ON((unsigned long)n & LIST_BL_LOCKMASK);
-	LIST_BL_BUG_ON(((unsigned long)h->first & LIST_BL_LOCKMASK) !=
-							LIST_BL_LOCKMASK);
-	h->first = (struct hlist_bl_node *)((unsigned long)n | LIST_BL_LOCKMASK);
-}
-
 static inline bool hlist_bl_empty(const struct hlist_bl_head *h)
 {
 	unsigned long first = data_race((unsigned long)READ_ONCE(h->first));
@@ -80,11 +95,14 @@ static inline void hlist_bl_add_head(struct hlist_bl_node *n,
 {
 	struct hlist_bl_node *first = hlist_bl_first(h);
 
+	if (!__hlist_bl_add_head_valid(h, n))
+		return;
+
 	n->next = first;
 	if (first)
 		first->pprev = &n->next;
 	n->pprev = &h->first;
-	hlist_bl_set_first(h, n);
+	h->first = (struct hlist_bl_node *)((unsigned long)n | LIST_BL_LOCKMASK);
 }
 
 static inline void hlist_bl_add_before(struct hlist_bl_node *n,
@@ -118,7 +136,8 @@ static inline void __hlist_bl_del(struct hlist_bl_node *n)
 	struct hlist_bl_node *next = n->next;
 	struct hlist_bl_node **pprev = n->pprev;
 
-	LIST_BL_BUG_ON((unsigned long)n & LIST_BL_LOCKMASK);
+	if (!__hlist_bl_del_valid(n))
+		return;
 
 	/* pprev may be `first`, so be careful not to lose the lock bit */
 	WRITE_ONCE(*pprev,
diff --git a/include/linux/rculist_bl.h b/include/linux/rculist_bl.h
index 0b952d06eb0b..553ce3cde104 100644
--- a/include/linux/rculist_bl.h
+++ b/include/linux/rculist_bl.h
@@ -8,16 +8,6 @@
 #include <linux/list_bl.h>
 #include <linux/rcupdate.h>
 
-static inline void hlist_bl_set_first_rcu(struct hlist_bl_head *h,
-					struct hlist_bl_node *n)
-{
-	LIST_BL_BUG_ON((unsigned long)n & LIST_BL_LOCKMASK);
-	LIST_BL_BUG_ON(((unsigned long)h->first & LIST_BL_LOCKMASK) !=
-							LIST_BL_LOCKMASK);
-	rcu_assign_pointer(h->first,
-		(struct hlist_bl_node *)((unsigned long)n | LIST_BL_LOCKMASK));
-}
-
 static inline struct hlist_bl_node *hlist_bl_first_rcu(struct hlist_bl_head *h)
 {
 	return (struct hlist_bl_node *)
@@ -73,6 +63,9 @@ static inline void hlist_bl_add_head_rcu(struct hlist_bl_node *n,
 {
 	struct hlist_bl_node *first;
 
+	if (!__hlist_bl_add_head_valid(h, n))
+		return;
+
 	/* don't need hlist_bl_first_rcu because we're under lock */
 	first = hlist_bl_first(h);
 
@@ -81,8 +74,8 @@ static inline void hlist_bl_add_head_rcu(struct hlist_bl_node *n,
 		first->pprev = &n->next;
 	n->pprev = &h->first;
 
-	/* need _rcu because we can have concurrent lock free readers */
-	hlist_bl_set_first_rcu(h, n);
+	rcu_assign_pointer(h->first,
+		(struct hlist_bl_node *)((unsigned long)n | LIST_BL_LOCKMASK));
 }
 /**
  * hlist_bl_for_each_entry_rcu - iterate over rcu list of given type
-- 
2.20.1



* [RFC PATCH 16/21] list_bl: Extend integrity checking in deletion routines
  2020-03-24 15:36 [RFC PATCH 00/21] Improve list integrity checking Will Deacon
                   ` (14 preceding siblings ...)
  2020-03-24 15:36 ` [RFC PATCH 15/21] list_bl: Use CHECK_DATA_CORRUPTION instead of custom BUG_ON() wrapper Will Deacon
@ 2020-03-24 15:36 ` Will Deacon
  2020-03-24 15:36 ` [RFC PATCH 17/21] linux/bit_spinlock.h: Include linux/processor.h Will Deacon
                   ` (4 subsequent siblings)
  20 siblings, 0 replies; 58+ messages in thread
From: Will Deacon @ 2020-03-24 15:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: Will Deacon, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, kernel-hardening

Although deleting an entry from an 'hlist_bl' optionally checks that the
node being removed is unlocked before removing it and poisoning its
pointers, we don't actually check for the poison values themselves, as
we do for other list implementations.

Add poison checks to __hlist_bl_del_valid() so that we can catch list
corruption without relying on a later fault.
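
A minimal userspace sketch of the intended check, with simplified types and hypothetical poison constants (the real helper additionally reports via CHECK_DATA_CORRUPTION()):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Simplified model of an 'hlist_bl' node; LIST_BL_LOCKMASK is the low
 * pointer bit used as the per-bucket lock, and the poison values are
 * hypothetical stand-ins for LIST_POISON1/2.
 */
#define LIST_BL_LOCKMASK	1UL
#define LIST_POISON1		((struct hlist_bl_node *)0x100)
#define LIST_POISON2		((struct hlist_bl_node **)0x122)

struct hlist_bl_node {
	struct hlist_bl_node *next, **pprev;
};

/*
 * Sketch of the extended __hlist_bl_del_valid(): reject a node whose
 * pointers already carry the deletion poison (e.g. a double delete)
 * instead of waiting for a fault when the poison is dereferenced.
 */
static bool hlist_bl_del_valid(struct hlist_bl_node *n)
{
	if ((unsigned long)n & LIST_BL_LOCKMASK)
		return false;	/* node address has the lock bit set */
	if (n->next == LIST_POISON1 || n->pprev == LIST_POISON2)
		return false;	/* already deleted and poisoned */
	return true;
}
```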

Cc: Kees Cook <keescook@chromium.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 include/linux/list_bl.h | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/include/linux/list_bl.h b/include/linux/list_bl.h
index f48d8acb15b4..0839c4f43e6d 100644
--- a/include/linux/list_bl.h
+++ b/include/linux/list_bl.h
@@ -48,7 +48,15 @@ static inline bool __hlist_bl_add_head_valid(struct hlist_bl_head *h,
 static inline bool __hlist_bl_del_valid(struct hlist_bl_node *n)
 {
 	unsigned long nlock = (unsigned long)n & LIST_BL_LOCKMASK;
-	return !CHECK_DATA_CORRUPTION(nlock, "hlist_bl_del_valid: node locked");
+
+	return !(CHECK_DATA_CORRUPTION(nlock,
+			"hlist_bl_del_valid: node locked") ||
+		 CHECK_DATA_CORRUPTION(n->next == LIST_POISON1,
+			"hlist_bl_del corruption, %px->next is LIST_POISON1 (%px)\n",
+			n, LIST_POISON1) ||
+		 CHECK_DATA_CORRUPTION(n->pprev == LIST_POISON2,
+			"hlist_bl_del corruption, %px->pprev is LIST_POISON2 (%px)\n",
+			n, LIST_POISON2));
 }
 #else
 static inline bool __hlist_bl_add_head_valid(struct hlist_bl_head *h,
-- 
2.20.1



* [RFC PATCH 17/21] linux/bit_spinlock.h: Include linux/processor.h
  2020-03-24 15:36 [RFC PATCH 00/21] Improve list integrity checking Will Deacon
                   ` (15 preceding siblings ...)
  2020-03-24 15:36 ` [RFC PATCH 16/21] list_bl: Extend integrity checking in deletion routines Will Deacon
@ 2020-03-24 15:36 ` Will Deacon
  2020-03-24 16:28   ` Greg KH
  2020-03-24 15:36 ` [RFC PATCH 18/21] list_bl: Move integrity checking out of line Will Deacon
                   ` (3 subsequent siblings)
  20 siblings, 1 reply; 58+ messages in thread
From: Will Deacon @ 2020-03-24 15:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: Will Deacon, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, kernel-hardening

Needed for cpu_relax().

Signed-off-by: Will Deacon <will@kernel.org>
---
 include/linux/bit_spinlock.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/bit_spinlock.h b/include/linux/bit_spinlock.h
index bbc4730a6505..505daa942527 100644
--- a/include/linux/bit_spinlock.h
+++ b/include/linux/bit_spinlock.h
@@ -6,6 +6,7 @@
 #include <linux/preempt.h>
 #include <linux/atomic.h>
 #include <linux/bug.h>
+#include <linux/processor.h>
 
 /*
  *  bit-based spin_lock()
-- 
2.20.1



* [RFC PATCH 18/21] list_bl: Move integrity checking out of line
  2020-03-24 15:36 [RFC PATCH 00/21] Improve list integrity checking Will Deacon
                   ` (16 preceding siblings ...)
  2020-03-24 15:36 ` [RFC PATCH 17/21] linux/bit_spinlock.h: Include linux/processor.h Will Deacon
@ 2020-03-24 15:36 ` Will Deacon
  2020-03-24 15:36 ` [RFC PATCH 19/21] list_bl: Extend integrity checking to cover the same cases as 'hlist' Will Deacon
                   ` (2 subsequent siblings)
  20 siblings, 0 replies; 58+ messages in thread
From: Will Deacon @ 2020-03-24 15:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: Will Deacon, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, kernel-hardening

In preparation for adding full integrity checking to 'hlist_bl', move
the checks out of line before they become more bloated. At the same time,
swap the argument order for __hlist_bl_add_head_valid() so that it
follows the same convention as __hlist_add_head_valid().

Signed-off-by: Will Deacon <will@kernel.org>
---
 include/linux/list_bl.h    | 34 ++++++----------------------------
 include/linux/rculist_bl.h |  2 +-
 lib/list_debug.c           | 28 ++++++++++++++++++++++++++++
 3 files changed, 35 insertions(+), 29 deletions(-)

diff --git a/include/linux/list_bl.h b/include/linux/list_bl.h
index 0839c4f43e6d..dd74043ebae6 100644
--- a/include/linux/list_bl.h
+++ b/include/linux/list_bl.h
@@ -33,34 +33,12 @@ struct hlist_bl_node {
 };
 
 #ifdef CONFIG_CHECK_INTEGRITY_LIST
-static inline bool __hlist_bl_add_head_valid(struct hlist_bl_head *h,
-					     struct hlist_bl_node *n)
-{
-	unsigned long hlock = (unsigned long)h->first & LIST_BL_LOCKMASK;
-	unsigned long nlock = (unsigned long)n & LIST_BL_LOCKMASK;
-
-	return !(CHECK_DATA_CORRUPTION(nlock,
-			"hlist_bl_add_head: node is locked\n") ||
-		 CHECK_DATA_CORRUPTION(hlock != LIST_BL_LOCKMASK,
-			"hlist_bl_add_head: head is unlocked\n"));
-}
-
-static inline bool __hlist_bl_del_valid(struct hlist_bl_node *n)
-{
-	unsigned long nlock = (unsigned long)n & LIST_BL_LOCKMASK;
-
-	return !(CHECK_DATA_CORRUPTION(nlock,
-			"hlist_bl_del_valid: node locked") ||
-		 CHECK_DATA_CORRUPTION(n->next == LIST_POISON1,
-			"hlist_bl_del corruption, %px->next is LIST_POISON1 (%px)\n",
-			n, LIST_POISON1) ||
-		 CHECK_DATA_CORRUPTION(n->pprev == LIST_POISON2,
-			"hlist_bl_del corruption, %px->pprev is LIST_POISON2 (%px)\n",
-			n, LIST_POISON2));
-}
+extern bool __hlist_bl_add_head_valid(struct hlist_bl_node *n,
+				      struct hlist_bl_head *h);
+extern bool __hlist_bl_del_valid(struct hlist_bl_node *node);
 #else
-static inline bool __hlist_bl_add_head_valid(struct hlist_bl_head *h,
-					     struct hlist_bl_node *n)
+static inline bool __hlist_bl_add_head_valid(struct hlist_bl_node *n,
+					     struct hlist_bl_head *h)
 {
 	return true;
 }
@@ -103,7 +81,7 @@ static inline void hlist_bl_add_head(struct hlist_bl_node *n,
 {
 	struct hlist_bl_node *first = hlist_bl_first(h);
 
-	if (!__hlist_bl_add_head_valid(h, n))
+	if (!__hlist_bl_add_head_valid(n, h))
 		return;
 
 	n->next = first;
diff --git a/include/linux/rculist_bl.h b/include/linux/rculist_bl.h
index 553ce3cde104..9356e7283ff0 100644
--- a/include/linux/rculist_bl.h
+++ b/include/linux/rculist_bl.h
@@ -63,7 +63,7 @@ static inline void hlist_bl_add_head_rcu(struct hlist_bl_node *n,
 {
 	struct hlist_bl_node *first;
 
-	if (!__hlist_bl_add_head_valid(h, n))
+	if (!__hlist_bl_add_head_valid(n, h))
 		return;
 
 	/* don't need hlist_bl_first_rcu because we're under lock */
diff --git a/lib/list_debug.c b/lib/list_debug.c
index b3560de4accc..9591fa6c9337 100644
--- a/lib/list_debug.c
+++ b/lib/list_debug.c
@@ -186,3 +186,31 @@ bool __hlist_nulls_del_valid(struct hlist_nulls_node *node)
 	return true;
 }
 EXPORT_SYMBOL(__hlist_nulls_del_valid);
+
+bool __hlist_bl_add_head_valid(struct hlist_bl_node *new,
+			       struct hlist_bl_head *head)
+{
+	unsigned long hlock = (unsigned long)head->first & LIST_BL_LOCKMASK;
+	unsigned long nlock = (unsigned long)new & LIST_BL_LOCKMASK;
+
+	return !(CHECK_DATA_CORRUPTION(nlock,
+			"hlist_bl_add_head: node is locked\n") ||
+		 CHECK_DATA_CORRUPTION(hlock != LIST_BL_LOCKMASK,
+			"hlist_bl_add_head: head is unlocked\n"));
+}
+EXPORT_SYMBOL(__hlist_bl_add_head_valid);
+
+bool __hlist_bl_del_valid(struct hlist_bl_node *node)
+{
+	unsigned long nlock = (unsigned long)node & LIST_BL_LOCKMASK;
+
+	return !(CHECK_DATA_CORRUPTION(nlock,
+			"hlist_bl_del_valid: node locked") ||
+		 CHECK_DATA_CORRUPTION(node->next == LIST_POISON1,
+			"hlist_bl_del corruption, %px->next is LIST_POISON1 (%px)\n",
+			node, LIST_POISON1) ||
+		 CHECK_DATA_CORRUPTION(node->pprev == LIST_POISON2,
+			"hlist_bl_del corruption, %px->pprev is LIST_POISON2 (%px)\n",
+			node, LIST_POISON2));
+}
+EXPORT_SYMBOL(__hlist_bl_del_valid);
-- 
2.20.1



* [RFC PATCH 19/21] list_bl: Extend integrity checking to cover the same cases as 'hlist'
  2020-03-24 15:36 [RFC PATCH 00/21] Improve list integrity checking Will Deacon
                   ` (17 preceding siblings ...)
  2020-03-24 15:36 ` [RFC PATCH 18/21] list_bl: Move integrity checking out of line Will Deacon
@ 2020-03-24 15:36 ` Will Deacon
  2020-03-24 15:36 ` [RFC PATCH 20/21] list: Format CHECK_DATA_CORRUPTION error messages consistently Will Deacon
  2020-03-24 15:36 ` [RFC PATCH 21/21] lkdtm: Extend list corruption checks Will Deacon
  20 siblings, 0 replies; 58+ messages in thread
From: Will Deacon @ 2020-03-24 15:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: Will Deacon, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, kernel-hardening

The list integrity checks for 'hlist_bl' are missing a number of cases
that are covered by other list implementations (e.g. 'hlist'), such as
validating 'next' and 'pprev' pointers when adding and deleting nodes.

Extend the list_bl integrity checks to bring them up to the same level
as for other list implementations.
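
The 'hlist'-style linkage invariants being ported over can be sketched in userspace C. Here *node->pprev stands in for the kernel's container_of() on pprev followed by prev->next, and the lock-bit masking that the real list_bl version performs is omitted:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace hlist node, enough to express the invariants. */
struct hlist_node {
	struct hlist_node *next, **pprev;
};

/*
 * The two linkage invariants this patch brings to 'list_bl': whatever
 * ->next (or head->first) slot node->pprev points at must point back at
 * node, and node's successor must point back at &node->next.
 */
static bool hlist_del_linkage_ok(struct hlist_node *node)
{
	if (*node->pprev != node)
		return false;	/* prev->next no longer points at node */
	if (node->next && node->next->pprev != &node->next)
		return false;	/* next->pprev no longer points back */
	return true;
}
```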

Cc: Kees Cook <keescook@chromium.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 lib/list_debug.c | 48 ++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 40 insertions(+), 8 deletions(-)

diff --git a/lib/list_debug.c b/lib/list_debug.c
index 9591fa6c9337..3be50b5c8014 100644
--- a/lib/list_debug.c
+++ b/lib/list_debug.c
@@ -7,6 +7,7 @@
 
 #include <linux/export.h>
 #include <linux/list.h>
+#include <linux/list_bl.h>
 #include <linux/bug.h>
 #include <linux/kernel.h>
 #include <linux/rculist.h>
@@ -190,27 +191,58 @@ EXPORT_SYMBOL(__hlist_nulls_del_valid);
 bool __hlist_bl_add_head_valid(struct hlist_bl_node *new,
 			       struct hlist_bl_head *head)
 {
+	struct hlist_bl_node *first = hlist_bl_first(head);
 	unsigned long hlock = (unsigned long)head->first & LIST_BL_LOCKMASK;
 	unsigned long nlock = (unsigned long)new & LIST_BL_LOCKMASK;
 
-	return !(CHECK_DATA_CORRUPTION(nlock,
+	if (CHECK_DATA_CORRUPTION(nlock,
 			"hlist_bl_add_head: node is locked\n") ||
-		 CHECK_DATA_CORRUPTION(hlock != LIST_BL_LOCKMASK,
-			"hlist_bl_add_head: head is unlocked\n"));
+	    CHECK_DATA_CORRUPTION(hlock != LIST_BL_LOCKMASK,
+			"hlist_bl_add_head: head is unlocked\n"))
+		return false;
+
+	if (CHECK_DATA_CORRUPTION(first && first->pprev != &head->first,
+			"hlist_bl_add_head corruption: first->pprev should be &head->first (%px), but was %px (first=%px)",
+			&head->first, first->pprev, first) ||
+	    CHECK_DATA_CORRUPTION(new == first,
+			"hlist_bl_add_head double add: new (%px) == first (%px)",
+			new, first))
+		return false;
+
+	return true;
 }
 EXPORT_SYMBOL(__hlist_bl_add_head_valid);
 
 bool __hlist_bl_del_valid(struct hlist_bl_node *node)
 {
+	struct hlist_bl_node *prev, *next = node->next;
 	unsigned long nlock = (unsigned long)node & LIST_BL_LOCKMASK;
+	unsigned long pnext;
 
-	return !(CHECK_DATA_CORRUPTION(nlock,
-			"hlist_bl_del_valid: node locked") ||
-		 CHECK_DATA_CORRUPTION(node->next == LIST_POISON1,
+	if (CHECK_DATA_CORRUPTION(nlock,
+			"hlist_bl_del corruption: node is locked") ||
+	    CHECK_DATA_CORRUPTION(next == LIST_POISON1,
 			"hlist_bl_del corruption, %px->next is LIST_POISON1 (%px)\n",
 			node, LIST_POISON1) ||
-		 CHECK_DATA_CORRUPTION(node->pprev == LIST_POISON2,
+	    CHECK_DATA_CORRUPTION(node->pprev == LIST_POISON2,
 			"hlist_bl_del corruption, %px->pprev is LIST_POISON2 (%px)\n",
-			node, LIST_POISON2));
+			node, LIST_POISON2))
+		return false;
+
+	BUILD_BUG_ON(offsetof(struct hlist_bl_node, next) !=
+		     offsetof(struct hlist_bl_head, first));
+	prev = container_of(node->pprev, struct hlist_bl_node, next);
+	pnext = (unsigned long)prev->next & ~LIST_BL_LOCKMASK;
+	if (CHECK_DATA_CORRUPTION((unsigned long)next & LIST_BL_LOCKMASK,
+			"hlist_bl_del corruption: node->next is locked\n") ||
+	    CHECK_DATA_CORRUPTION((struct hlist_bl_node *)pnext != node,
+			"hlist_bl_del corruption: prev->next should be %px, but was %lx\n",
+			node, pnext) ||
+	    CHECK_DATA_CORRUPTION(next && next->pprev != &node->next,
+			"hlist_bl_del corruption: next->pprev should be %px, but was %px\n",
+			&node->next, next->pprev))
+		return false;
+
+	return true;
 }
 EXPORT_SYMBOL(__hlist_bl_del_valid);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 58+ messages in thread
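The pprev/next consistency checks that this patch adds to __hlist_bl_del_valid() can be sketched in plain userspace C. Note the names below are hypothetical and the hlist_bl lock bit packed into the head pointer is omitted; `*n->pprev` stands in for the patch's `prev->next`, which the kernel code recovers via container_of() under the BUILD_BUG_ON() layout assumption:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Toy userspace model of an hlist-style node. */
struct node {
	struct node *next;
	struct node **pprev;
};

static void add_head(struct node *n, struct node **first)
{
	n->next = *first;
	n->pprev = first;
	if (*first)
		(*first)->pprev = &n->next;
	*first = n;
}

/*
 * Rough equivalent of the checks added to __hlist_bl_del_valid():
 * before unlinking, the node must still be reachable from its
 * predecessor, and its successor (if any) must point back at it.
 */
static bool del_valid(struct node *n)
{
	if (*n->pprev != n) {
		fprintf(stderr, "del corruption: prev->next should be %p, but was %p\n",
			(void *)n, (void *)*n->pprev);
		return false;
	}
	if (n->next && n->next->pprev != &n->next) {
		fprintf(stderr, "del corruption: next->pprev should be %p, but was %p\n",
			(void *)&n->next, (void *)n->next->pprev);
		return false;
	}
	return true;
}
```

A use-after-free or stray write that redirects either pointer makes the corresponding check fail before the unlink dereferences the attacker-controlled value.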

* [RFC PATCH 20/21] list: Format CHECK_DATA_CORRUPTION error messages consistently
  2020-03-24 15:36 [RFC PATCH 00/21] Improve list integrity checking Will Deacon
                   ` (18 preceding siblings ...)
  2020-03-24 15:36 ` [RFC PATCH 19/21] list_bl: Extend integrity checking to cover the same cases as 'hlist' Will Deacon
@ 2020-03-24 15:36 ` Will Deacon
  2020-03-24 16:40   ` Greg KH
  2020-03-24 15:36 ` [RFC PATCH 21/21] lkdtm: Extend list corruption checks Will Deacon
  20 siblings, 1 reply; 58+ messages in thread
From: Will Deacon @ 2020-03-24 15:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: Will Deacon, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, kernel-hardening

The error strings printed when list data corruption is detected are
formatted inconsistently.

Satisfy my inner pedant by consistently using ':' to delimit the message
from its prefix and by dropping the terminating full stops where they exist.

Signed-off-by: Will Deacon <will@kernel.org>
---
 lib/list_debug.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/lib/list_debug.c b/lib/list_debug.c
index 3be50b5c8014..00e414508f93 100644
--- a/lib/list_debug.c
+++ b/lib/list_debug.c
@@ -23,10 +23,10 @@ bool __list_add_valid(struct list_head *new, struct list_head *prev,
 		      struct list_head *next)
 {
 	if (CHECK_DATA_CORRUPTION(next->prev != prev,
-			"list_add corruption. next->prev should be prev (%px), but was %px. (next=%px).\n",
+			"list_add corruption: next->prev should be prev (%px), but was %px (next=%px)\n",
 			prev, next->prev, next) ||
 	    CHECK_DATA_CORRUPTION(prev->next != next,
-			"list_add corruption. prev->next should be next (%px), but was %px. (prev=%px).\n",
+			"list_add corruption: prev->next should be next (%px), but was %px (prev=%px)\n",
 			next, prev->next, prev) ||
 	    CHECK_DATA_CORRUPTION(new == prev || new == next,
 			"list_add double add: new=%px, prev=%px, next=%px.\n",
@@ -45,16 +45,16 @@ bool __list_del_entry_valid(struct list_head *entry)
 	next = entry->next;
 
 	if (CHECK_DATA_CORRUPTION(next == LIST_POISON1,
-			"list_del corruption, %px->next is LIST_POISON1 (%px)\n",
+			"list_del corruption: %px->next is LIST_POISON1 (%px)\n",
 			entry, LIST_POISON1) ||
 	    CHECK_DATA_CORRUPTION(prev == LIST_POISON2,
-			"list_del corruption, %px->prev is LIST_POISON2 (%px)\n",
+			"list_del corruption: %px->prev is LIST_POISON2 (%px)\n",
 			entry, LIST_POISON2) ||
 	    CHECK_DATA_CORRUPTION(prev->next != entry,
-			"list_del corruption. prev->next should be %px, but was %px\n",
+			"list_del corruption: prev->next should be %px, but was %px\n",
 			entry, prev->next) ||
 	    CHECK_DATA_CORRUPTION(next->prev != entry,
-			"list_del corruption. next->prev should be %px, but was %px\n",
+			"list_del corruption: next->prev should be %px, but was %px\n",
 			entry, next->prev))
 		return false;
 
@@ -196,7 +196,7 @@ bool __hlist_bl_add_head_valid(struct hlist_bl_node *new,
 	unsigned long nlock = (unsigned long)new & LIST_BL_LOCKMASK;
 
 	if (CHECK_DATA_CORRUPTION(nlock,
-			"hlist_bl_add_head: node is locked\n") ||
+			"hlist_bl_add_head corruption: node is locked\n") ||
 	    CHECK_DATA_CORRUPTION(hlock != LIST_BL_LOCKMASK,
 			"hlist_bl_add_head: head is unlocked\n"))
 		return false;
@@ -222,10 +222,10 @@ bool __hlist_bl_del_valid(struct hlist_bl_node *node)
 	if (CHECK_DATA_CORRUPTION(nlock,
 			"hlist_bl_del corruption: node is locked") ||
 	    CHECK_DATA_CORRUPTION(next == LIST_POISON1,
-			"hlist_bl_del corruption, %px->next is LIST_POISON1 (%px)\n",
+			"hlist_bl_del corruption: %px->next is LIST_POISON1 (%px)\n",
 			node, LIST_POISON1) ||
 	    CHECK_DATA_CORRUPTION(node->pprev == LIST_POISON2,
-			"hlist_bl_del corruption, %px->pprev is LIST_POISON2 (%px)\n",
+			"hlist_bl_del corruption: %px->pprev is LIST_POISON2 (%px)\n",
 			node, LIST_POISON2))
 		return false;
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [RFC PATCH 21/21] lkdtm: Extend list corruption checks
  2020-03-24 15:36 [RFC PATCH 00/21] Improve list integrity checking Will Deacon
                   ` (19 preceding siblings ...)
  2020-03-24 15:36 ` [RFC PATCH 20/21] list: Format CHECK_DATA_CORRUPTION error messages consistently Will Deacon
@ 2020-03-24 15:36 ` Will Deacon
  20 siblings, 0 replies; 58+ messages in thread
From: Will Deacon @ 2020-03-24 15:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: Will Deacon, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, kernel-hardening

Although lkdtm has a couple of simple tests for triggering list
corruption, there are plenty of cases it doesn't cover, particularly now
that we have forms of integrity checking for the other list types.

Extend lkdtm to check a variety of list insertion and deletion routines.

Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 drivers/misc/lkdtm/Makefile             |   1 +
 drivers/misc/lkdtm/bugs.c               |  68 ----
 drivers/misc/lkdtm/core.c               |  31 +-
 drivers/misc/lkdtm/list.c               | 489 ++++++++++++++++++++++++
 drivers/misc/lkdtm/lkdtm.h              |  33 +-
 tools/testing/selftests/lkdtm/tests.txt |  31 +-
 6 files changed, 579 insertions(+), 74 deletions(-)
 create mode 100644 drivers/misc/lkdtm/list.c

diff --git a/drivers/misc/lkdtm/Makefile b/drivers/misc/lkdtm/Makefile
index c70b3822013f..833c6f7c78a3 100644
--- a/drivers/misc/lkdtm/Makefile
+++ b/drivers/misc/lkdtm/Makefile
@@ -10,6 +10,7 @@ lkdtm-$(CONFIG_LKDTM)		+= rodata_objcopy.o
 lkdtm-$(CONFIG_LKDTM)		+= usercopy.o
 lkdtm-$(CONFIG_LKDTM)		+= stackleak.o
 lkdtm-$(CONFIG_LKDTM)		+= cfi.o
+lkdtm-$(CONFIG_LKDTM)		+= list.o
 
 KASAN_SANITIZE_stackleak.o	:= n
 KCOV_INSTRUMENT_rodata.o	:= n
diff --git a/drivers/misc/lkdtm/bugs.c b/drivers/misc/lkdtm/bugs.c
index de87693cf557..de0c54c8fac3 100644
--- a/drivers/misc/lkdtm/bugs.c
+++ b/drivers/misc/lkdtm/bugs.c
@@ -6,7 +6,6 @@
  * test source files.
  */
 #include "lkdtm.h"
-#include <linux/list.h>
 #include <linux/sched.h>
 #include <linux/sched/signal.h>
 #include <linux/sched/task_stack.h>
@@ -16,10 +15,6 @@
 #include <asm/desc.h>
 #endif
 
-struct lkdtm_list {
-	struct list_head node;
-};
-
 /*
  * Make sure our attempts to over run the kernel stack doesn't trigger
  * a compiler warning when CONFIG_FRAME_WARN is set. Then make sure we
@@ -175,69 +170,6 @@ void lkdtm_HUNG_TASK(void)
 	schedule();
 }
 
-void lkdtm_CORRUPT_LIST_ADD(void)
-{
-	/*
-	 * Initially, an empty list via LIST_HEAD:
-	 *	test_head.next = &test_head
-	 *	test_head.prev = &test_head
-	 */
-	LIST_HEAD(test_head);
-	struct lkdtm_list good, bad;
-	void *target[2] = { };
-	void *redirection = &target;
-
-	pr_info("attempting good list addition\n");
-
-	/*
-	 * Adding to the list performs these actions:
-	 *	test_head.next->prev = &good.node
-	 *	good.node.next = test_head.next
-	 *	good.node.prev = test_head
-	 *	test_head.next = good.node
-	 */
-	list_add(&good.node, &test_head);
-
-	pr_info("attempting corrupted list addition\n");
-	/*
-	 * In simulating this "write what where" primitive, the "what" is
-	 * the address of &bad.node, and the "where" is the address held
-	 * by "redirection".
-	 */
-	test_head.next = redirection;
-	list_add(&bad.node, &test_head);
-
-	if (target[0] == NULL && target[1] == NULL)
-		pr_err("Overwrite did not happen, but no BUG?!\n");
-	else
-		pr_err("list_add() corruption not detected!\n");
-}
-
-void lkdtm_CORRUPT_LIST_DEL(void)
-{
-	LIST_HEAD(test_head);
-	struct lkdtm_list item;
-	void *target[2] = { };
-	void *redirection = &target;
-
-	list_add(&item.node, &test_head);
-
-	pr_info("attempting good list removal\n");
-	list_del(&item.node);
-
-	pr_info("attempting corrupted list removal\n");
-	list_add(&item.node, &test_head);
-
-	/* As with the list_add() test above, this corrupts "next". */
-	item.node.next = redirection;
-	list_del(&item.node);
-
-	if (target[0] == NULL && target[1] == NULL)
-		pr_err("Overwrite did not happen, but no BUG?!\n");
-	else
-		pr_err("list_del() corruption not detected!\n");
-}
-
 /* Test if unbalanced set_fs(KERNEL_DS)/set_fs(USER_DS) check exists. */
 void lkdtm_CORRUPT_USER_DS(void)
 {
diff --git a/drivers/misc/lkdtm/core.c b/drivers/misc/lkdtm/core.c
index ee0d6e721441..28aace88474f 100644
--- a/drivers/misc/lkdtm/core.c
+++ b/drivers/misc/lkdtm/core.c
@@ -110,8 +110,6 @@ static const struct crashtype crashtypes[] = {
 	CRASHTYPE(EXHAUST_STACK),
 	CRASHTYPE(CORRUPT_STACK),
 	CRASHTYPE(CORRUPT_STACK_STRONG),
-	CRASHTYPE(CORRUPT_LIST_ADD),
-	CRASHTYPE(CORRUPT_LIST_DEL),
 	CRASHTYPE(CORRUPT_USER_DS),
 	CRASHTYPE(STACK_GUARD_PAGE_LEADING),
 	CRASHTYPE(STACK_GUARD_PAGE_TRAILING),
@@ -174,6 +172,35 @@ static const struct crashtype crashtypes[] = {
 #ifdef CONFIG_X86_32
 	CRASHTYPE(DOUBLE_FAULT),
 #endif
+	CRASHTYPE(LIST_ADD_NEXT_CORRUPTION),
+	CRASHTYPE(LIST_ADD_PREV_CORRUPTION),
+	CRASHTYPE(LIST_ADD_TWICE),
+	CRASHTYPE(LIST_DEL_NEXT_CORRUPTION),
+	CRASHTYPE(LIST_DEL_PREV_CORRUPTION),
+	CRASHTYPE(LIST_DEL_TWICE),
+	CRASHTYPE(HLIST_ADD_HEAD_CORRUPTION),
+	CRASHTYPE(HLIST_ADD_HEAD_TWICE),
+	CRASHTYPE(HLIST_ADD_BEFORE_CORRUPTION),
+	CRASHTYPE(HLIST_ADD_BEFORE_TWICE),
+	CRASHTYPE(HLIST_ADD_BEHIND_CORRUPTION),
+	CRASHTYPE(HLIST_ADD_BEHIND_TWICE),
+	CRASHTYPE(HLIST_DEL_PREV_CORRUPTION),
+	CRASHTYPE(HLIST_DEL_NEXT_CORRUPTION),
+	CRASHTYPE(HLIST_DEL_TWICE),
+	CRASHTYPE(HLIST_NULLS_ADD_HEAD_CORRUPTION),
+	CRASHTYPE(HLIST_NULLS_ADD_HEAD_TWICE),
+	CRASHTYPE(HLIST_NULLS_DEL_PREV_CORRUPTION),
+	CRASHTYPE(HLIST_NULLS_DEL_NEXT_CORRUPTION),
+	CRASHTYPE(HLIST_NULLS_DEL_TWICE),
+	CRASHTYPE(HLIST_BL_ADD_HEAD_UNLOCKED),
+	CRASHTYPE(HLIST_BL_ADD_HEAD_NODE_LOCKED),
+	CRASHTYPE(HLIST_BL_ADD_HEAD_CORRUPTION),
+	CRASHTYPE(HLIST_BL_ADD_HEAD_TWICE),
+	CRASHTYPE(HLIST_BL_DEL_NODE_LOCKED),
+	CRASHTYPE(HLIST_BL_DEL_NEXT_LOCKED),
+	CRASHTYPE(HLIST_BL_DEL_PREV_CORRUPTION),
+	CRASHTYPE(HLIST_BL_DEL_NEXT_CORRUPTION),
+	CRASHTYPE(HLIST_BL_DEL_TWICE),
 };
 
 
diff --git a/drivers/misc/lkdtm/list.c b/drivers/misc/lkdtm/list.c
new file mode 100644
index 000000000000..e62a7de51eac
--- /dev/null
+++ b/drivers/misc/lkdtm/list.c
@@ -0,0 +1,489 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * This is for all the tests related to list integrity checking.
+ */
+
+#include "lkdtm.h"
+#include <linux/list.h>
+
+#define LIST_REDIR_BUF(type, name)				\
+union {								\
+	type	list;						\
+	char	buf[sizeof(type)];				\
+} name = { }
+
+static void __check_list_redir_buf(const char *str, char *buf, size_t sz)
+{
+	for (; sz && !buf[sz - 1]; sz--);
+	if (sz)
+		pr_err("%s: corruption not detected!\n", str);
+	else
+		pr_err("%s: overwrite did not happen, but no BUG?!\n", str);
+}
+
+#define check_list_redir_buf(s, b)	\
+	__check_list_redir_buf((s), (b).buf, sizeof(b.buf))
+
+void lkdtm_LIST_ADD_NEXT_CORRUPTION(void)
+{
+	LIST_HEAD(head);
+	LIST_HEAD(mid);
+	LIST_HEAD(tail);
+	LIST_REDIR_BUF(struct list_head, target);
+
+	list_add(&tail, &head);
+	head.next = &target.list;
+	list_add(&mid, &head);
+	check_list_redir_buf("list_add()", target);
+}
+
+void lkdtm_LIST_ADD_PREV_CORRUPTION(void)
+{
+	LIST_HEAD(head);
+	LIST_HEAD(mid);
+	LIST_HEAD(tail);
+	LIST_REDIR_BUF(struct list_head, target);
+
+	list_add(&tail, &head);
+	tail.prev = &target.list;
+	list_add_tail(&mid, &tail);
+	check_list_redir_buf("list_add()", target);
+}
+
+void lkdtm_LIST_ADD_TWICE(void)
+{
+	LIST_HEAD(head);
+	LIST_HEAD(mid);
+	LIST_HEAD(tail);
+
+	list_add(&tail, &head);
+	mid = tail;
+	list_add(&tail, &head);
+
+	if (mid.prev != tail.prev || mid.next != tail.next)
+		pr_err("list_add(): adding twice not detected!\n");
+	else
+		pr_err("list_add(): could not add twice, but no BUG?!\n");
+}
+
+void lkdtm_LIST_DEL_NEXT_CORRUPTION(void)
+{
+	LIST_HEAD(head);
+	LIST_HEAD(mid);
+	LIST_HEAD(tail);
+	LIST_REDIR_BUF(struct list_head, target);
+
+	list_add(&tail, &head);
+	list_add(&mid, &head);
+	mid.next = &target.list;
+	list_del(&mid);
+	check_list_redir_buf("list_del()", target);
+}
+
+void lkdtm_LIST_DEL_PREV_CORRUPTION(void)
+{
+	LIST_HEAD(head);
+	LIST_HEAD(mid);
+	LIST_HEAD(tail);
+	LIST_REDIR_BUF(struct list_head, target);
+
+	list_add(&tail, &head);
+	list_add(&mid, &head);
+	mid.prev = &target.list;
+	list_del(&mid);
+	check_list_redir_buf("list_del()", target);
+}
+
+void lkdtm_LIST_DEL_TWICE(void)
+{
+	LIST_HEAD(head);
+	LIST_HEAD(tail);
+
+	list_add(&tail, &head);
+	list_del(&tail);
+
+	if (tail.prev != LIST_POISON2 || tail.next != LIST_POISON1) {
+		pr_err("list_del(): prev/next pointers not poisoned!\n");
+	} else {
+		list_del(&tail);
+		pr_err("list_del(): could not delete twice, but no BUG?!\n");
+	}
+}
+
+void lkdtm_HLIST_ADD_HEAD_CORRUPTION(void)
+{
+	HLIST_HEAD(head);
+	LIST_REDIR_BUF(struct hlist_node, target);
+	struct hlist_node mid, tail;
+
+	INIT_HLIST_NODE(&mid);
+	INIT_HLIST_NODE(&tail);
+	hlist_add_head(&tail, &head);
+	head.first = &target.list;
+	hlist_add_head(&mid, &head);
+	check_list_redir_buf("hlist_add_head()", target);
+}
+
+void lkdtm_HLIST_ADD_HEAD_TWICE(void)
+{
+	HLIST_HEAD(head);
+	struct hlist_node mid, tail;
+
+	INIT_HLIST_NODE(&mid);
+	INIT_HLIST_NODE(&tail);
+	hlist_add_head(&tail, &head);
+	mid = tail;
+	hlist_add_head(&tail, &head);
+
+	if (mid.next != tail.next || mid.pprev != tail.pprev)
+		pr_err("hlist_add_head(): adding twice not detected!\n");
+	else
+		pr_err("hlist_add_head(): could not add twice, but no BUG?!\n");
+}
+
+void lkdtm_HLIST_ADD_BEFORE_CORRUPTION(void)
+{
+	HLIST_HEAD(head);
+	LIST_REDIR_BUF(struct hlist_node, target);
+	struct hlist_node mid, tail;
+
+	INIT_HLIST_NODE(&mid);
+	INIT_HLIST_NODE(&tail);
+	hlist_add_head(&tail, &head);
+	tail.pprev = &target.list.next;
+	hlist_add_before(&mid, &tail);
+	check_list_redir_buf("hlist_add_before()", target);
+}
+
+void lkdtm_HLIST_ADD_BEFORE_TWICE(void)
+{
+	HLIST_HEAD(head);
+	struct hlist_node mid, tail;
+
+	INIT_HLIST_NODE(&mid);
+	INIT_HLIST_NODE(&tail);
+	hlist_add_head(&tail, &head);
+	mid = tail;
+	hlist_add_before(&tail, &tail);
+
+	if (mid.next != tail.next || mid.pprev != tail.pprev)
+		pr_err("hlist_add_before(): adding twice not detected!\n");
+	else
+		pr_err("hlist_add_before(): could not add twice, but no BUG?!\n");
+}
+
+void lkdtm_HLIST_ADD_BEHIND_CORRUPTION(void)
+{
+	HLIST_HEAD(head);
+	LIST_REDIR_BUF(struct hlist_node, target);
+	struct hlist_node mid, tail;
+
+	INIT_HLIST_NODE(&mid);
+	INIT_HLIST_NODE(&tail);
+	hlist_add_head(&mid, &head);
+	mid.next = &target.list;
+	hlist_add_behind(&tail, &mid);
+	check_list_redir_buf("hlist_add_behind()", target);
+}
+
+void lkdtm_HLIST_ADD_BEHIND_TWICE(void)
+{
+	HLIST_HEAD(head);
+	struct hlist_node mid, tail;
+
+	INIT_HLIST_NODE(&mid);
+	INIT_HLIST_NODE(&tail);
+	hlist_add_head(&tail, &head);
+	mid = tail;
+	hlist_add_behind(&tail, &tail);
+
+	if (mid.next != tail.next || mid.pprev != tail.pprev)
+		pr_err("hlist_add_behind(): adding twice not detected!\n");
+	else
+		pr_err("hlist_add_behind(): could not add twice, but no BUG?!\n");
+}
+
+void lkdtm_HLIST_DEL_PREV_CORRUPTION(void)
+{
+	HLIST_HEAD(head);
+	LIST_REDIR_BUF(struct hlist_node, target);
+	struct hlist_node mid, tail;
+
+	INIT_HLIST_NODE(&mid);
+	INIT_HLIST_NODE(&tail);
+	hlist_add_head(&tail, &head);
+	hlist_add_head(&mid, &head);
+	mid.pprev = &target.list.next;
+	hlist_del(&mid);
+	check_list_redir_buf("hlist_del()", target);
+}
+
+void lkdtm_HLIST_DEL_NEXT_CORRUPTION(void)
+{
+	HLIST_HEAD(head);
+	LIST_REDIR_BUF(struct hlist_node, target);
+	struct hlist_node mid, tail;
+
+	INIT_HLIST_NODE(&mid);
+	INIT_HLIST_NODE(&tail);
+	hlist_add_head(&tail, &head);
+	hlist_add_head(&mid, &head);
+	mid.next = &target.list;
+	hlist_del(&mid);
+	check_list_redir_buf("hlist_del()", target);
+}
+
+void lkdtm_HLIST_DEL_TWICE(void)
+{
+	HLIST_HEAD(head);
+	struct hlist_node tail;
+
+	INIT_HLIST_NODE(&tail);
+	hlist_add_head(&tail, &head);
+	hlist_del(&tail);
+
+	if (tail.next != LIST_POISON1 || tail.pprev != LIST_POISON2) {
+		pr_err("hlist_del(): pprev/next pointers not poisoned!\n");
+	} else {
+		hlist_del(&tail);
+		pr_err("hlist_del(): could not delete twice, but no BUG?!\n");
+	}
+}
+
+#include <linux/list_nulls.h>
+
+void lkdtm_HLIST_NULLS_ADD_HEAD_CORRUPTION(void)
+{
+	LIST_REDIR_BUF(struct hlist_nulls_node, target);
+	struct hlist_nulls_head head;
+	struct hlist_nulls_node mid, tail;
+
+	INIT_HLIST_NULLS_HEAD(&head, 0);
+	hlist_nulls_add_head(&tail, &head);
+	head.first = &target.list;
+	hlist_nulls_add_head(&mid, &head);
+	check_list_redir_buf("hlist_nulls_add_head()", target);
+}
+
+void lkdtm_HLIST_NULLS_ADD_HEAD_TWICE(void)
+{
+	struct hlist_nulls_head head;
+	struct hlist_nulls_node mid, tail;
+
+	INIT_HLIST_NULLS_HEAD(&head, 0);
+	hlist_nulls_add_head(&tail, &head);
+	mid = tail;
+	hlist_nulls_add_head(&tail, &head);
+
+	if (mid.next != tail.next || mid.pprev != tail.pprev)
+		pr_err("hlist_nulls_add_head(): adding twice not detected!\n");
+	else
+		pr_err("hlist_nulls_add_head(): could not add twice, but no BUG?!\n");
+}
+
+void lkdtm_HLIST_NULLS_DEL_PREV_CORRUPTION(void)
+{
+	LIST_REDIR_BUF(struct hlist_nulls_node, target);
+	struct hlist_nulls_head head;
+	struct hlist_nulls_node mid, tail;
+
+	INIT_HLIST_NULLS_HEAD(&head, 0);
+	hlist_nulls_add_head(&tail, &head);
+	hlist_nulls_add_head(&mid, &head);
+	mid.pprev = &target.list.next;
+	hlist_nulls_del(&mid);
+	check_list_redir_buf("hlist_nulls_del()", target);
+}
+
+void lkdtm_HLIST_NULLS_DEL_NEXT_CORRUPTION(void)
+{
+	LIST_REDIR_BUF(struct hlist_nulls_node, target);
+	struct hlist_nulls_head head;
+	struct hlist_nulls_node mid, tail;
+
+	INIT_HLIST_NULLS_HEAD(&head, 0);
+	hlist_nulls_add_head(&tail, &head);
+	hlist_nulls_add_head(&mid, &head);
+	mid.next = &target.list;
+	hlist_nulls_del(&mid);
+	check_list_redir_buf("hlist_nulls_del()", target);
+}
+
+void lkdtm_HLIST_NULLS_DEL_TWICE(void)
+{
+	struct hlist_nulls_head head;
+	struct hlist_nulls_node tail;
+
+	INIT_HLIST_NULLS_HEAD(&head, 0);
+	hlist_nulls_add_head(&tail, &head);
+	hlist_nulls_del(&tail);
+
+	if (tail.next != LIST_POISON1 || tail.pprev != LIST_POISON2) {
+		pr_err("hlist_nulls_del(): pprev/next pointers not poisoned!\n");
+	} else {
+		hlist_nulls_del(&tail);
+		pr_err("hlist_nulls_del(): could not delete twice, but no BUG?!\n");
+	}
+}
+
+#include <linux/list_bl.h>
+
+void lkdtm_HLIST_BL_ADD_HEAD_UNLOCKED(void)
+{
+	LIST_REDIR_BUF(struct hlist_bl_node, target);
+	struct hlist_bl_head head;
+
+	INIT_HLIST_BL_HEAD(&head);
+	hlist_bl_add_head(&target.list, &head);
+	check_list_redir_buf("hlist_bl_add()", target);
+}
+
+void lkdtm_HLIST_BL_ADD_HEAD_NODE_LOCKED(void)
+{
+	LIST_REDIR_BUF(struct hlist_bl_node, target);
+	struct hlist_bl_head head;
+	unsigned long nval;
+
+	INIT_HLIST_BL_HEAD(&head);
+	hlist_bl_lock(&head);
+	nval = (unsigned long)&target.list | LIST_BL_LOCKMASK;
+	hlist_bl_add_head((struct hlist_bl_node *)nval, &head);
+	hlist_bl_unlock(&head);
+
+	check_list_redir_buf("hlist_bl_add()", target);
+}
+
+void lkdtm_HLIST_BL_ADD_HEAD_CORRUPTION(void)
+{
+	LIST_REDIR_BUF(struct hlist_bl_node, target);
+	struct hlist_bl_head head;
+	struct hlist_bl_node mid, tail;
+	unsigned long nval;
+
+	INIT_HLIST_BL_HEAD(&head);
+	hlist_bl_lock(&head);
+	hlist_bl_add_head(&tail, &head);
+	nval = (unsigned long)&target.list | LIST_BL_LOCKMASK;
+	head.first = (struct hlist_bl_node *)nval;
+	hlist_bl_add_head(&mid, &head);
+	hlist_bl_unlock(&head);
+
+	check_list_redir_buf("hlist_bl_add_head()", target);
+}
+
+void lkdtm_HLIST_BL_ADD_HEAD_TWICE(void)
+{
+	struct hlist_bl_head head;
+	struct hlist_bl_node mid, tail;
+
+	INIT_HLIST_BL_HEAD(&head);
+	hlist_bl_lock(&head);
+	hlist_bl_add_head(&tail, &head);
+	mid = tail;
+	hlist_bl_add_head(&tail, &head);
+	hlist_bl_unlock(&head);
+
+	if (mid.next != tail.next || mid.pprev != tail.pprev)
+		pr_err("hlist_bl_add_head(): adding twice not detected!\n");
+	else
+		pr_err("hlist_bl_add_head(): could not add twice, but no BUG?!\n");
+}
+
+void lkdtm_HLIST_BL_DEL_NODE_LOCKED(void)
+{
+	struct hlist_bl_head head;
+	struct hlist_bl_node mid, tail;
+	unsigned long nval;
+
+	INIT_HLIST_BL_HEAD(&head);
+	hlist_bl_lock(&head);
+	hlist_bl_add_head(&tail, &head);
+	hlist_bl_add_head(&mid, &head);
+	hlist_bl_unlock(&head);
+
+	nval = (unsigned long)&mid | LIST_BL_LOCKMASK;
+	/* hlist_bl_del() poisons ->next, so don't use it with a locked node */
+	__hlist_bl_del((struct hlist_bl_node *)nval);
+
+	if (head.first != &mid || tail.pprev != &mid.next)
+		pr_err("hlist_bl_del(): deleting locked node not detected!\n");
+	else
+		pr_err("hlist_bl_del(): could not delete locked node, but no BUG?!\n");
+}
+
+void lkdtm_HLIST_BL_DEL_NEXT_LOCKED(void)
+{
+	struct hlist_bl_head head;
+	struct hlist_bl_node mid, tail;
+	unsigned long nval;
+
+	INIT_HLIST_BL_HEAD(&head);
+	hlist_bl_lock(&head);
+	hlist_bl_add_head(&tail, &head);
+	hlist_bl_add_head(&mid, &head);
+	hlist_bl_unlock(&head);
+
+	nval = (unsigned long)mid.next | LIST_BL_LOCKMASK;
+	mid.next = (struct hlist_bl_node *)nval;
+	hlist_bl_del(&mid);
+
+	if (head.first != &mid || tail.pprev != &mid.next)
+		pr_err("hlist_bl_del(): deleting node with locked ->next not detected!\n");
+	else
+		pr_err("hlist_bl_del(): could not delete node with locked ->next, but no BUG?!\n");
+}
+
+void lkdtm_HLIST_BL_DEL_PREV_CORRUPTION(void)
+{
+	LIST_REDIR_BUF(struct hlist_bl_node, target);
+	struct hlist_bl_head head;
+	struct hlist_bl_node mid, tail;
+
+	INIT_HLIST_BL_HEAD(&head);
+	hlist_bl_lock(&head);
+	hlist_bl_add_head(&tail, &head);
+	hlist_bl_add_head(&mid, &head);
+	hlist_bl_unlock(&head);
+
+	mid.pprev = &target.list.next;
+	hlist_bl_del(&mid);
+	check_list_redir_buf("hlist_bl_del()", target);
+}
+
+void lkdtm_HLIST_BL_DEL_NEXT_CORRUPTION(void)
+{
+	LIST_REDIR_BUF(struct hlist_bl_node, target);
+	struct hlist_bl_head head;
+	struct hlist_bl_node mid, tail;
+
+	INIT_HLIST_BL_HEAD(&head);
+	hlist_bl_lock(&head);
+	hlist_bl_add_head(&tail, &head);
+	hlist_bl_add_head(&mid, &head);
+	hlist_bl_unlock(&head);
+
+	mid.next = &target.list;
+	hlist_bl_del(&mid);
+	check_list_redir_buf("hlist_bl_del()", target);
+}
+
+void lkdtm_HLIST_BL_DEL_TWICE(void)
+{
+	struct hlist_bl_head head;
+	struct hlist_bl_node tail;
+
+	INIT_HLIST_BL_HEAD(&head);
+	hlist_bl_lock(&head);
+	hlist_bl_add_head(&tail, &head);
+	hlist_bl_unlock(&head);
+
+	hlist_bl_del(&tail);
+
+	if (tail.next != LIST_POISON1 || tail.pprev != LIST_POISON2) {
+		pr_err("hlist_bl_del(): pprev/next pointers not poisoned!\n");
+	} else {
+		hlist_bl_del(&tail);
+		pr_err("hlist_bl_del(): could not delete twice, but no BUG?!\n");
+	}
+}
diff --git a/drivers/misc/lkdtm/lkdtm.h b/drivers/misc/lkdtm/lkdtm.h
index c56d23e37643..b5acce183473 100644
--- a/drivers/misc/lkdtm/lkdtm.h
+++ b/drivers/misc/lkdtm/lkdtm.h
@@ -22,8 +22,6 @@ void lkdtm_SOFTLOCKUP(void);
 void lkdtm_HARDLOCKUP(void);
 void lkdtm_SPINLOCKUP(void);
 void lkdtm_HUNG_TASK(void);
-void lkdtm_CORRUPT_LIST_ADD(void);
-void lkdtm_CORRUPT_LIST_DEL(void);
 void lkdtm_CORRUPT_USER_DS(void);
 void lkdtm_STACK_GUARD_PAGE_LEADING(void);
 void lkdtm_STACK_GUARD_PAGE_TRAILING(void);
@@ -102,4 +100,35 @@ void lkdtm_STACKLEAK_ERASING(void);
 /* cfi.c */
 void lkdtm_CFI_FORWARD_PROTO(void);
 
+/* list.c */
+void lkdtm_LIST_ADD_NEXT_CORRUPTION(void);
+void lkdtm_LIST_ADD_PREV_CORRUPTION(void);
+void lkdtm_LIST_ADD_TWICE(void);
+void lkdtm_LIST_DEL_NEXT_CORRUPTION(void);
+void lkdtm_LIST_DEL_PREV_CORRUPTION(void);
+void lkdtm_LIST_DEL_TWICE(void);
+void lkdtm_HLIST_ADD_HEAD_CORRUPTION(void);
+void lkdtm_HLIST_ADD_HEAD_TWICE(void);
+void lkdtm_HLIST_ADD_BEFORE_CORRUPTION(void);
+void lkdtm_HLIST_ADD_BEFORE_TWICE(void);
+void lkdtm_HLIST_ADD_BEHIND_CORRUPTION(void);
+void lkdtm_HLIST_ADD_BEHIND_TWICE(void);
+void lkdtm_HLIST_DEL_PREV_CORRUPTION(void);
+void lkdtm_HLIST_DEL_NEXT_CORRUPTION(void);
+void lkdtm_HLIST_DEL_TWICE(void);
+void lkdtm_HLIST_NULLS_ADD_HEAD_CORRUPTION(void);
+void lkdtm_HLIST_NULLS_ADD_HEAD_TWICE(void);
+void lkdtm_HLIST_NULLS_DEL_PREV_CORRUPTION(void);
+void lkdtm_HLIST_NULLS_DEL_NEXT_CORRUPTION(void);
+void lkdtm_HLIST_NULLS_DEL_TWICE(void);
+void lkdtm_HLIST_BL_ADD_HEAD_UNLOCKED(void);
+void lkdtm_HLIST_BL_ADD_HEAD_NODE_LOCKED(void);
+void lkdtm_HLIST_BL_ADD_HEAD_CORRUPTION(void);
+void lkdtm_HLIST_BL_ADD_HEAD_TWICE(void);
+void lkdtm_HLIST_BL_DEL_NODE_LOCKED(void);
+void lkdtm_HLIST_BL_DEL_NEXT_LOCKED(void);
+void lkdtm_HLIST_BL_DEL_PREV_CORRUPTION(void);
+void lkdtm_HLIST_BL_DEL_NEXT_CORRUPTION(void);
+void lkdtm_HLIST_BL_DEL_TWICE(void);
+
 #endif
diff --git a/tools/testing/selftests/lkdtm/tests.txt b/tools/testing/selftests/lkdtm/tests.txt
index 92ca32143ae5..93901506a72f 100644
--- a/tools/testing/selftests/lkdtm/tests.txt
+++ b/tools/testing/selftests/lkdtm/tests.txt
@@ -7,8 +7,6 @@ EXCEPTION
 #EXHAUST_STACK Corrupts memory on failure
 #CORRUPT_STACK Crashes entire system on success
 #CORRUPT_STACK_STRONG Crashes entire system on success
-CORRUPT_LIST_ADD list_add corruption
-CORRUPT_LIST_DEL list_del corruption
 CORRUPT_USER_DS Invalid address limit on user-mode return
 STACK_GUARD_PAGE_LEADING
 STACK_GUARD_PAGE_TRAILING
@@ -69,3 +67,32 @@ USERCOPY_KERNEL
 USERCOPY_KERNEL_DS
 STACKLEAK_ERASING OK: the rest of the thread stack is properly erased
 CFI_FORWARD_PROTO
+LIST_ADD_NEXT_CORRUPTION list_add corruption: next->prev should be prev
+LIST_ADD_PREV_CORRUPTION list_add corruption: prev->next should be next
+LIST_ADD_TWICE list_add double add:
+LIST_DEL_NEXT_CORRUPTION list_del corruption: next->prev should be
+LIST_DEL_PREV_CORRUPTION list_del corruption: prev->next should be
+LIST_DEL_TWICE list_del corruption:
+HLIST_ADD_HEAD_CORRUPTION hlist_add_head corruption: first->pprev should be &head->first
+HLIST_ADD_HEAD_TWICE hlist_add_head double add:
+HLIST_ADD_BEFORE_CORRUPTION hlist_add corruption: prev->next should be next
+HLIST_ADD_BEFORE_TWICE hlist_add double add:
+HLIST_ADD_BEHIND_CORRUPTION hlist_add corruption: next->pprev should be &prev->next
+HLIST_ADD_BEHIND_TWICE hlist_add double add:
+HLIST_DEL_PREV_CORRUPTION hlist_del corruption: prev->next should be
+HLIST_DEL_NEXT_CORRUPTION hlist_del corruption: next->pprev should be
+HLIST_DEL_TWICE hlist_del corruption:
+HLIST_NULLS_ADD_HEAD_CORRUPTION hlist_nulls_add_head corruption: first->pprev should be &head->first
+HLIST_NULLS_ADD_HEAD_TWICE hlist_nulls_add_head double add:
+HLIST_NULLS_DEL_PREV_CORRUPTION hlist_nulls_del corruption: prev->next should be
+HLIST_NULLS_DEL_NEXT_CORRUPTION hlist_nulls_del corruption: next->pprev should be
+HLIST_NULLS_DEL_TWICE hlist_nulls_del corruption:
+HLIST_BL_ADD_HEAD_UNLOCKED hlist_bl_add_head: head is unlocked
+#HLIST_BL_ADD_HEAD_NODE_LOCKED Corrupts memory on failure
+HLIST_BL_ADD_HEAD_CORRUPTION hlist_bl_add_head corruption: first->pprev should be &head->first
+HLIST_BL_ADD_HEAD_TWICE hlist_bl_add_head double add:
+#HLIST_BL_DEL_NODE_LOCKED Corrupts memory on failure
+#HLIST_BL_DEL_NEXT_LOCKED Corrupts memory on failure
+HLIST_BL_DEL_PREV_CORRUPTION hlist_bl_del corruption: prev->next should be
+HLIST_BL_DEL_NEXT_CORRUPTION hlist_bl_del corruption: next->pprev should be
+HLIST_BL_DEL_TWICE hlist_bl_del corruption:
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 58+ messages in thread
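The lkdtm tests in this patch share one detection trick worth spelling out: redirect a list pointer at a zero-initialised "target" buffer, perform the operation, and infer from any non-zero bytes whether the simulated write-what-where primitive landed. A userspace sketch of the idea follows (hypothetical names; an unchecked insert stands in for a kernel built without CONFIG_DEBUG_LIST):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Unchecked insertion, as in a kernel without CONFIG_DEBUG_LIST. */
static void list_add_unchecked(struct list_head *n, struct list_head *h)
{
	h->next->prev = n;	/* the "write" of the write-what-where */
	n->next = h->next;
	n->prev = h;
	h->next = n;
}

/*
 * Counterpart of lkdtm's LIST_REDIR_BUF()/check_list_redir_buf(): a
 * zero-initialised buffer overlaid on a list_head; any non-zero byte
 * afterwards means the redirected write went through.
 */
union redir {
	struct list_head list;
	unsigned char buf[sizeof(struct list_head)];
};

static bool overwrite_happened(const union redir *t)
{
	static const union redir zero;

	return memcmp(t->buf, zero.buf, sizeof(t->buf)) != 0;
}
```

With CONFIG_DEBUG_LIST-style validation in place, the corrupted add is refused and the buffer stays all-zero, which is exactly what check_list_redir_buf() distinguishes in its two pr_err() branches.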

* Re: [RFC PATCH 03/21] list: Annotate lockless list primitives with data_race()
  2020-03-24 15:36 ` [RFC PATCH 03/21] list: Annotate lockless list primitives with data_race() Will Deacon
@ 2020-03-24 16:20   ` Jann Horn
  2020-03-24 16:26     ` Greg KH
  2020-03-24 16:23   ` Marco Elver
  2020-03-24 16:51   ` [RFC PATCH 03/21] list: Annotate lockless list primitives with data_race() Peter Zijlstra
  2 siblings, 1 reply; 58+ messages in thread
From: Jann Horn @ 2020-03-24 16:20 UTC (permalink / raw)
  To: Will Deacon
  Cc: kernel list, Eric Dumazet, Kees Cook, Maddie Stone, Marco Elver,
	Paul E . McKenney, Peter Zijlstra, Thomas Gleixner, kernel-team,
	Kernel Hardening

On Tue, Mar 24, 2020 at 4:37 PM Will Deacon <will@kernel.org> wrote:
> Some list predicates can be used locklessly even with the non-RCU list
> implementations, since they effectively boil down to a test against
> NULL. For example, checking whether or not a list is empty is safe even
> in the presence of a concurrent, tearing write to the list head pointer.
> Similarly, checking whether or not an hlist node has been hashed is safe
> as well.
>
> Annotate these lockless list predicates with data_race() and READ_ONCE()
> so that KCSAN and the compiler are aware of what's going on. The writer
> side can then avoid having to use WRITE_ONCE() in the non-RCU
> implementation.
[...]
>  static inline int list_empty(const struct list_head *head)
>  {
> -       return READ_ONCE(head->next) == head;
> +       return data_race(READ_ONCE(head->next) == head);
>  }
[...]
>  static inline int hlist_unhashed(const struct hlist_node *h)
>  {
> -       return !READ_ONCE(h->pprev);
> +       return data_race(!READ_ONCE(h->pprev));
>  }

This is probably valid in practice for hlist_unhashed(), which
compares with NULL, as long as the most significant byte of all kernel
pointers is non-zero; but I think list_empty() could realistically
return false positives in the presence of a concurrent tearing store?
This could break the following code pattern:

/* optimistic lockless check */
if (!list_empty(&some_list)) {
  /* slowpath */
  mutex_lock(&some_mutex);
  list_for_each(tmp, &some_list) {
    ...
  }
  mutex_unlock(&some_mutex);
}

(I'm not sure whether patterns like this appear commonly though.)
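For reference, the pattern being discussed looks roughly like this as compilable userspace C (hypothetical names; a pthread mutex stands in for the kernel mutex, and the unlocked read would be READ_ONCE()/data_race() in the kernel). The sketch only shows where the optimistic check sits relative to the locked walk, not the torn-store race itself:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
	h->next = h->prev = h;
}

static void list_add(struct list_head *n, struct list_head *h)
{
	n->next = h->next;
	n->prev = h;
	h->next->prev = n;
	h->next = n;
}

static bool list_empty(const struct list_head *h)
{
	/* In the kernel this unlocked read is the one in question. */
	return h->next == h;
}

static pthread_mutex_t some_mutex = PTHREAD_MUTEX_INITIALIZER;

static int count_entries(struct list_head *head)
{
	int n = 0;

	/* optimistic lockless check */
	if (!list_empty(head)) {
		/* slowpath: take the lock before walking the list */
		pthread_mutex_lock(&some_mutex);
		for (struct list_head *p = head->next; p != head; p = p->next)
			n++;
		pthread_mutex_unlock(&some_mutex);
	}
	return n;
}
```

The concern raised above is that a torn store to head->next could make the fast-path list_empty() answer "empty" for a non-empty list, so the slowpath is skipped entirely.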

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC PATCH 03/21] list: Annotate lockless list primitives with data_race()
  2020-03-24 15:36 ` [RFC PATCH 03/21] list: Annotate lockless list primitives with data_race() Will Deacon
  2020-03-24 16:20   ` Jann Horn
@ 2020-03-24 16:23   ` Marco Elver
  2020-03-24 21:33     ` Will Deacon
  2020-03-31 13:10     ` Will Deacon
  2020-03-24 16:51   ` [RFC PATCH 03/21] list: Annotate lockless list primitives with data_race() Peter Zijlstra
  2 siblings, 2 replies; 58+ messages in thread
From: Marco Elver @ 2020-03-24 16:23 UTC (permalink / raw)
  To: Will Deacon
  Cc: LKML, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Paul E . McKenney, Peter Zijlstra, Thomas Gleixner, kernel-team,
	kernel-hardening

On Tue, 24 Mar 2020 at 16:37, Will Deacon <will@kernel.org> wrote:
>
> Some list predicates can be used locklessly even with the non-RCU list
> implementations, since they effectively boil down to a test against
> NULL. For example, checking whether or not a list is empty is safe even
> in the presence of a concurrent, tearing write to the list head pointer.
> Similarly, checking whether or not an hlist node has been hashed is safe
> as well.
>
> Annotate these lockless list predicates with data_race() and READ_ONCE()
> so that KCSAN and the compiler are aware of what's going on. The writer
> side can then avoid having to use WRITE_ONCE() in the non-RCU
> implementation.
>
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Marco Elver <elver@google.com>
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
>  include/linux/list.h       | 10 +++++-----
>  include/linux/list_bl.h    |  5 +++--
>  include/linux/list_nulls.h |  6 +++---
>  include/linux/llist.h      |  2 +-
>  4 files changed, 12 insertions(+), 11 deletions(-)
>
> diff --git a/include/linux/list.h b/include/linux/list.h
> index 4fed5a0f9b77..4d9f5f9ed1a8 100644
> --- a/include/linux/list.h
> +++ b/include/linux/list.h
> @@ -279,7 +279,7 @@ static inline int list_is_last(const struct list_head *list,
>   */
>  static inline int list_empty(const struct list_head *head)
>  {
> -       return READ_ONCE(head->next) == head;
> +       return data_race(READ_ONCE(head->next) == head);

Double-marking should never be necessary, at least if you want to make
KCSAN happy. From what I gather there is an unmarked write somewhere,
correct? In that case, KCSAN will still complain because if it sees a
race between this read and the other write, then at least one is still
plain (the write).

Then, my suggestion would be to mark the write with data_race() and
just leave this as a READ_ONCE(). Having a data_race() somewhere only
makes KCSAN stop reporting the race if the paired access is also
marked (be it with data_race() or _ONCE, etc.).

Alternatively, if marking the write is impossible, you can surround
the access with kcsan_disable_current()/kcsan_enable_current(). Or, as
a last resort, just leaving as-is is fine too, because KCSAN's default
config (still) has KCSAN_ASSUME_PLAIN_WRITES_ATOMIC selected.

Thanks,
-- Marco

>  }
>
>  /**
> @@ -524,7 +524,7 @@ static inline void list_splice_tail_init(struct list_head *list,
>   */
>  #define list_first_entry_or_null(ptr, type, member) ({ \
>         struct list_head *head__ = (ptr); \
> -       struct list_head *pos__ = READ_ONCE(head__->next); \
> +       struct list_head *pos__ = data_race(READ_ONCE(head__->next)); \
>         pos__ != head__ ? list_entry(pos__, type, member) : NULL; \
>  })
>
> @@ -772,13 +772,13 @@ static inline void INIT_HLIST_NODE(struct hlist_node *h)
>   * hlist_unhashed - Has node been removed from list and reinitialized?
>   * @h: Node to be checked
>   *
> - * Not that not all removal functions will leave a node in unhashed
> + * Note that not all removal functions will leave a node in unhashed
>   * state.  For example, hlist_nulls_del_init_rcu() does leave the
>   * node in unhashed state, but hlist_nulls_del() does not.
>   */
>  static inline int hlist_unhashed(const struct hlist_node *h)
>  {
> -       return !READ_ONCE(h->pprev);
> +       return data_race(!READ_ONCE(h->pprev));
>  }
>
>  /**
> @@ -787,7 +787,7 @@ static inline int hlist_unhashed(const struct hlist_node *h)
>   */
>  static inline int hlist_empty(const struct hlist_head *h)
>  {
> -       return !READ_ONCE(h->first);
> +       return data_race(!READ_ONCE(h->first));
>  }
>
>  static inline void __hlist_del(struct hlist_node *n)
> diff --git a/include/linux/list_bl.h b/include/linux/list_bl.h
> index ae1b541446c9..1804fdb84dda 100644
> --- a/include/linux/list_bl.h
> +++ b/include/linux/list_bl.h
> @@ -51,7 +51,7 @@ static inline void INIT_HLIST_BL_NODE(struct hlist_bl_node *h)
>
>  static inline bool  hlist_bl_unhashed(const struct hlist_bl_node *h)
>  {
> -       return !h->pprev;
> +       return data_race(!READ_ONCE(h->pprev));
>  }
>
>  static inline struct hlist_bl_node *hlist_bl_first(struct hlist_bl_head *h)
> @@ -71,7 +71,8 @@ static inline void hlist_bl_set_first(struct hlist_bl_head *h,
>
>  static inline bool hlist_bl_empty(const struct hlist_bl_head *h)
>  {
> -       return !((unsigned long)READ_ONCE(h->first) & ~LIST_BL_LOCKMASK);
> +       unsigned long first = data_race((unsigned long)READ_ONCE(h->first));
> +       return !(first & ~LIST_BL_LOCKMASK);
>  }
>
>  static inline void hlist_bl_add_head(struct hlist_bl_node *n,
> diff --git a/include/linux/list_nulls.h b/include/linux/list_nulls.h
> index 3a9ff01e9a11..fa51a801bf32 100644
> --- a/include/linux/list_nulls.h
> +++ b/include/linux/list_nulls.h
> @@ -60,18 +60,18 @@ static inline unsigned long get_nulls_value(const struct hlist_nulls_node *ptr)
>   * hlist_nulls_unhashed - Has node been removed and reinitialized?
>   * @h: Node to be checked
>   *
> - * Not that not all removal functions will leave a node in unhashed state.
> + * Note that not all removal functions will leave a node in unhashed state.
>   * For example, hlist_del_init_rcu() leaves the node in unhashed state,
>   * but hlist_nulls_del() does not.
>   */
>  static inline int hlist_nulls_unhashed(const struct hlist_nulls_node *h)
>  {
> -       return !READ_ONCE(h->pprev);
> +       return data_race(!READ_ONCE(h->pprev));
>  }
>
>  static inline int hlist_nulls_empty(const struct hlist_nulls_head *h)
>  {
> -       return is_a_nulls(READ_ONCE(h->first));
> +       return data_race(is_a_nulls(READ_ONCE(h->first)));
>  }
>
>  static inline void hlist_nulls_add_head(struct hlist_nulls_node *n,
> diff --git a/include/linux/llist.h b/include/linux/llist.h
> index 2e9c7215882b..c7f042b73899 100644
> --- a/include/linux/llist.h
> +++ b/include/linux/llist.h
> @@ -186,7 +186,7 @@ static inline void init_llist_head(struct llist_head *list)
>   */
>  static inline bool llist_empty(const struct llist_head *head)
>  {
> -       return READ_ONCE(head->first) == NULL;
> +       return data_race(READ_ONCE(head->first) == NULL);
>  }
>
>  static inline struct llist_node *llist_next(struct llist_node *node)
> --
> 2.20.1
>

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC PATCH 03/21] list: Annotate lockless list primitives with data_race()
  2020-03-24 16:20   ` Jann Horn
@ 2020-03-24 16:26     ` Greg KH
  2020-03-24 16:38       ` Jann Horn
  0 siblings, 1 reply; 58+ messages in thread
From: Greg KH @ 2020-03-24 16:26 UTC (permalink / raw)
  To: Jann Horn
  Cc: Will Deacon, kernel list, Eric Dumazet, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, Kernel Hardening

On Tue, Mar 24, 2020 at 05:20:45PM +0100, Jann Horn wrote:
> On Tue, Mar 24, 2020 at 4:37 PM Will Deacon <will@kernel.org> wrote:
> > Some list predicates can be used locklessly even with the non-RCU list
> > implementations, since they effectively boil down to a test against
> > NULL. For example, checking whether or not a list is empty is safe even
> > in the presence of a concurrent, tearing write to the list head pointer.
> > Similarly, checking whether or not an hlist node has been hashed is safe
> > as well.
> >
> > Annotate these lockless list predicates with data_race() and READ_ONCE()
> > so that KCSAN and the compiler are aware of what's going on. The writer
> > side can then avoid having to use WRITE_ONCE() in the non-RCU
> > implementation.
> [...]
> >  static inline int list_empty(const struct list_head *head)
> >  {
> > -       return READ_ONCE(head->next) == head;
> > +       return data_race(READ_ONCE(head->next) == head);
> >  }
> [...]
> >  static inline int hlist_unhashed(const struct hlist_node *h)
> >  {
> > -       return !READ_ONCE(h->pprev);
> > +       return data_race(!READ_ONCE(h->pprev));
> >  }
> 
> This is probably valid in practice for hlist_unhashed(), which
> compares with NULL, as long as the most significant byte of all kernel
> pointers is non-zero; but I think list_empty() could realistically
> return false positives in the presence of a concurrent tearing store?
> This could break the following code pattern:
> 
> /* optimistic lockless check */
> if (!list_empty(&some_list)) {
>   /* slowpath */
>   mutex_lock(&some_mutex);
>   list_for_each(tmp, &some_list) {
>     ...
>   }
>   mutex_unlock(&some_mutex);
> }
> 
> (I'm not sure whether patterns like this appear commonly though.)


I would hope not as the list could go "empty" before the lock is
grabbed.  That pattern would be wrong.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC PATCH 01/21] list: Remove hlist_unhashed_lockless()
  2020-03-24 15:36 ` [RFC PATCH 01/21] list: Remove hlist_unhashed_lockless() Will Deacon
@ 2020-03-24 16:27   ` Greg KH
  2020-03-30 23:05   ` Paul E. McKenney
  1 sibling, 0 replies; 58+ messages in thread
From: Greg KH @ 2020-03-24 16:27 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, kernel-hardening

On Tue, Mar 24, 2020 at 03:36:23PM +0000, Will Deacon wrote:
> Commit c54a2744497d ("list: Add hlist_unhashed_lockless()") added a
> "lockless" variant of hlist_unhashed() that uses READ_ONCE() to access
> the 'pprev' pointer of the 'hlist_node', the intention being that this
> could be used by 'timer_pending()' to silence a KCSAN warning. As well
> as forgetting to add the caller, the patch also sprinkles
> {READ,WRITE}_ONCE() invocations over the standard (i.e. non-RCU) hlist
> code, which is undesirable for a number of reasons:
> 
>   1. It gives the misleading impression that the non-RCU hlist code is
>      in some way lock-free (despite the notable absence of any memory
>      barriers!) and silences KCSAN in such cases.
> 
>   2. It unnecessarily penalises code generation for non-RCU hlist users
> 
>   3. It makes it difficult to introduce list integrity checks because
>      of the possibility of concurrent callers.
> 
> Retain the {READ,WRITE}_ONCE() invocations for the RCU hlist code, but
> remove them from the non-RCU implementation. Remove the unused
> 'hlist_unhashed_lockless()' function entirely and add the READ_ONCE()
> to hlist_unhashed(), as we already do for hlist_empty().
> 
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
>  include/linux/list.h | 33 ++++++++++-----------------------
>  1 file changed, 10 insertions(+), 23 deletions(-)

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC PATCH 02/21] list: Remove hlist_nulls_unhashed_lockless()
  2020-03-24 15:36 ` [RFC PATCH 02/21] list: Remove hlist_nulls_unhashed_lockless() Will Deacon
@ 2020-03-24 16:27   ` Greg KH
  2020-03-30 23:07   ` Paul E. McKenney
  1 sibling, 0 replies; 58+ messages in thread
From: Greg KH @ 2020-03-24 16:27 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, kernel-hardening

On Tue, Mar 24, 2020 at 03:36:24PM +0000, Will Deacon wrote:
> Commit 02b99b38f3d9 ("rcu: Add a hlist_nulls_unhashed_lockless() function")
> introduced the (as yet unused) hlist_nulls_unhashed_lockless() function
> to avoid KCSAN reports when an RCU reader checks the 'hashed' status
> of an 'hlist_nulls' concurrently undergoing modification.
> 
> Remove the unused function and add a READ_ONCE() to hlist_nulls_unhashed(),
> just like we do already for hlist_nulls_empty().
> 
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
>  include/linux/list_nulls.h | 14 --------------
>  1 file changed, 14 deletions(-)

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC PATCH 17/21] linux/bit_spinlock.h: Include linux/processor.h
  2020-03-24 15:36 ` [RFC PATCH 17/21] linux/bit_spinlock.h: Include linux/processor.h Will Deacon
@ 2020-03-24 16:28   ` Greg KH
  2020-03-24 21:08     ` Will Deacon
  0 siblings, 1 reply; 58+ messages in thread
From: Greg KH @ 2020-03-24 16:28 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, kernel-hardening

On Tue, Mar 24, 2020 at 03:36:39PM +0000, Will Deacon wrote:
> Needed for cpu_relax().

Is this needed now, or for the next patch?  And if for next patch, why
not just add it then?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC PATCH 04/21] timers: Use hlist_unhashed() instead of open-coding in timer_pending()
  2020-03-24 15:36 ` [RFC PATCH 04/21] timers: Use hlist_unhashed() instead of open-coding in timer_pending() Will Deacon
@ 2020-03-24 16:30   ` Greg KH
  0 siblings, 0 replies; 58+ messages in thread
From: Greg KH @ 2020-03-24 16:30 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, kernel-hardening

On Tue, Mar 24, 2020 at 03:36:26PM +0000, Will Deacon wrote:
> timer_pending() open-codes a version of hlist_unhashed() to check
> whether or not the 'timer' parameter has been queued in the timer wheel.
> KCSAN detects this as a racy operation and explodes at us:
> 
>   | BUG: KCSAN: data-race in del_timer / detach_if_pending
>   |
>   | write to 0xffff88808697d870 of 8 bytes by task 10 on cpu 0:
>   |  __hlist_del include/linux/list.h:764 [inline]
>   |  detach_timer kernel/time/timer.c:815 [inline]
>   |  detach_if_pending+0xcd/0x2d0 kernel/time/timer.c:832
>   |  try_to_del_timer_sync+0x60/0xb0 kernel/time/timer.c:1226
>   |  del_timer_sync+0x6b/0xa0 kernel/time/timer.c:1365
>   |  schedule_timeout+0x2d2/0x6e0 kernel/time/timer.c:1896
>   |  rcu_gp_fqs_loop+0x37c/0x580 kernel/rcu/tree.c:1639
>   |  rcu_gp_kthread+0x143/0x230 kernel/rcu/tree.c:1799
>   |  kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
>   |  ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352
>   |
>   | read to 0xffff88808697d870 of 8 bytes by task 12060 on cpu 1:
>   |  del_timer+0x3b/0xb0 kernel/time/timer.c:1198
>   |  sk_stop_timer+0x25/0x60 net/core/sock.c:2845
>   |  inet_csk_clear_xmit_timers+0x69/0xa0 net/ipv4/inet_connection_sock.c:523
>   |  tcp_clear_xmit_timers include/net/tcp.h:606 [inline]
>   |  tcp_v4_destroy_sock+0xa3/0x3f0 net/ipv4/tcp_ipv4.c:2096
>   |  inet_csk_destroy_sock+0xf4/0x250 net/ipv4/inet_connection_sock.c:836
>   |  tcp_close+0x6f3/0x970 net/ipv4/tcp.c:2497
>   |  inet_release+0x86/0x100 net/ipv4/af_inet.c:427
>   |  __sock_release+0x85/0x160 net/socket.c:590
>   |  sock_close+0x24/0x30 net/socket.c:1268
>   |  __fput+0x1e1/0x520 fs/file_table.c:280
>   |  ____fput+0x1f/0x30 fs/file_table.c:313
>   |  task_work_run+0xf6/0x130 kernel/task_work.c:113
>   |  tracehook_notify_resume include/linux/tracehook.h:188 [inline]
>   |  exit_to_usermode_loop+0x2b4/0x2c0 arch/x86/entry/common.c:163
> 
> Replace the explicit 'pprev' pointer comparison in timer_pending() with
> a call to hlist_unhashed() and initialise the 'expires' timer field
> explicitly in do_init_timer() so that the compiler doesn't emit bogus
> 'maybe used uninitialised' warnings now that it cannot reason statically
> about the result of timer_pending().
> 
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Will Deacon <will@kernel.org>

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC PATCH 03/21] list: Annotate lockless list primitives with data_race()
  2020-03-24 16:26     ` Greg KH
@ 2020-03-24 16:38       ` Jann Horn
  2020-03-24 16:59         ` Greg KH
  0 siblings, 1 reply; 58+ messages in thread
From: Jann Horn @ 2020-03-24 16:38 UTC (permalink / raw)
  To: Greg KH
  Cc: Will Deacon, kernel list, Eric Dumazet, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, Kernel Hardening

On Tue, Mar 24, 2020 at 5:26 PM Greg KH <greg@kroah.com> wrote:
> On Tue, Mar 24, 2020 at 05:20:45PM +0100, Jann Horn wrote:
> > On Tue, Mar 24, 2020 at 4:37 PM Will Deacon <will@kernel.org> wrote:
> > > Some list predicates can be used locklessly even with the non-RCU list
> > > implementations, since they effectively boil down to a test against
> > > NULL. For example, checking whether or not a list is empty is safe even
> > > in the presence of a concurrent, tearing write to the list head pointer.
> > > Similarly, checking whether or not an hlist node has been hashed is safe
> > > as well.
> > >
> > > Annotate these lockless list predicates with data_race() and READ_ONCE()
> > > so that KCSAN and the compiler are aware of what's going on. The writer
> > > side can then avoid having to use WRITE_ONCE() in the non-RCU
> > > implementation.
> > [...]
> > >  static inline int list_empty(const struct list_head *head)
> > >  {
> > > -       return READ_ONCE(head->next) == head;
> > > +       return data_race(READ_ONCE(head->next) == head);
> > >  }
> > [...]
> > >  static inline int hlist_unhashed(const struct hlist_node *h)
> > >  {
> > > -       return !READ_ONCE(h->pprev);
> > > +       return data_race(!READ_ONCE(h->pprev));
> > >  }
> >
> > This is probably valid in practice for hlist_unhashed(), which
> > compares with NULL, as long as the most significant byte of all kernel
> > pointers is non-zero; but I think list_empty() could realistically
> > return false positives in the presence of a concurrent tearing store?
> > This could break the following code pattern:
> >
> > /* optimistic lockless check */
> > if (!list_empty(&some_list)) {
> >   /* slowpath */
> >   mutex_lock(&some_mutex);
> >   list_for_each(tmp, &some_list) {
> >     ...
> >   }
> >   mutex_unlock(&some_mutex);
> > }
> >
> > (I'm not sure whether patterns like this appear commonly though.)
>
>
> I would hope not as the list could go "empty" before the lock is
> grabbed.  That pattern would be wrong.

If the list becomes empty in between, the loop just iterates over
nothing, and the effect is no different from what you'd get if you had
bailed out before. But sure, you have to be aware that that can
happen.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC PATCH 20/21] list: Format CHECK_DATA_CORRUPTION error messages consistently
  2020-03-24 15:36 ` [RFC PATCH 20/21] list: Format CHECK_DATA_CORRUPTION error messages consistently Will Deacon
@ 2020-03-24 16:40   ` Greg KH
  0 siblings, 0 replies; 58+ messages in thread
From: Greg KH @ 2020-03-24 16:40 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, kernel-hardening

On Tue, Mar 24, 2020 at 03:36:42PM +0000, Will Deacon wrote:
> The error strings printed when list data corruption is detected are
> formatted inconsistently.
> 
> Satisfy my inner pedant by consistently using ':' to delimit the message
> from its prefix and by dropping the terminating full stops where they exist.
> 
> Signed-off-by: Will Deacon <will@kernel.org>

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC PATCH 10/21] kernel-hacking: Make DEBUG_{LIST,PLIST,SG,NOTIFIERS} non-debug options
  2020-03-24 15:36 ` [RFC PATCH 10/21] kernel-hacking: Make DEBUG_{LIST,PLIST,SG,NOTIFIERS} non-debug options Will Deacon
@ 2020-03-24 16:42   ` Greg KH
  0 siblings, 0 replies; 58+ messages in thread
From: Greg KH @ 2020-03-24 16:42 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, kernel-hardening

On Tue, Mar 24, 2020 at 03:36:32PM +0000, Will Deacon wrote:
> The CONFIG_DEBUG_{LIST,PLIST,SG,NOTIFIERS} options can provide useful
> security hardening properties outside of debug scenarios. For example,
> CVE-2019-2215 and CVE-2019-2025 are mitigated with negligible runtime
> overhead by enabling CONFIG_DEBUG_LIST, and this option is already
> enabled by default on many distributions:
> 
> https://googleprojectzero.blogspot.com/2019/11/bad-binder-android-in-wild-exploit.html
> 
> Rename these options across the tree so that the naming better reflects
> their operation and remove the dependency on DEBUG_KERNEL.
> 
> Cc: Maddie Stone <maddiestone@google.com>
> Cc: Jann Horn <jannh@google.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Will Deacon <will@kernel.org>

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC PATCH 14/21] plist: Use CHECK_DATA_CORRUPTION instead of explicit {BUG,WARN}_ON()
  2020-03-24 15:36 ` [RFC PATCH 14/21] plist: Use CHECK_DATA_CORRUPTION instead of explicit {BUG,WARN}_ON() Will Deacon
@ 2020-03-24 16:42   ` Greg KH
  0 siblings, 0 replies; 58+ messages in thread
From: Greg KH @ 2020-03-24 16:42 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, kernel-hardening

On Tue, Mar 24, 2020 at 03:36:36PM +0000, Will Deacon wrote:
> CHECK_DATA_CORRUPTION() allows detected data corruption to result
> consistently in either a BUG() or a WARN() depending on
> CONFIG_BUG_ON_DATA_CORRUPTION.
> 
> Use CHECK_DATA_CORRUPTION() to report plist integrity checking failures,
> rather than explicit use of BUG_ON() and WARN_ON().
> 
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Will Deacon <will@kernel.org>

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC PATCH 03/21] list: Annotate lockless list primitives with data_race()
  2020-03-24 15:36 ` [RFC PATCH 03/21] list: Annotate lockless list primitives with data_race() Will Deacon
  2020-03-24 16:20   ` Jann Horn
  2020-03-24 16:23   ` Marco Elver
@ 2020-03-24 16:51   ` Peter Zijlstra
  2020-03-24 16:56     ` Jann Horn
  2 siblings, 1 reply; 58+ messages in thread
From: Peter Zijlstra @ 2020-03-24 16:51 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Thomas Gleixner, kernel-team,
	kernel-hardening, Oleg Nesterov

On Tue, Mar 24, 2020 at 03:36:25PM +0000, Will Deacon wrote:
> diff --git a/include/linux/list.h b/include/linux/list.h
> index 4fed5a0f9b77..4d9f5f9ed1a8 100644
> --- a/include/linux/list.h
> +++ b/include/linux/list.h
> @@ -279,7 +279,7 @@ static inline int list_is_last(const struct list_head *list,
>   */
>  static inline int list_empty(const struct list_head *head)
>  {
> -	return READ_ONCE(head->next) == head;
> +	return data_race(READ_ONCE(head->next) == head);
>  }

list_empty() isn't lockless safe, that's what we have
list_empty_careful() for.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC PATCH 03/21] list: Annotate lockless list primitives with data_race()
  2020-03-24 16:51   ` [RFC PATCH 03/21] list: Annotate lockless list primitives with data_race() Peter Zijlstra
@ 2020-03-24 16:56     ` Jann Horn
  2020-03-24 21:32       ` Will Deacon
  0 siblings, 1 reply; 58+ messages in thread
From: Jann Horn @ 2020-03-24 16:56 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Will Deacon, kernel list, Eric Dumazet, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Thomas Gleixner, kernel-team,
	Kernel Hardening, Oleg Nesterov

On Tue, Mar 24, 2020 at 5:51 PM Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, Mar 24, 2020 at 03:36:25PM +0000, Will Deacon wrote:
> > diff --git a/include/linux/list.h b/include/linux/list.h
> > index 4fed5a0f9b77..4d9f5f9ed1a8 100644
> > --- a/include/linux/list.h
> > +++ b/include/linux/list.h
> > @@ -279,7 +279,7 @@ static inline int list_is_last(const struct list_head *list,
> >   */
> >  static inline int list_empty(const struct list_head *head)
> >  {
> > -     return READ_ONCE(head->next) == head;
> > +     return data_race(READ_ONCE(head->next) == head);
> >  }
>
> list_empty() isn't lockless safe, that's what we have
> list_empty_careful() for.

That thing looks like it could also use some READ_ONCE() sprinkled in...

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC PATCH 03/21] list: Annotate lockless list primitives with data_race()
  2020-03-24 16:38       ` Jann Horn
@ 2020-03-24 16:59         ` Greg KH
  2020-03-24 18:22           ` Jann Horn
  0 siblings, 1 reply; 58+ messages in thread
From: Greg KH @ 2020-03-24 16:59 UTC (permalink / raw)
  To: Jann Horn
  Cc: Will Deacon, kernel list, Eric Dumazet, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, Kernel Hardening

On Tue, Mar 24, 2020 at 05:38:30PM +0100, Jann Horn wrote:
> On Tue, Mar 24, 2020 at 5:26 PM Greg KH <greg@kroah.com> wrote:
> > On Tue, Mar 24, 2020 at 05:20:45PM +0100, Jann Horn wrote:
> > > On Tue, Mar 24, 2020 at 4:37 PM Will Deacon <will@kernel.org> wrote:
> > > > Some list predicates can be used locklessly even with the non-RCU list
> > > > implementations, since they effectively boil down to a test against
> > > > NULL. For example, checking whether or not a list is empty is safe even
> > > > in the presence of a concurrent, tearing write to the list head pointer.
> > > > Similarly, checking whether or not an hlist node has been hashed is safe
> > > > as well.
> > > >
> > > > Annotate these lockless list predicates with data_race() and READ_ONCE()
> > > > so that KCSAN and the compiler are aware of what's going on. The writer
> > > > side can then avoid having to use WRITE_ONCE() in the non-RCU
> > > > implementation.
> > > [...]
> > > >  static inline int list_empty(const struct list_head *head)
> > > >  {
> > > > -       return READ_ONCE(head->next) == head;
> > > > +       return data_race(READ_ONCE(head->next) == head);
> > > >  }
> > > [...]
> > > >  static inline int hlist_unhashed(const struct hlist_node *h)
> > > >  {
> > > > -       return !READ_ONCE(h->pprev);
> > > > +       return data_race(!READ_ONCE(h->pprev));
> > > >  }
> > >
> > > This is probably valid in practice for hlist_unhashed(), which
> > > compares with NULL, as long as the most significant byte of all kernel
> > > pointers is non-zero; but I think list_empty() could realistically
> > > return false positives in the presence of a concurrent tearing store?
> > > This could break the following code pattern:
> > >
> > > /* optimistic lockless check */
> > > if (!list_empty(&some_list)) {
> > >   /* slowpath */
> > >   mutex_lock(&some_mutex);
> > >   list_for_each(tmp, &some_list) {
> > >     ...
> > >   }
> > >   mutex_unlock(&some_mutex);
> > > }
> > >
> > > (I'm not sure whether patterns like this appear commonly though.)
> >
> >
> > I would hope not as the list could go "empty" before the lock is
> > grabbed.  That pattern would be wrong.
> 
> If the list becomes empty in between, the loop just iterates over
> nothing, and the effect is no different from what you'd get if you had
> bailed out before. But sure, you have to be aware that that can
> happen.

Doh, yeah, so it is safe, crazy, but safe :)

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC PATCH 03/21] list: Annotate lockless list primitives with data_race()
  2020-03-24 16:59         ` Greg KH
@ 2020-03-24 18:22           ` Jann Horn
  0 siblings, 0 replies; 58+ messages in thread
From: Jann Horn @ 2020-03-24 18:22 UTC (permalink / raw)
  To: Greg KH, Will Deacon, Peter Zijlstra
  Cc: kernel list, Eric Dumazet, Kees Cook, Maddie Stone, Marco Elver,
	Paul E . McKenney, Thomas Gleixner, kernel-team,
	Kernel Hardening, Ingo Molnar

On Tue, Mar 24, 2020 at 5:59 PM Greg KH <greg@kroah.com> wrote:
> On Tue, Mar 24, 2020 at 05:38:30PM +0100, Jann Horn wrote:
> > On Tue, Mar 24, 2020 at 5:26 PM Greg KH <greg@kroah.com> wrote:
> > > On Tue, Mar 24, 2020 at 05:20:45PM +0100, Jann Horn wrote:
> > > > On Tue, Mar 24, 2020 at 4:37 PM Will Deacon <will@kernel.org> wrote:
> > > > > Some list predicates can be used locklessly even with the non-RCU list
> > > > > implementations, since they effectively boil down to a test against
> > > > > NULL. For example, checking whether or not a list is empty is safe even
> > > > > in the presence of a concurrent, tearing write to the list head pointer.
> > > > > Similarly, checking whether or not an hlist node has been hashed is safe
> > > > > as well.
> > > > >
> > > > > Annotate these lockless list predicates with data_race() and READ_ONCE()
> > > > > so that KCSAN and the compiler are aware of what's going on. The writer
> > > > > side can then avoid having to use WRITE_ONCE() in the non-RCU
> > > > > implementation.
> > > > [...]
> > > > >  static inline int list_empty(const struct list_head *head)
> > > > >  {
> > > > > -       return READ_ONCE(head->next) == head;
> > > > > +       return data_race(READ_ONCE(head->next) == head);
> > > > >  }
> > > > [...]
> > > > >  static inline int hlist_unhashed(const struct hlist_node *h)
> > > > >  {
> > > > > -       return !READ_ONCE(h->pprev);
> > > > > +       return data_race(!READ_ONCE(h->pprev));
> > > > >  }
> > > >
> > > > This is probably valid in practice for hlist_unhashed(), which
> > > > compares with NULL, as long as the most significant byte of all kernel
> > > > pointers is non-zero; but I think list_empty() could realistically
> > > > return false positives in the presence of a concurrent tearing store?
> > > > This could break the following code pattern:
> > > >
> > > > /* optimistic lockless check */
> > > > if (!list_empty(&some_list)) {
> > > >   /* slowpath */
> > > >   mutex_lock(&some_mutex);
> > > >   list_for_each(tmp, &some_list) {
> > > >     ...
> > > >   }
> > > >   mutex_unlock(&some_mutex);
> > > > }
> > > >
> > > > (I'm not sure whether patterns like this appear commonly though.)
> > >
> > >
> > > I would hope not as the list could go "empty" before the lock is
> > > grabbed.  That pattern would be wrong.
> >
> > If the list becomes empty in between, the loop just iterates over
> > nothing, and the effect is no different from what you'd get if you had
> > bailed out before. But sure, you have to be aware that that can
> > happen.
>
> Doh, yeah, so it is safe, crazy, but safe :)

Here's an example of that pattern, I think (which I think is
technically incorrect if what peterz said is accurate?):

/**
 * waitqueue_active -- locklessly test for waiters on the queue
 * @wq_head: the waitqueue to test for waiters
 *
 * returns true if the wait list is not empty
 *
 * NOTE: this function is lockless and requires care, incorrect usage _will_
 * lead to sporadic and non-obvious failure.
 *
 * Use either while holding wait_queue_head::lock or when used for wakeups
 * with an extra smp_mb() like::
 *
 *      CPU0 - waker                    CPU1 - waiter
 *
 *                                      for (;;) {
 *      @cond = true;                     prepare_to_wait(&wq_head, &wait, state);
 *      smp_mb();                         // smp_mb() from set_current_state()
 *      if (waitqueue_active(wq_head))         if (@cond)
 *        wake_up(wq_head);                      break;
 *                                        schedule();
 *                                      }
 *                                      finish_wait(&wq_head, &wait);
 *
 * Because without the explicit smp_mb() it's possible for the
 * waitqueue_active() load to get hoisted over the @cond store such that we'll
 * observe an empty wait list while the waiter might not observe @cond.
 *
 * Also note that this 'optimization' trades a spin_lock() for an smp_mb(),
 * which (when the lock is uncontended) are of roughly equal cost.
 */
static inline int waitqueue_active(struct wait_queue_head *wq_head)
{
        return !list_empty(&wq_head->head);
}

void signalfd_cleanup(struct sighand_struct *sighand)
{
        wait_queue_head_t *wqh = &sighand->signalfd_wqh;
        /*
         * The lockless check can race with remove_wait_queue() in progress,
         * but in this case its caller should run under rcu_read_lock() and
         * sighand_cachep is SLAB_TYPESAFE_BY_RCU, we can safely return.
         */
        if (likely(!waitqueue_active(wqh)))
                return;

        /* wait_queue_entry_t->func(POLLFREE) should do remove_wait_queue() */
        wake_up_poll(wqh, EPOLLHUP | POLLFREE);
}

and __add_wait_queue() just uses plain list_add(&wq_entry->entry, &wq_head->head) under a lock.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC PATCH 17/21] linux/bit_spinlock.h: Include linux/processor.h
  2020-03-24 16:28   ` Greg KH
@ 2020-03-24 21:08     ` Will Deacon
  0 siblings, 0 replies; 58+ messages in thread
From: Will Deacon @ 2020-03-24 21:08 UTC (permalink / raw)
  To: Greg KH
  Cc: linux-kernel, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Paul E . McKenney, Peter Zijlstra, Thomas Gleixner,
	kernel-team, kernel-hardening

On Tue, Mar 24, 2020 at 05:28:20PM +0100, Greg KH wrote:
> On Tue, Mar 24, 2020 at 03:36:39PM +0000, Will Deacon wrote:
> > Needed for cpu_relax().
> 
> Is this needed now, or for the next patch?  And if for next patch, why
> not just add it then?

If the next patch correctly included linux/list_bl.h like it was supposed
to, then yes, it's needed there. I'll sort that out and fold this in for
v2.

Cheers,

Will


* Re: [RFC PATCH 03/21] list: Annotate lockless list primitives with data_race()
  2020-03-24 16:56     ` Jann Horn
@ 2020-03-24 21:32       ` Will Deacon
  2020-03-30 23:13         ` Paul E. McKenney
  0 siblings, 1 reply; 58+ messages in thread
From: Will Deacon @ 2020-03-24 21:32 UTC (permalink / raw)
  To: Jann Horn
  Cc: Peter Zijlstra, kernel list, Eric Dumazet, Kees Cook,
	Maddie Stone, Marco Elver, Paul E . McKenney, Thomas Gleixner,
	kernel-team, Kernel Hardening, Oleg Nesterov

[mutt crashed while I was sending this; apologies if you receive it twice]

On Tue, Mar 24, 2020 at 05:56:15PM +0100, Jann Horn wrote:
> On Tue, Mar 24, 2020 at 5:51 PM Peter Zijlstra <peterz@infradead.org> wrote:
> > On Tue, Mar 24, 2020 at 03:36:25PM +0000, Will Deacon wrote:
> > > diff --git a/include/linux/list.h b/include/linux/list.h
> > > index 4fed5a0f9b77..4d9f5f9ed1a8 100644
> > > --- a/include/linux/list.h
> > > +++ b/include/linux/list.h
> > > @@ -279,7 +279,7 @@ static inline int list_is_last(const struct list_head *list,
> > >   */
> > >  static inline int list_empty(const struct list_head *head)
> > >  {
> > > -     return READ_ONCE(head->next) == head;
> > > +     return data_race(READ_ONCE(head->next) == head);
> > >  }
> >
> > list_empty() isn't lockless safe, that's what we have
> > list_empty_careful() for.
> 
> That thing looks like it could also use some READ_ONCE() sprinkled in...

Crikey, how did I miss that? I need to spend some time understanding the
ordering there.

So it sounds like the KCSAN splats relating to list_empty() and loosely
referred to by 1c97be677f72 ("list: Use WRITE_ONCE() when adding to lists
and hlists") are indicative of real bugs and we should actually restore
list_empty() to its former glory prior to 1658d35ead5d ("list: Use
READ_ONCE() when testing for empty lists"). Alternatively, assuming
list_empty_careful() does what it says on the tin, we could just make that
the default.

Will


* Re: [RFC PATCH 03/21] list: Annotate lockless list primitives with data_race()
  2020-03-24 16:23   ` Marco Elver
@ 2020-03-24 21:33     ` Will Deacon
  2020-03-31 13:10     ` Will Deacon
  1 sibling, 0 replies; 58+ messages in thread
From: Will Deacon @ 2020-03-24 21:33 UTC (permalink / raw)
  To: Marco Elver
  Cc: LKML, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Paul E . McKenney, Peter Zijlstra, Thomas Gleixner, kernel-team,
	kernel-hardening

On Tue, Mar 24, 2020 at 05:23:30PM +0100, Marco Elver wrote:
> On Tue, 24 Mar 2020 at 16:37, Will Deacon <will@kernel.org> wrote:
> >
> > Some list predicates can be used locklessly even with the non-RCU list
> > implementations, since they effectively boil down to a test against
> > NULL. For example, checking whether or not a list is empty is safe even
> > in the presence of a concurrent, tearing write to the list head pointer.
> > Similarly, checking whether or not an hlist node has been hashed is safe
> > as well.
> >
> > Annotate these lockless list predicates with data_race() and READ_ONCE()
> > so that KCSAN and the compiler are aware of what's going on. The writer
> > side can then avoid having to use WRITE_ONCE() in the non-RCU
> > implementation.
> >
> > Cc: Paul E. McKenney <paulmck@kernel.org>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Marco Elver <elver@google.com>
> > Signed-off-by: Will Deacon <will@kernel.org>
> > ---
> >  include/linux/list.h       | 10 +++++-----
> >  include/linux/list_bl.h    |  5 +++--
> >  include/linux/list_nulls.h |  6 +++---
> >  include/linux/llist.h      |  2 +-
> >  4 files changed, 12 insertions(+), 11 deletions(-)
> >
> > diff --git a/include/linux/list.h b/include/linux/list.h
> > index 4fed5a0f9b77..4d9f5f9ed1a8 100644
> > --- a/include/linux/list.h
> > +++ b/include/linux/list.h
> > @@ -279,7 +279,7 @@ static inline int list_is_last(const struct list_head *list,
> >   */
> >  static inline int list_empty(const struct list_head *head)
> >  {
> > -       return READ_ONCE(head->next) == head;
> > +       return data_race(READ_ONCE(head->next) == head);
> 
> Double-marking should never be necessary, at least if you want to make
> KCSAN happy. From what I gather there is an unmarked write somewhere,
> correct? In that case, KCSAN will still complain because if it sees a
> race between this read and the other write, then at least one is still
> plain (the write).
> 
> Then, my suggestion would be to mark the write with data_race() and
> just leave this as a READ_ONCE(). Having a data_race() somewhere only
> makes KCSAN stop reporting the race if the paired access is also
> marked (be it with data_race() or _ONCE, etc.).
> 
> Alternatively, if marking the write is impossible, you can surround
> the access with kcsan_disable_current()/kcsan_enable_current(). Or, as
> a last resort, just leaving as-is is fine too, because KCSAN's default
> config (still) has KCSAN_ASSUME_PLAIN_WRITES_ATOMIC selected.

Right, it looks like this is a bit of a smoking gun and we need to decide
on whether list_empty() is actually usable without synchronisation first.
Based on the outcome of that discussion, I'll update this patch accordingly.

The main thing I want to avoid is marking parts of the non-RCU list
implementation with data_race() or {READ,WRITE}_ONCE() because that's a
sure-fire way to hide real bugs.

Cheers,

Will


* Re: [RFC PATCH 01/21] list: Remove hlist_unhashed_lockless()
  2020-03-24 15:36 ` [RFC PATCH 01/21] list: Remove hlist_unhashed_lockless() Will Deacon
  2020-03-24 16:27   ` Greg KH
@ 2020-03-30 23:05   ` Paul E. McKenney
  1 sibling, 0 replies; 58+ messages in thread
From: Paul E. McKenney @ 2020-03-30 23:05 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Peter Zijlstra, Thomas Gleixner, kernel-team,
	kernel-hardening

On Tue, Mar 24, 2020 at 03:36:23PM +0000, Will Deacon wrote:
> Commit c54a2744497d ("list: Add hlist_unhashed_lockless()") added a
> "lockless" variant of hlist_unhashed() that uses READ_ONCE() to access
> the 'pprev' pointer of the 'hlist_node', the intention being that this
> could be used by 'timer_pending()' to silence a KCSAN warning. As well
> as forgetting to add the caller, the patch also sprinkles
> {READ,WRITE}_ONCE() invocations over the standard (i.e. non-RCU) hlist
> code, which is undesirable for a number of reasons:
> 
>   1. It gives the misleading impression that the non-RCU hlist code is
>      in some way lock-free (despite the notable absence of any memory
>      barriers!) and silences KCSAN in such cases.
> 
>   2. It unnecessarily penalises code generation for non-RCU hlist users
> 
>   3. It makes it difficult to introduce list integrity checks because
>      of the possibility of concurrent callers.
> 
> Retain the {READ,WRITE}_ONCE() invocations for the RCU hlist code, but
> remove them from the non-RCU implementation. Remove the unused
> 'hlist_unhashed_lockless()' function entirely and add the READ_ONCE()
> to hlist_unhashed(), as we do already for hlist_empty().
> 
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Will Deacon <will@kernel.org>

No objection, however 90c018942c2b ("timer: Use hlist_unhashed_lockless()
in timer_pending()") in -tip must change from hlist_unhashed_lockless()
to hlist_unhashed().  Easy fix, though, so:

Acked-by: Paul E. McKenney <paulmck@kernel.org>

> ---
>  include/linux/list.h | 33 ++++++++++-----------------------
>  1 file changed, 10 insertions(+), 23 deletions(-)
> 
> diff --git a/include/linux/list.h b/include/linux/list.h
> index 884216db3246..4fed5a0f9b77 100644
> --- a/include/linux/list.h
> +++ b/include/linux/list.h
> @@ -777,19 +777,6 @@ static inline void INIT_HLIST_NODE(struct hlist_node *h)
>   * node in unhashed state, but hlist_nulls_del() does not.
>   */
>  static inline int hlist_unhashed(const struct hlist_node *h)
> -{
> -	return !h->pprev;
> -}
> -
> -/**
> - * hlist_unhashed_lockless - Version of hlist_unhashed for lockless use
> - * @h: Node to be checked
> - *
> - * This variant of hlist_unhashed() must be used in lockless contexts
> - * to avoid potential load-tearing.  The READ_ONCE() is paired with the
> - * various WRITE_ONCE() in hlist helpers that are defined below.
> - */
> -static inline int hlist_unhashed_lockless(const struct hlist_node *h)
>  {
>  	return !READ_ONCE(h->pprev);
>  }
> @@ -852,11 +839,11 @@ static inline void hlist_del_init(struct hlist_node *n)
>  static inline void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
>  {
>  	struct hlist_node *first = h->first;
> -	WRITE_ONCE(n->next, first);
> +	n->next = first;
>  	if (first)
> -		WRITE_ONCE(first->pprev, &n->next);
> +		first->pprev = &n->next;
>  	WRITE_ONCE(h->first, n);
> -	WRITE_ONCE(n->pprev, &h->first);
> +	n->pprev = &h->first;
>  }
>  
>  /**
> @@ -867,9 +854,9 @@ static inline void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
>  static inline void hlist_add_before(struct hlist_node *n,
>  				    struct hlist_node *next)
>  {
> -	WRITE_ONCE(n->pprev, next->pprev);
> -	WRITE_ONCE(n->next, next);
> -	WRITE_ONCE(next->pprev, &n->next);
> +	n->pprev = next->pprev;
> +	n->next = next;
> +	next->pprev = &n->next;
>  	WRITE_ONCE(*(n->pprev), n);
>  }
>  
> @@ -881,12 +868,12 @@ static inline void hlist_add_before(struct hlist_node *n,
>  static inline void hlist_add_behind(struct hlist_node *n,
>  				    struct hlist_node *prev)
>  {
> -	WRITE_ONCE(n->next, prev->next);
> -	WRITE_ONCE(prev->next, n);
> -	WRITE_ONCE(n->pprev, &prev->next);
> +	n->next = prev->next;
> +	prev->next = n;
> +	n->pprev = &prev->next;
>  
>  	if (n->next)
> -		WRITE_ONCE(n->next->pprev, &n->next);
> +		n->next->pprev  = &n->next;
>  }
>  
>  /**
> -- 
> 2.20.1
> 


* Re: [RFC PATCH 02/21] list: Remove hlist_nulls_unhashed_lockless()
  2020-03-24 15:36 ` [RFC PATCH 02/21] list: Remove hlist_nulls_unhashed_lockless() Will Deacon
  2020-03-24 16:27   ` Greg KH
@ 2020-03-30 23:07   ` Paul E. McKenney
  1 sibling, 0 replies; 58+ messages in thread
From: Paul E. McKenney @ 2020-03-30 23:07 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Peter Zijlstra, Thomas Gleixner, kernel-team,
	kernel-hardening

On Tue, Mar 24, 2020 at 03:36:24PM +0000, Will Deacon wrote:
> Commit 02b99b38f3d9 ("rcu: Add a hlist_nulls_unhashed_lockless() function")
> introduced the (as yet unused) hlist_nulls_unhashed_lockless() function
> to avoid KCSAN reports when an RCU reader checks the 'hashed' status
> of an 'hlist_nulls' concurrently undergoing modification.
> 
> Remove the unused function and add a READ_ONCE() to hlist_nulls_unhashed(),
> just like we do already for hlist_nulls_empty().
> 
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Will Deacon <will@kernel.org>

Well, I guess that the added docbook comment might be seen as worthwhile.

Acked-by: Paul E. McKenney <paulmck@kernel.org>

> ---
>  include/linux/list_nulls.h | 14 --------------
>  1 file changed, 14 deletions(-)
> 
> diff --git a/include/linux/list_nulls.h b/include/linux/list_nulls.h
> index fa6e8471bd22..3a9ff01e9a11 100644
> --- a/include/linux/list_nulls.h
> +++ b/include/linux/list_nulls.h
> @@ -65,20 +65,6 @@ static inline unsigned long get_nulls_value(const struct hlist_nulls_node *ptr)
>   * but hlist_nulls_del() does not.
>   */
>  static inline int hlist_nulls_unhashed(const struct hlist_nulls_node *h)
> -{
> -	return !h->pprev;
> -}
> -
> -/**
> - * hlist_nulls_unhashed_lockless - Has node been removed and reinitialized?
> - * @h: Node to be checked
> - *
> - * Not that not all removal functions will leave a node in unhashed state.
> - * For example, hlist_del_init_rcu() leaves the node in unhashed state,
> - * but hlist_nulls_del() does not.  Unlike hlist_nulls_unhashed(), this
> - * function may be used locklessly.
> - */
> -static inline int hlist_nulls_unhashed_lockless(const struct hlist_nulls_node *h)
>  {
>  	return !READ_ONCE(h->pprev);
>  }
> -- 
> 2.20.1
> 


* Re: [RFC PATCH 03/21] list: Annotate lockless list primitives with data_race()
  2020-03-24 21:32       ` Will Deacon
@ 2020-03-30 23:13         ` Paul E. McKenney
  2020-04-24 17:39           ` Will Deacon
  0 siblings, 1 reply; 58+ messages in thread
From: Paul E. McKenney @ 2020-03-30 23:13 UTC (permalink / raw)
  To: Will Deacon
  Cc: Jann Horn, Peter Zijlstra, kernel list, Eric Dumazet, Kees Cook,
	Maddie Stone, Marco Elver, Thomas Gleixner, kernel-team,
	Kernel Hardening, Oleg Nesterov

On Tue, Mar 24, 2020 at 09:32:01PM +0000, Will Deacon wrote:
> [mutt crashed while I was sending this; apologies if you receive it twice]
> 
> On Tue, Mar 24, 2020 at 05:56:15PM +0100, Jann Horn wrote:
> > On Tue, Mar 24, 2020 at 5:51 PM Peter Zijlstra <peterz@infradead.org> wrote:
> > > On Tue, Mar 24, 2020 at 03:36:25PM +0000, Will Deacon wrote:
> > > > diff --git a/include/linux/list.h b/include/linux/list.h
> > > > index 4fed5a0f9b77..4d9f5f9ed1a8 100644
> > > > --- a/include/linux/list.h
> > > > +++ b/include/linux/list.h
> > > > @@ -279,7 +279,7 @@ static inline int list_is_last(const struct list_head *list,
> > > >   */
> > > >  static inline int list_empty(const struct list_head *head)
> > > >  {
> > > > -     return READ_ONCE(head->next) == head;
> > > > +     return data_race(READ_ONCE(head->next) == head);
> > > >  }
> > >
> > > list_empty() isn't lockless safe, that's what we have
> > > list_empty_careful() for.
> > 
> > That thing looks like it could also use some READ_ONCE() sprinkled in...
> 
> Crikey, how did I miss that? I need to spend some time understanding the
> ordering there.
> 
> So it sounds like the KCSAN splats relating to list_empty() and loosely
> referred to by 1c97be677f72 ("list: Use WRITE_ONCE() when adding to lists
> and hlists") are indicative of real bugs and we should actually restore
> list_empty() to its former glory prior to 1658d35ead5d ("list: Use
> READ_ONCE() when testing for empty lists"). Alternatively, assuming
> list_empty_careful() does what it says on the tin, we could just make that
> the default.

The list_empty_careful() function (suitably annotated) returns false if
the list is non-empty, including when it is in the process of becoming
either empty or non-empty.  It would be fine for the lockless use cases
I have come across.

							Thanx, Paul


* Re: [RFC PATCH 05/21] list: Comment missing WRITE_ONCE() in __list_del()
  2020-03-24 15:36 ` [RFC PATCH 05/21] list: Comment missing WRITE_ONCE() in __list_del() Will Deacon
@ 2020-03-30 23:14   ` Paul E. McKenney
  0 siblings, 0 replies; 58+ messages in thread
From: Paul E. McKenney @ 2020-03-30 23:14 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Peter Zijlstra, Thomas Gleixner, kernel-team,
	kernel-hardening

On Tue, Mar 24, 2020 at 03:36:27PM +0000, Will Deacon wrote:
> Although __list_del() is called from the RCU list implementation, it
> omits WRITE_ONCE() when updating the 'prev' pointer for the 'next' node.
> This is reasonable because concurrent RCU readers only access the 'next'
> pointers.
> 
> Although it's obvious after reading the code, add a comment.
> 
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Will Deacon <will@kernel.org>

OK, I will take the easy one.  ;-)

Reviewed-by: Paul E. McKenney <paulmck@kernel.org>

> ---
>  include/linux/list.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/include/linux/list.h b/include/linux/list.h
> index 4d9f5f9ed1a8..ec1f780d1449 100644
> --- a/include/linux/list.h
> +++ b/include/linux/list.h
> @@ -109,6 +109,7 @@ static inline void list_add_tail(struct list_head *new, struct list_head *head)
>   */
>  static inline void __list_del(struct list_head * prev, struct list_head * next)
>  {
> +	/* RCU readers don't read the 'prev' pointer */
>  	next->prev = prev;
>  	WRITE_ONCE(prev->next, next);
>  }
> -- 
> 2.20.1
> 


* Re: [RFC PATCH 07/21] Revert "list: Use WRITE_ONCE() when adding to lists and hlists"
  2020-03-24 15:36 ` [RFC PATCH 07/21] Revert "list: Use WRITE_ONCE() when adding to lists and hlists" Will Deacon
@ 2020-03-30 23:19   ` Paul E. McKenney
  0 siblings, 0 replies; 58+ messages in thread
From: Paul E. McKenney @ 2020-03-30 23:19 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Peter Zijlstra, Thomas Gleixner, kernel-team,
	kernel-hardening

On Tue, Mar 24, 2020 at 03:36:29PM +0000, Will Deacon wrote:
> This reverts commit 1c97be677f72b3c338312aecd36d8fff20322f32.
> 
> There is no need to use WRITE_ONCE() for the non-RCU list and hlist
> implementations.
> 
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Will Deacon <will@kernel.org>

Which means that the lockless uses of hlist_empty() will also need
attention, correct?

							Thanx, Paul

> ---
>  include/linux/list.h | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/list.h b/include/linux/list.h
> index ec1f780d1449..c7331c259792 100644
> --- a/include/linux/list.h
> +++ b/include/linux/list.h
> @@ -70,7 +70,7 @@ static inline void __list_add(struct list_head *new,
>  	next->prev = new;
>  	new->next = next;
>  	new->prev = prev;
> -	WRITE_ONCE(prev->next, new);
> +	prev->next = new;
>  }
>  
>  /**
> @@ -843,7 +843,7 @@ static inline void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
>  	n->next = first;
>  	if (first)
>  		first->pprev = &n->next;
> -	WRITE_ONCE(h->first, n);
> +	h->first = n;
>  	n->pprev = &h->first;
>  }
>  
> @@ -858,7 +858,7 @@ static inline void hlist_add_before(struct hlist_node *n,
>  	n->pprev = next->pprev;
>  	n->next = next;
>  	next->pprev = &n->next;
> -	WRITE_ONCE(*(n->pprev), n);
> +	*(n->pprev) = n;
>  }
>  
>  /**
> -- 
> 2.20.1
> 


* Re: [RFC PATCH 06/21] list: Remove superfluous WRITE_ONCE() from hlist_nulls implementation
  2020-03-24 15:36 ` [RFC PATCH 06/21] list: Remove superfluous WRITE_ONCE() from hlist_nulls implementation Will Deacon
@ 2020-03-30 23:21   ` Paul E. McKenney
  0 siblings, 0 replies; 58+ messages in thread
From: Paul E. McKenney @ 2020-03-30 23:21 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Peter Zijlstra, Thomas Gleixner, kernel-team,
	kernel-hardening

On Tue, Mar 24, 2020 at 03:36:28PM +0000, Will Deacon wrote:
> Commit 860c8802ace1 ("rcu: Use WRITE_ONCE() for assignments to ->pprev
> for hlist_nulls") added WRITE_ONCE() invocations to hlist_nulls_add_head()
> and hlist_nulls_del().
> 
> Since these functions should not ordinarily run concurrently with other
> list accessors, restore the plain C assignments so that KCSAN can yell
> if a data race occurs.
> 
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Will Deacon <will@kernel.org>

And this means that the lockless uses of hlist_nulls_empty() need
attention, correct?

							Thanx, Paul

> ---
>  include/linux/list_nulls.h | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/list_nulls.h b/include/linux/list_nulls.h
> index fa51a801bf32..fd154ceb5b0d 100644
> --- a/include/linux/list_nulls.h
> +++ b/include/linux/list_nulls.h
> @@ -80,10 +80,10 @@ static inline void hlist_nulls_add_head(struct hlist_nulls_node *n,
>  	struct hlist_nulls_node *first = h->first;
>  
>  	n->next = first;
> -	WRITE_ONCE(n->pprev, &h->first);
> +	n->pprev = &h->first;
>  	h->first = n;
>  	if (!is_a_nulls(first))
> -		WRITE_ONCE(first->pprev, &n->next);
> +		first->pprev = &n->next;
>  }
>  
>  static inline void __hlist_nulls_del(struct hlist_nulls_node *n)
> @@ -99,7 +99,7 @@ static inline void __hlist_nulls_del(struct hlist_nulls_node *n)
>  static inline void hlist_nulls_del(struct hlist_nulls_node *n)
>  {
>  	__hlist_nulls_del(n);
> -	WRITE_ONCE(n->pprev, LIST_POISON2);
> +	n->pprev = LIST_POISON2;
>  }
>  
>  /**
> -- 
> 2.20.1
> 


* Re: [RFC PATCH 08/21] Revert "list: Use WRITE_ONCE() when initializing list_head structures"
  2020-03-24 15:36 ` [RFC PATCH 08/21] Revert "list: Use WRITE_ONCE() when initializing list_head structures" Will Deacon
@ 2020-03-30 23:25   ` Paul E. McKenney
  2020-03-31 13:11     ` Will Deacon
  0 siblings, 1 reply; 58+ messages in thread
From: Paul E. McKenney @ 2020-03-30 23:25 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Peter Zijlstra, Thomas Gleixner, kernel-team,
	kernel-hardening

On Tue, Mar 24, 2020 at 03:36:30PM +0000, Will Deacon wrote:
> This reverts commit 2f073848c3cc8aff2655ab7c46d8c0de90cf4e50.
> 
> There is no need to use WRITE_ONCE() to initialise a non-RCU 'list_head'.
> 
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Will Deacon <will@kernel.org>

And attention to lockless uses of list_empty() here, correct?

Depending on the outcome of discussions on 3/21, I should have added in
all three cases.

							Thanx, Paul

> ---
>  include/linux/list.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/linux/list.h b/include/linux/list.h
> index c7331c259792..b86a3f9465d4 100644
> --- a/include/linux/list.h
> +++ b/include/linux/list.h
> @@ -32,7 +32,7 @@
>   */
>  static inline void INIT_LIST_HEAD(struct list_head *list)
>  {
> -	WRITE_ONCE(list->next, list);
> +	list->next = list;
>  	list->prev = list;
>  }
>  
> -- 
> 2.20.1
> 


* Re: [RFC PATCH 09/21] list: Remove unnecessary WRITE_ONCE() from hlist_bl_add_before()
  2020-03-24 15:36 ` [RFC PATCH 09/21] list: Remove unnecessary WRITE_ONCE() from hlist_bl_add_before() Will Deacon
@ 2020-03-30 23:30   ` Paul E. McKenney
  2020-03-31 12:37     ` Will Deacon
  0 siblings, 1 reply; 58+ messages in thread
From: Paul E. McKenney @ 2020-03-30 23:30 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Peter Zijlstra, Thomas Gleixner, kernel-team,
	kernel-hardening

On Tue, Mar 24, 2020 at 03:36:31PM +0000, Will Deacon wrote:
> There is no need to use WRITE_ONCE() when inserting into a non-RCU
> protected list.
> 
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Will Deacon <will@kernel.org>

OK, I will bite...  Why "unsigned long" instead of "uintptr_t"?

Not that it matters in the context of the Linux kernel, so:

Reviewed-by: Paul E. McKenney <paulmck@kernel.org>

							Thanx, Paul

> ---
>  include/linux/list_bl.h | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/list_bl.h b/include/linux/list_bl.h
> index 1804fdb84dda..c93e7e3aa8fc 100644
> --- a/include/linux/list_bl.h
> +++ b/include/linux/list_bl.h
> @@ -91,15 +91,15 @@ static inline void hlist_bl_add_before(struct hlist_bl_node *n,
>  				       struct hlist_bl_node *next)
>  {
>  	struct hlist_bl_node **pprev = next->pprev;
> +	unsigned long val;
>  
>  	n->pprev = pprev;
>  	n->next = next;
>  	next->pprev = &n->next;
>  
>  	/* pprev may be `first`, so be careful not to lose the lock bit */
> -	WRITE_ONCE(*pprev,
> -		   (struct hlist_bl_node *)
> -			((uintptr_t)n | ((uintptr_t)*pprev & LIST_BL_LOCKMASK)));
> +	val = (unsigned long)n | ((unsigned long)*pprev & LIST_BL_LOCKMASK);
> +	*pprev = (struct hlist_bl_node *)val;
>  }
>  
>  static inline void hlist_bl_add_behind(struct hlist_bl_node *n,
> -- 
> 2.20.1
> 


* Re: [RFC PATCH 12/21] list: Poison ->next pointer for non-RCU deletion of 'hlist_nulls_node'
  2020-03-24 15:36 ` [RFC PATCH 12/21] list: Poison ->next pointer for non-RCU deletion of 'hlist_nulls_node' Will Deacon
@ 2020-03-30 23:32   ` Paul E. McKenney
  0 siblings, 0 replies; 58+ messages in thread
From: Paul E. McKenney @ 2020-03-30 23:32 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Peter Zijlstra, Thomas Gleixner, kernel-team,
	kernel-hardening

On Tue, Mar 24, 2020 at 03:36:34PM +0000, Will Deacon wrote:
> hlist_nulls_del() is used for non-RCU deletion of an 'hlist_nulls_node'
> and so we can safely poison the ->next pointer without having to worry
> about concurrent readers, just like we do for other non-RCU list
> deletion primitives.
> 
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Will Deacon <will@kernel.org>

Agreed, any code having difficulty with this change should use instead
hlist_nulls_del_rcu().

Reviewed-by: Paul E. McKenney <paulmck@kernel.org>

> ---
>  include/linux/list_nulls.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/include/linux/list_nulls.h b/include/linux/list_nulls.h
> index fd154ceb5b0d..48f33ae16381 100644
> --- a/include/linux/list_nulls.h
> +++ b/include/linux/list_nulls.h
> @@ -99,6 +99,7 @@ static inline void __hlist_nulls_del(struct hlist_nulls_node *n)
>  static inline void hlist_nulls_del(struct hlist_nulls_node *n)
>  {
>  	__hlist_nulls_del(n);
> +	n->next = LIST_POISON1;
>  	n->pprev = LIST_POISON2;
>  }
>  
> -- 
> 2.20.1
> 


* Re: [RFC PATCH 09/21] list: Remove unnecessary WRITE_ONCE() from hlist_bl_add_before()
  2020-03-30 23:30   ` Paul E. McKenney
@ 2020-03-31 12:37     ` Will Deacon
  0 siblings, 0 replies; 58+ messages in thread
From: Will Deacon @ 2020-03-31 12:37 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Peter Zijlstra, Thomas Gleixner, kernel-team,
	kernel-hardening

On Mon, Mar 30, 2020 at 04:30:20PM -0700, Paul E. McKenney wrote:
> On Tue, Mar 24, 2020 at 03:36:31PM +0000, Will Deacon wrote:
> > There is no need to use WRITE_ONCE() when inserting into a non-RCU
> > protected list.
> > 
> > Cc: Paul E. McKenney <paulmck@kernel.org>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Signed-off-by: Will Deacon <will@kernel.org>
> 
> OK, I will bite...  Why "unsigned long" instead of "uintptr_t"?

I just did that for consistency with the rest of the file, e.g.
hlist_bl_first(), hlist_bl_set_first(), hlist_bl_empty() and
__hlist_bl_del() all cast to 'unsigned long', yet only
hlist_bl_add_before() was using 'uintptr_t'.

> Not that it matters in the context of the Linux kernel, so:
> 
> Reviewed-by: Paul E. McKenney <paulmck@kernel.org>

Thanks, Paul!

Will


* Re: [RFC PATCH 03/21] list: Annotate lockless list primitives with data_race()
  2020-03-24 16:23   ` Marco Elver
  2020-03-24 21:33     ` Will Deacon
@ 2020-03-31 13:10     ` Will Deacon
  2020-04-01  6:34       ` Marco Elver
  2020-05-08 13:46       ` [tip: locking/kcsan] kcsan: Change data_race() to no longer require marking racing accesses tip-bot2 for Marco Elver
  1 sibling, 2 replies; 58+ messages in thread
From: Will Deacon @ 2020-03-31 13:10 UTC (permalink / raw)
  To: Marco Elver
  Cc: LKML, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Paul E . McKenney, Peter Zijlstra, Thomas Gleixner, kernel-team,
	kernel-hardening

On Tue, Mar 24, 2020 at 05:23:30PM +0100, Marco Elver wrote:
> On Tue, 24 Mar 2020 at 16:37, Will Deacon <will@kernel.org> wrote:
> > Some list predicates can be used locklessly even with the non-RCU list
> > implementations, since they effectively boil down to a test against
> > NULL. For example, checking whether or not a list is empty is safe even
> > in the presence of a concurrent, tearing write to the list head pointer.
> > Similarly, checking whether or not an hlist node has been hashed is safe
> > as well.
> >
> > Annotate these lockless list predicates with data_race() and READ_ONCE()
> > so that KCSAN and the compiler are aware of what's going on. The writer
> > side can then avoid having to use WRITE_ONCE() in the non-RCU
> > implementation.
> >
> > Cc: Paul E. McKenney <paulmck@kernel.org>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Marco Elver <elver@google.com>
> > Signed-off-by: Will Deacon <will@kernel.org>
> > ---
> >  include/linux/list.h       | 10 +++++-----
> >  include/linux/list_bl.h    |  5 +++--
> >  include/linux/list_nulls.h |  6 +++---
> >  include/linux/llist.h      |  2 +-
> >  4 files changed, 12 insertions(+), 11 deletions(-)
> >
> > diff --git a/include/linux/list.h b/include/linux/list.h
> > index 4fed5a0f9b77..4d9f5f9ed1a8 100644
> > --- a/include/linux/list.h
> > +++ b/include/linux/list.h
> > @@ -279,7 +279,7 @@ static inline int list_is_last(const struct list_head *list,
> >   */
> >  static inline int list_empty(const struct list_head *head)
> >  {
> > -       return READ_ONCE(head->next) == head;
> > +       return data_race(READ_ONCE(head->next) == head);
> 
> Double-marking should never be necessary, at least if you want to make
> KCSAN happy. From what I gather there is an unmarked write somewhere,
> correct? In that case, KCSAN will still complain because if it sees a
> race between this read and the other write, then at least one is still
> plain (the write).

Ok, then I should drop the data_race() annotation and stick to READ_ONCE(),
I think (but see below).

> Then, my suggestion would be to mark the write with data_race() and
> just leave this as a READ_ONCE(). Having a data_race() somewhere only
> makes KCSAN stop reporting the race if the paired access is also
> marked (be it with data_race() or _ONCE, etc.).

The problem with taking that approach is that it ends up with much of the
list implementation annotated with either WRITE_ONCE() or data_race(),
meaning that concurrent, racy list operations will no longer be reported
by KCSAN. I think that's a pretty big deal and I'm strongly against
annotating the internals of library code such as this because it means
that buggy callers will largely go undetected.

The situation we have here is that some calls, e.g. hlist_empty() are
safe even in the presence of a racy write and I'd like to suppress KCSAN
reports without annotating the writes at all.

> Alternatively, if marking the write is impossible, you can surround
> the access with kcsan_disable_current()/kcsan_enable_current(). Or, as
> a last resort, just leaving as-is is fine too, because KCSAN's default
> config (still) has KCSAN_ASSUME_PLAIN_WRITES_ATOMIC selected.

Hmm, I suppose some bright spark will want to change the default at some
point though, no? ;) I'll look at using
kcsan_disable_current()/kcsan_enable_current(), thanks.

Will

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC PATCH 08/21] Revert "list: Use WRITE_ONCE() when initializing list_head structures"
  2020-03-30 23:25   ` Paul E. McKenney
@ 2020-03-31 13:11     ` Will Deacon
  2020-03-31 13:47       ` Paul E. McKenney
  0 siblings, 1 reply; 58+ messages in thread
From: Will Deacon @ 2020-03-31 13:11 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: linux-kernel, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Peter Zijlstra, Thomas Gleixner, kernel-team,
	kernel-hardening

On Mon, Mar 30, 2020 at 04:25:05PM -0700, Paul E. McKenney wrote:
> On Tue, Mar 24, 2020 at 03:36:30PM +0000, Will Deacon wrote:
> > This reverts commit 2f073848c3cc8aff2655ab7c46d8c0de90cf4e50.
> > 
> > There is no need to use WRITE_ONCE() to initialise a non-RCU 'list_head'.
> > 
> > Cc: Paul E. McKenney <paulmck@kernel.org>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Signed-off-by: Will Deacon <will@kernel.org>
> 
> And attention to lockless uses of list_empty() here, correct?
> 
> Depending on the outcome of discussions on 3/21, I should have added in
> all three cases.

Yes, patch 3 is where this will get sorted. It looks like we'll have to
disable KCSAN around the READ_ONCE() over there, but I also need to finish
wrapping my head around list_empty_careful() because I'm deeply suspicious!

Will

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC PATCH 08/21] Revert "list: Use WRITE_ONCE() when initializing list_head structures"
  2020-03-31 13:11     ` Will Deacon
@ 2020-03-31 13:47       ` Paul E. McKenney
  0 siblings, 0 replies; 58+ messages in thread
From: Paul E. McKenney @ 2020-03-31 13:47 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-kernel, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Marco Elver, Peter Zijlstra, Thomas Gleixner, kernel-team,
	kernel-hardening

On Tue, Mar 31, 2020 at 02:11:54PM +0100, Will Deacon wrote:
> On Mon, Mar 30, 2020 at 04:25:05PM -0700, Paul E. McKenney wrote:
> > On Tue, Mar 24, 2020 at 03:36:30PM +0000, Will Deacon wrote:
> > > This reverts commit 2f073848c3cc8aff2655ab7c46d8c0de90cf4e50.
> > > 
> > > There is no need to use WRITE_ONCE() to initialise a non-RCU 'list_head'.
> > > 
> > > Cc: Paul E. McKenney <paulmck@kernel.org>
> > > Cc: Peter Zijlstra <peterz@infradead.org>
> > > Signed-off-by: Will Deacon <will@kernel.org>
> > 
> > And attention to lockless uses of list_empty() here, correct?
> > 
> > Depending on the outcome of discussions on 3/21, I should have added in
> > all three cases.
> 
> Yes, patch 3 is where this will get sorted. It looks like we'll have to
> disable KCSAN around the READ_ONCE() over there, but I also need to finish
> wrapping my head around list_empty_careful() because I'm deeply suspicious!

At the very least, it does have the disadvantage of touching an additional
cache line, and up to two additional cache lines in the non-empty case.  :-(

							Thanx, Paul

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC PATCH 03/21] list: Annotate lockless list primitives with data_race()
  2020-03-31 13:10     ` Will Deacon
@ 2020-04-01  6:34       ` Marco Elver
  2020-04-01  8:40         ` Will Deacon
  2020-05-08 13:46       ` [tip: locking/kcsan] kcsan: Change data_race() to no longer require marking racing accesses tip-bot2 for Marco Elver
  1 sibling, 1 reply; 58+ messages in thread
From: Marco Elver @ 2020-04-01  6:34 UTC (permalink / raw)
  To: Will Deacon
  Cc: LKML, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Paul E . McKenney, Peter Zijlstra, Thomas Gleixner, kernel-team,
	kernel-hardening

On Tue, 31 Mar 2020 at 15:10, Will Deacon <will@kernel.org> wrote:
>
> On Tue, Mar 24, 2020 at 05:23:30PM +0100, Marco Elver wrote:
> > On Tue, 24 Mar 2020 at 16:37, Will Deacon <will@kernel.org> wrote:
> > > Some list predicates can be used locklessly even with the non-RCU list
> > > implementations, since they effectively boil down to a test against
> > > NULL. For example, checking whether or not a list is empty is safe even
> > > in the presence of a concurrent, tearing write to the list head pointer.
> > > Similarly, checking whether or not an hlist node has been hashed is safe
> > > as well.
> > >
> > > Annotate these lockless list predicates with data_race() and READ_ONCE()
> > > so that KCSAN and the compiler are aware of what's going on. The writer
> > > side can then avoid having to use WRITE_ONCE() in the non-RCU
> > > implementation.
> > >
> > > Cc: Paul E. McKenney <paulmck@kernel.org>
> > > Cc: Peter Zijlstra <peterz@infradead.org>
> > > Cc: Marco Elver <elver@google.com>
> > > Signed-off-by: Will Deacon <will@kernel.org>
> > > ---
> > >  include/linux/list.h       | 10 +++++-----
> > >  include/linux/list_bl.h    |  5 +++--
> > >  include/linux/list_nulls.h |  6 +++---
> > >  include/linux/llist.h      |  2 +-
> > >  4 files changed, 12 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/include/linux/list.h b/include/linux/list.h
> > > index 4fed5a0f9b77..4d9f5f9ed1a8 100644
> > > --- a/include/linux/list.h
> > > +++ b/include/linux/list.h
> > > @@ -279,7 +279,7 @@ static inline int list_is_last(const struct list_head *list,
> > >   */
> > >  static inline int list_empty(const struct list_head *head)
> > >  {
> > > -       return READ_ONCE(head->next) == head;
> > > +       return data_race(READ_ONCE(head->next) == head);
> >
> > Double-marking should never be necessary, at least if you want to make
> > KCSAN happy. From what I gather there is an unmarked write somewhere,
> > correct? In that case, KCSAN will still complain because if it sees a
> > race between this read and the other write, then at least one is still
> > plain (the write).
>
> Ok, then I should drop the data_race() annotation and stick to READ_ONCE(),
> I think (but see below).
>
> > Then, my suggestion would be to mark the write with data_race() and
> > just leave this as a READ_ONCE(). Having a data_race() somewhere only
> > makes KCSAN stop reporting the race if the paired access is also
> > marked (be it with data_race() or _ONCE, etc.).
>
> The problem with taking that approach is that it ends up with much of the
> list implementation annotated with either WRITE_ONCE() or data_race(),
> meaning that concurrent, racy list operations will no longer be reported
> by KCSAN. I think that's a pretty big deal and I'm strongly against
> annotating the internals of library code such as this because it means
> that buggy callers will largely go undetected.
>
> The situation we have here is that some calls, e.g. hlist_empty() are
> safe even in the presence of a racy write and I'd like to suppress KCSAN
> reports without annotating the writes at all.
>
> > Alternatively, if marking the write is impossible, you can surround
> > the access with kcsan_disable_current()/kcsan_enable_current(). Or, as
> > a last resort, just leaving as-is is fine too, because KCSAN's default
> > config (still) has KCSAN_ASSUME_PLAIN_WRITES_ATOMIC selected.
>
> Hmm, I suppose some bright spark will want to change the default at some
> point though, no? ;) I'll look at using
> kcsan_disable_current()/kcsan_enable_current(), thanks.

I think this will come up again (it did already come up in some other
patch I reviewed, and Paul also mentioned it), so it seems best to
change data_race() to match the intuitive semantics of just completely
ignoring the access marked with it. I.e. marking accesses racing with
accesses marked with data_race() is now optional:
  https://lkml.kernel.org/r/20200331193233.15180-1-elver@google.com

In which case, the original patch you had here works just fine.

Thanks,
-- Marco

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC PATCH 03/21] list: Annotate lockless list primitives with data_race()
  2020-04-01  6:34       ` Marco Elver
@ 2020-04-01  8:40         ` Will Deacon
  0 siblings, 0 replies; 58+ messages in thread
From: Will Deacon @ 2020-04-01  8:40 UTC (permalink / raw)
  To: Marco Elver
  Cc: LKML, Eric Dumazet, Jann Horn, Kees Cook, Maddie Stone,
	Paul E . McKenney, Peter Zijlstra, Thomas Gleixner, kernel-team,
	kernel-hardening

On Wed, Apr 01, 2020 at 08:34:36AM +0200, Marco Elver wrote:
> On Tue, 31 Mar 2020 at 15:10, Will Deacon <will@kernel.org> wrote:
> > On Tue, Mar 24, 2020 at 05:23:30PM +0100, Marco Elver wrote:
> > > Then, my suggestion would be to mark the write with data_race() and
> > > just leave this as a READ_ONCE(). Having a data_race() somewhere only
> > > makes KCSAN stop reporting the race if the paired access is also
> > > marked (be it with data_race() or _ONCE, etc.).
> >
> > The problem with taking that approach is that it ends up with much of the
> > list implementation annotated with either WRITE_ONCE() or data_race(),
> > meaning that concurrent, racy list operations will no longer be reported
> > by KCSAN. I think that's a pretty big deal and I'm strongly against
> > annotating the internals of library code such as this because it means
> > that buggy callers will largely go undetected.
> >
> > The situation we have here is that some calls, e.g. hlist_empty() are
> > safe even in the presence of a racy write and I'd like to suppress KCSAN
> > reports without annotating the writes at all.
> >
> > > Alternatively, if marking the write is impossible, you can surround
> > > the access with kcsan_disable_current()/kcsan_enable_current(). Or, as
> > > a last resort, just leaving as-is is fine too, because KCSAN's default
> > > config (still) has KCSAN_ASSUME_PLAIN_WRITES_ATOMIC selected.
> >
> > Hmm, I suppose some bright spark will want to change the default at some
> > point though, no? ;) I'll look at using
> > kcsan_disable_current()/kcsan_enable_current(), thanks.
> 
> I think this will come up again (it did already come up in some other
> patch I reviewed, and Paul also mentioned it), so it seems best to
> change data_race() to match the intuitive semantics of just completely
> ignoring the access marked with it. I.e. marking accesses racing with
> accesses marked with data_race() is now optional:
>   https://lkml.kernel.org/r/20200331193233.15180-1-elver@google.com

/me goes look. Thanks!

> In which case, the original patch you had here works just fine.

Ah yes, so now data_race(READ_ONCE(...)) does make sense as a combination.
It's tempting to wrap that up as an accessor, but actually forcing people to
spell it out might not be a bad thing after all.

Will

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC PATCH 03/21] list: Annotate lockless list primitives with data_race()
  2020-03-30 23:13         ` Paul E. McKenney
@ 2020-04-24 17:39           ` Will Deacon
  2020-04-27 19:24             ` Paul E. McKenney
  0 siblings, 1 reply; 58+ messages in thread
From: Will Deacon @ 2020-04-24 17:39 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Jann Horn, Peter Zijlstra, kernel list, Eric Dumazet, Kees Cook,
	Maddie Stone, Marco Elver, Thomas Gleixner, kernel-team,
	Kernel Hardening, Oleg Nesterov

On Mon, Mar 30, 2020 at 04:13:15PM -0700, Paul E. McKenney wrote:
> On Tue, Mar 24, 2020 at 09:32:01PM +0000, Will Deacon wrote:
> > [mutt crashed while I was sending this; apologies if you receive it twice]
> > 
> > On Tue, Mar 24, 2020 at 05:56:15PM +0100, Jann Horn wrote:
> > > On Tue, Mar 24, 2020 at 5:51 PM Peter Zijlstra <peterz@infradead.org> wrote:
> > > > On Tue, Mar 24, 2020 at 03:36:25PM +0000, Will Deacon wrote:
> > > > > diff --git a/include/linux/list.h b/include/linux/list.h
> > > > > index 4fed5a0f9b77..4d9f5f9ed1a8 100644
> > > > > --- a/include/linux/list.h
> > > > > +++ b/include/linux/list.h
> > > > > @@ -279,7 +279,7 @@ static inline int list_is_last(const struct list_head *list,
> > > > >   */
> > > > >  static inline int list_empty(const struct list_head *head)
> > > > >  {
> > > > > -     return READ_ONCE(head->next) == head;
> > > > > +     return data_race(READ_ONCE(head->next) == head);
> > > > >  }
> > > >
> > > > list_empty() isn't lockless safe, that's what we have
> > > > list_empty_careful() for.
> > > 
> > > That thing looks like it could also use some READ_ONCE() sprinkled in...
> > 
> > Crikey, how did I miss that? I need to spend some time understanding the
> > ordering there.
> > 
> > So it sounds like the KCSAN splats relating to list_empty() and loosely
> > referred to by 1c97be677f72 ("list: Use WRITE_ONCE() when adding to lists
> > and hlists") are indicative of real bugs and we should actually restore
> > list_empty() to its former glory prior to 1658d35ead5d ("list: Use
> > READ_ONCE() when testing for empty lists"). Alternatively, assuming
> > list_empty_careful() does what it says on the tin, we could just make that
> > the default.
> 
> The list_empty_careful() function (suitably annotated) returns false if
> the list is non-empty, including when it is in the process of becoming
> either empty or non-empty.  It would be fine for the lockless use cases
> I have come across.

Hmm, I had a look at the implementation and I'm not at all convinced that
it's correct. First of all, the comment above it states:

 * NOTE: using list_empty_careful() without synchronization
 * can only be safe if the only activity that can happen
 * to the list entry is list_del_init(). Eg. it cannot be used
 * if another CPU could re-list_add() it.

but it seems that people disregard this note and instead use it as a
general-purpose lockless test, taking a lock and rechecking if it returns
non-empty. It would also mean we'd have to keep the WRITE_ONCE() in
INIT_LIST_HEAD, which is something that I've been trying to remove.

In the face of something like a concurrent list_add(); list_add_tail()
sequence, then the tearing writes to the head->{prev,next} pointers could
cause list_empty_careful() to indicate that the list is momentarily empty.

I've started looking at whether we can use a NULL next pointer to indicate
an empty list, which might allow us to kill the __list_del_clearprev() hack
at the same time, but I've not found enough time to really get my teeth into
it yet.

Will

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [RFC PATCH 03/21] list: Annotate lockless list primitives with data_race()
  2020-04-24 17:39           ` Will Deacon
@ 2020-04-27 19:24             ` Paul E. McKenney
  0 siblings, 0 replies; 58+ messages in thread
From: Paul E. McKenney @ 2020-04-27 19:24 UTC (permalink / raw)
  To: Will Deacon
  Cc: Jann Horn, Peter Zijlstra, kernel list, Eric Dumazet, Kees Cook,
	Maddie Stone, Marco Elver, Thomas Gleixner, kernel-team,
	Kernel Hardening, Oleg Nesterov

On Fri, Apr 24, 2020 at 06:39:33PM +0100, Will Deacon wrote:
> On Mon, Mar 30, 2020 at 04:13:15PM -0700, Paul E. McKenney wrote:
> > On Tue, Mar 24, 2020 at 09:32:01PM +0000, Will Deacon wrote:
> > > [mutt crashed while I was sending this; apologies if you receive it twice]
> > > 
> > > On Tue, Mar 24, 2020 at 05:56:15PM +0100, Jann Horn wrote:
> > > > On Tue, Mar 24, 2020 at 5:51 PM Peter Zijlstra <peterz@infradead.org> wrote:
> > > > > On Tue, Mar 24, 2020 at 03:36:25PM +0000, Will Deacon wrote:
> > > > > > diff --git a/include/linux/list.h b/include/linux/list.h
> > > > > > index 4fed5a0f9b77..4d9f5f9ed1a8 100644
> > > > > > --- a/include/linux/list.h
> > > > > > +++ b/include/linux/list.h
> > > > > > @@ -279,7 +279,7 @@ static inline int list_is_last(const struct list_head *list,
> > > > > >   */
> > > > > >  static inline int list_empty(const struct list_head *head)
> > > > > >  {
> > > > > > -     return READ_ONCE(head->next) == head;
> > > > > > +     return data_race(READ_ONCE(head->next) == head);
> > > > > >  }
> > > > >
> > > > > list_empty() isn't lockless safe, that's what we have
> > > > > list_empty_careful() for.
> > > > 
> > > > That thing looks like it could also use some READ_ONCE() sprinkled in...
> > > 
> > > Crikey, how did I miss that? I need to spend some time understanding the
> > > ordering there.
> > > 
> > > So it sounds like the KCSAN splats relating to list_empty() and loosely
> > > referred to by 1c97be677f72 ("list: Use WRITE_ONCE() when adding to lists
> > > and hlists") are indicative of real bugs and we should actually restore
> > > list_empty() to its former glory prior to 1658d35ead5d ("list: Use
> > > READ_ONCE() when testing for empty lists"). Alternatively, assuming
> > > list_empty_careful() does what it says on the tin, we could just make that
> > > the default.
> > 
> > The list_empty_careful() function (suitably annotated) returns false if
> > the list is non-empty, including when it is in the process of becoming
> > either empty or non-empty.  It would be fine for the lockless use cases
> > I have come across.
> 
> Hmm, I had a look at the implementation and I'm not at all convinced that
> it's correct. First of all, the comment above it states:
> 
>  * NOTE: using list_empty_careful() without synchronization
>  * can only be safe if the only activity that can happen
>  * to the list entry is list_del_init(). Eg. it cannot be used
>  * if another CPU could re-list_add() it.

Huh.  This thing is unchanged since 2.6.12-rc2, back in 2005:

static inline int list_empty_careful(const struct list_head *head)
{
	struct list_head *next = head->next;
	return (next == head) && (next == head->prev);
}

I can imagine compiler value-caching optimizations that would cause
trouble, for example, if a previous obsolete fetch from head->prev was
lying around in a register, causing this function to say "not empty" when
it was in fact empty.  Of course, if obsolete values for both head->next
and head->prev were lying around, pretty much anything could happen.

> but it seems that people disregard this note and instead use it as a
> general-purpose lockless test, taking a lock and rechecking if it returns
> non-empty. It would also mean we'd have to keep the WRITE_ONCE() in
> INIT_LIST_HEAD, which is something that I've been trying to remove.
> 
> In the face of something like a concurrent list_add(); list_add_tail()
> sequence, then the tearing writes to the head->{prev,next} pointers could
> cause list_empty_careful() to indicate that the list is momentarily empty.
> 
> I've started looking at whether we can use a NULL next pointer to indicate
> an empty list, which might allow us to kill the __list_del_clearprev() hack
> at the same time, but I've not found enough time to really get my teeth into
> it yet.

In the delete-only case, I kind of get it, other than the potential for
optimization.  Once the list becomes empty, it will forever remain empty.
And the additional test of head->prev avoids this returning true while the
deletion is half done (again, aside from the potential for optimization).

If insertions are allowed, the thing I haven't quite figured out yet is
what is being gained by the additional check of head->prev.  After all,
if updates are not excluded, the return value can become obsolete
immediately anyhow.  Yes, it could be used as a heuristic, but it could
report empty immediately before a list_add(), so there would need to
be either a careful wakeup protocol or a periodic poll of the list.

Or am I missing a trick here?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 58+ messages in thread

* [tip: locking/kcsan] kcsan: Change data_race() to no longer require marking racing accesses
  2020-03-31 13:10     ` Will Deacon
  2020-04-01  6:34       ` Marco Elver
@ 2020-05-08 13:46       ` tip-bot2 for Marco Elver
  1 sibling, 0 replies; 58+ messages in thread
From: tip-bot2 for Marco Elver @ 2020-05-08 13:46 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Paul E. McKenney, Will Deacon, Qian Cai, Marco Elver, x86, LKML

The following commit has been merged into the locking/kcsan branch of tip:

Commit-ID:     d071e91361bbfef524ae8abf7e560fb294d0ad64
Gitweb:        https://git.kernel.org/tip/d071e91361bbfef524ae8abf7e560fb294d0ad64
Author:        Marco Elver <elver@google.com>
AuthorDate:    Tue, 31 Mar 2020 21:32:33 +02:00
Committer:     Paul E. McKenney <paulmck@kernel.org>
CommitterDate: Mon, 13 Apr 2020 17:18:15 -07:00

kcsan: Change data_race() to no longer require marking racing accesses

Thus far, accesses marked with data_race() would still require the
racing access to be marked in some way (be it with READ_ONCE(),
WRITE_ONCE(), or data_race() itself), as otherwise KCSAN would still
report a data race.  This requirement, however, seems to be unintuitive,
and some valid use-cases demand *not* marking other accesses, as it
might hide more serious bugs (e.g. diagnostic reads).

Therefore, this commit changes data_race() to no longer require marking
racing accesses (although it's still recommended if possible).

The alternative would have been introducing another variant of
data_race(), however, since usage of data_race() already needs to be
carefully reasoned about, distinguishing between these cases likely adds
more complexity in the wrong place.

Link: https://lkml.kernel.org/r/20200331131002.GA30975@willie-the-truck
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Qian Cai <cai@lca.pw>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 include/linux/compiler.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index f504ede..1729bd1 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -326,9 +326,9 @@ unsigned long read_word_at_a_time(const void *addr)
 #define data_race(expr)                                                        \
 	({                                                                     \
 		typeof(({ expr; })) __val;                                     \
-		kcsan_nestable_atomic_begin();                                 \
+		kcsan_disable_current();                                       \
 		__val = ({ expr; });                                           \
-		kcsan_nestable_atomic_end();                                   \
+		kcsan_enable_current();                                        \
 		__val;                                                         \
 	})
 #else

^ permalink raw reply related	[flat|nested] 58+ messages in thread

end of thread, other threads:[~2020-05-08 13:47 UTC | newest]

Thread overview: 58+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-24 15:36 [RFC PATCH 00/21] Improve list integrity checking Will Deacon
2020-03-24 15:36 ` [RFC PATCH 01/21] list: Remove hlist_unhashed_lockless() Will Deacon
2020-03-24 16:27   ` Greg KH
2020-03-30 23:05   ` Paul E. McKenney
2020-03-24 15:36 ` [RFC PATCH 02/21] list: Remove hlist_nulls_unhashed_lockless() Will Deacon
2020-03-24 16:27   ` Greg KH
2020-03-30 23:07   ` Paul E. McKenney
2020-03-24 15:36 ` [RFC PATCH 03/21] list: Annotate lockless list primitives with data_race() Will Deacon
2020-03-24 16:20   ` Jann Horn
2020-03-24 16:26     ` Greg KH
2020-03-24 16:38       ` Jann Horn
2020-03-24 16:59         ` Greg KH
2020-03-24 18:22           ` Jann Horn
2020-03-24 16:23   ` Marco Elver
2020-03-24 21:33     ` Will Deacon
2020-03-31 13:10     ` Will Deacon
2020-04-01  6:34       ` Marco Elver
2020-04-01  8:40         ` Will Deacon
2020-05-08 13:46       ` [tip: locking/kcsan] kcsan: Change data_race() to no longer require marking racing accesses tip-bot2 for Marco Elver
2020-03-24 16:51   ` [RFC PATCH 03/21] list: Annotate lockless list primitives with data_race() Peter Zijlstra
2020-03-24 16:56     ` Jann Horn
2020-03-24 21:32       ` Will Deacon
2020-03-30 23:13         ` Paul E. McKenney
2020-04-24 17:39           ` Will Deacon
2020-04-27 19:24             ` Paul E. McKenney
2020-03-24 15:36 ` [RFC PATCH 04/21] timers: Use hlist_unhashed() instead of open-coding in timer_pending() Will Deacon
2020-03-24 16:30   ` Greg KH
2020-03-24 15:36 ` [RFC PATCH 05/21] list: Comment missing WRITE_ONCE() in __list_del() Will Deacon
2020-03-30 23:14   ` Paul E. McKenney
2020-03-24 15:36 ` [RFC PATCH 06/21] list: Remove superfluous WRITE_ONCE() from hlist_nulls implementation Will Deacon
2020-03-30 23:21   ` Paul E. McKenney
2020-03-24 15:36 ` [RFC PATCH 07/21] Revert "list: Use WRITE_ONCE() when adding to lists and hlists" Will Deacon
2020-03-30 23:19   ` Paul E. McKenney
2020-03-24 15:36 ` [RFC PATCH 08/21] Revert "list: Use WRITE_ONCE() when initializing list_head structures" Will Deacon
2020-03-30 23:25   ` Paul E. McKenney
2020-03-31 13:11     ` Will Deacon
2020-03-31 13:47       ` Paul E. McKenney
2020-03-24 15:36 ` [RFC PATCH 09/21] list: Remove unnecessary WRITE_ONCE() from hlist_bl_add_before() Will Deacon
2020-03-30 23:30   ` Paul E. McKenney
2020-03-31 12:37     ` Will Deacon
2020-03-24 15:36 ` [RFC PATCH 10/21] kernel-hacking: Make DEBUG_{LIST,PLIST,SG,NOTIFIERS} non-debug options Will Deacon
2020-03-24 16:42   ` Greg KH
2020-03-24 15:36 ` [RFC PATCH 11/21] list: Add integrity checking to hlist implementation Will Deacon
2020-03-24 15:36 ` [RFC PATCH 12/21] list: Poison ->next pointer for non-RCU deletion of 'hlist_nulls_node' Will Deacon
2020-03-30 23:32   ` Paul E. McKenney
2020-03-24 15:36 ` [RFC PATCH 13/21] list: Add integrity checking to hlist_nulls implementation Will Deacon
2020-03-24 15:36 ` [RFC PATCH 14/21] plist: Use CHECK_DATA_CORRUPTION instead of explicit {BUG,WARN}_ON() Will Deacon
2020-03-24 16:42   ` Greg KH
2020-03-24 15:36 ` [RFC PATCH 15/21] list_bl: Use CHECK_DATA_CORRUPTION instead of custom BUG_ON() wrapper Will Deacon
2020-03-24 15:36 ` [RFC PATCH 16/21] list_bl: Extend integrity checking in deletion routines Will Deacon
2020-03-24 15:36 ` [RFC PATCH 17/21] linux/bit_spinlock.h: Include linux/processor.h Will Deacon
2020-03-24 16:28   ` Greg KH
2020-03-24 21:08     ` Will Deacon
2020-03-24 15:36 ` [RFC PATCH 18/21] list_bl: Move integrity checking out of line Will Deacon
2020-03-24 15:36 ` [RFC PATCH 19/21] list_bl: Extend integrity checking to cover the same cases as 'hlist' Will Deacon
2020-03-24 15:36 ` [RFC PATCH 20/21] list: Format CHECK_DATA_CORRUPTION error messages consistently Will Deacon
2020-03-24 16:40   ` Greg KH
2020-03-24 15:36 ` [RFC PATCH 21/21] lkdtm: Extend list corruption checks Will Deacon
