linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/6] lockdep enhancements for bcachefs
@ 2023-11-22 23:51 Kent Overstreet
  2023-11-22 23:51 ` [PATCH 1/6] locking/lockdep: lock_class_is_held() Kent Overstreet
                   ` (5 more replies)
  0 siblings, 6 replies; 9+ messages in thread
From: Kent Overstreet @ 2023-11-22 23:51 UTC (permalink / raw)
  To: linux-kernel, linux-bcachefs, peterz
  Cc: boqun.feng, longman, will, mingo, Kent Overstreet

This adds:
 - lockdep_set_no_check_recursion(), which allows lockdep checking of
   bcachefs's six locks

bcachefs has a cycle detector for deadlock avoidance; there is no lock
ordering for lockdep to check w.r.t. other btree node locks. However, we
do want held btree node locks to be tracked, and lock ordering to be
checked w.r.t. other lock types.

 - lock_class_is_held(), which we use for some extra locking assertions

We want to assert, e.g., that we're not blocking on IO with btree node
locks held.
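The held-lock walk behind that assertion can be sketched in plain
userspace C. The structures and names below (struct task, the 48-entry
cap, model_lock_class_is_held()) are simplified stand-ins for the
lockdep data structures, not the kernel API:

```c
struct lock_class_key { int dummy; };

struct held_lock {
	const struct lock_class_key *key;	/* class of this held lock */
};

struct task {
	struct held_lock held_locks[48];	/* lockdep similarly caps lock depth */
	unsigned int lockdep_depth;		/* number of entries in use */
};

/* Walk the task's held-lock stack, as the real helper walks
 * current->held_locks, and report whether any held lock is of this class. */
static int model_lock_class_is_held(const struct task *task,
				    const struct lock_class_key *key)
{
	for (unsigned int i = 0; i < task->lockdep_depth; i++)
		if (task->held_locks[i].key == key)
			return 1;
	return 0;
}
```

A caller about to block on IO would then do something like
BUG_ON(model_lock_class_is_held(task, &btree_node_key)).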

Kent Overstreet (6):
  locking/lockdep: lock_class_is_held()
  locking/lockdep: lockdep_set_no_check_recursion()
  bcachefs: Assert that btree node locks aren't being leaked
  bcachefs: Use lock_class_is_held() for btree locking asserts
  bcachefs: Check for btree locks held on transaction init
  bcachefs: Switch to lockdep_set_no_check_recursion()

 fs/bcachefs/btree_gc.c        |  3 +++
 fs/bcachefs/btree_iter.c      |  2 ++
 fs/bcachefs/btree_locking.c   | 14 ++++++++---
 fs/bcachefs/btree_types.h     |  1 +
 include/linux/lockdep.h       | 10 ++++++++
 include/linux/lockdep_types.h |  2 +-
 kernel/locking/lockdep.c      | 46 +++++++++++++++++++++++++++++++++++
 7 files changed, 73 insertions(+), 5 deletions(-)

-- 
2.42.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/6] locking/lockdep: lock_class_is_held()
  2023-11-22 23:51 [PATCH 0/6] lockdep enhancements for bcachefs Kent Overstreet
@ 2023-11-22 23:51 ` Kent Overstreet
  2023-11-22 23:51 ` [PATCH 2/6] locking/lockdep: lockdep_set_no_check_recursion() Kent Overstreet
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Kent Overstreet @ 2023-11-22 23:51 UTC (permalink / raw)
  To: linux-kernel, linux-bcachefs, peterz
  Cc: boqun.feng, longman, will, mingo, Kent Overstreet

This patch adds lock_class_is_held(), which can be used to assert that a
lock of a particular class is not held.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
---
 include/linux/lockdep.h  |  4 ++++
 kernel/locking/lockdep.c | 20 ++++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index dc2844b071c2..fc86557d2a21 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -344,6 +344,8 @@ extern void lock_unpin_lock(struct lockdep_map *lock, struct pin_cookie);
 #define lockdep_repin_lock(l,c)	lock_repin_lock(&(l)->dep_map, (c))
 #define lockdep_unpin_lock(l,c)	lock_unpin_lock(&(l)->dep_map, (c))
 
+int lock_class_is_held(struct lock_class_key *key);
+
 /*
  * Must use lock_map_aquire_try() with override maps to avoid
  * lockdep thinking they participate in the block chain.
@@ -442,6 +444,8 @@ extern int lockdep_is_held(const void *);
 #define lockdep_repin_lock(l, c)		do { (void)(l); (void)(c); } while (0)
 #define lockdep_unpin_lock(l, c)		do { (void)(l); (void)(c); } while (0)
 
+static inline int lock_class_is_held(struct lock_class_key *key) { return 0; }
+
 #define DEFINE_WAIT_OVERRIDE_MAP(_name, _wait_type)	\
 	struct lockdep_map __maybe_unused _name = {}
 
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index e85b5ad3e206..38924d90da85 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -6599,6 +6599,26 @@ void debug_check_no_locks_held(void)
 }
 EXPORT_SYMBOL_GPL(debug_check_no_locks_held);
 
+#ifdef CONFIG_LOCKDEP
+int lock_class_is_held(struct lock_class_key *key)
+{
+	struct task_struct *curr = current;
+	struct held_lock *hlock;
+
+	if (unlikely(!debug_locks))
+		return 0;
+
+	for (hlock = curr->held_locks;
+	     hlock < curr->held_locks + curr->lockdep_depth;
+	     hlock++)
+		if (hlock->instance->key == key)
+			return 1;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(lock_class_is_held);
+#endif
+
 #ifdef __KERNEL__
 void debug_show_all_locks(void)
 {
-- 
2.42.0



* [PATCH 2/6] locking/lockdep: lockdep_set_no_check_recursion()
  2023-11-22 23:51 [PATCH 0/6] lockdep enhancements for bcachefs Kent Overstreet
  2023-11-22 23:51 ` [PATCH 1/6] locking/lockdep: lock_class_is_held() Kent Overstreet
@ 2023-11-22 23:51 ` Kent Overstreet
  2023-11-24  9:58   ` Peter Zijlstra
  2023-11-22 23:51 ` [PATCH 3/6] bcachefs: Assert that btree node locks aren't being leaked Kent Overstreet
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 9+ messages in thread
From: Kent Overstreet @ 2023-11-22 23:51 UTC (permalink / raw)
  To: linux-kernel, linux-bcachefs, peterz
  Cc: boqun.feng, longman, will, mingo, Kent Overstreet

This adds a method to tell lockdep not to check lock ordering within a
lock class, while still checking lock ordering w.r.t. other lock types.

This is for bcachefs, where for btree node locks we have our own
deadlock avoidance strategy w.r.t. other btree node locks (cycle
detection), but we still want lockdep to check lock ordering w.r.t.
other lock types.
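The intended semantics can be sketched as a predicate over lock classes.
struct lock_class here is a one-field stand-in and would_report_recursion()
is illustrative; it is not lockdep's actual check_deadlock() logic:

```c
struct lock_class {
	int no_check_recursion;	/* set by lockdep_set_no_check_recursion() */
};

/* Should lockdep flag acquiring @next while @prev is held, considering
 * only the same-class (recursion) rule?  Cross-class ordering is a
 * separate check and is unaffected by the flag. */
static int would_report_recursion(const struct lock_class *prev,
				  const struct lock_class *next)
{
	if (prev != next)
		return 0;	/* different classes: handled by ordering checks */
	return !prev->no_check_recursion;
}
```

With the flag set, same-class nesting is left to bcachefs's cycle
detector; everything cross-class still goes through the normal checks.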

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
---
 include/linux/lockdep.h       |  6 ++++++
 include/linux/lockdep_types.h |  2 +-
 kernel/locking/lockdep.c      | 26 ++++++++++++++++++++++++++
 3 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index fc86557d2a21..1bc7eb7cb548 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -700,4 +700,10 @@ lockdep_rcu_suspicious(const char *file, const int line, const char *s)
 }
 #endif
 
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+void lockdep_set_no_check_recursion(struct lockdep_map *map);
+#else
+static inline void lockdep_set_no_check_recursion(struct lockdep_map *map) {}
+#endif
+
 #endif /* __LINUX_LOCKDEP_H */
diff --git a/include/linux/lockdep_types.h b/include/linux/lockdep_types.h
index 2ebc323d345a..aa6bddac2a64 100644
--- a/include/linux/lockdep_types.h
+++ b/include/linux/lockdep_types.h
@@ -137,7 +137,7 @@ struct lock_class {
 	u8				wait_type_inner;
 	u8				wait_type_outer;
 	u8				lock_type;
-	/* u8				hole; */
+	u8				no_check_recursion;
 
 #ifdef CONFIG_LOCK_STAT
 	unsigned long			contention_point[LOCKSTAT_POINTS];
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 38924d90da85..50a0a1ea960a 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -3048,6 +3048,9 @@ check_deadlock(struct task_struct *curr, struct held_lock *next)
 
 		class = hlock_class(prev);
 
+		if (class->no_check_recursion)
+			continue;
+
 		if (class->cmp_fn &&
 		    class->cmp_fn(prev->instance, next->instance) < 0)
 			continue;
@@ -3113,6 +3116,10 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
 		return 2;
 	}
 
+	if (hlock_class(prev) == hlock_class(next) &&
+	    hlock_class(prev)->no_check_recursion)
+		return 2;
+
 	if (prev->class_idx == next->class_idx) {
 		struct lock_class *class = hlock_class(prev);
 
@@ -6732,3 +6739,22 @@ void lockdep_rcu_suspicious(const char *file, const int line, const char *s)
 	warn_rcu_exit(rcu);
 }
 EXPORT_SYMBOL_GPL(lockdep_rcu_suspicious);
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+void lockdep_set_no_check_recursion(struct lockdep_map *lock)
+{
+	struct lock_class *class = lock->class_cache[0];
+	unsigned long flags;
+
+	raw_local_irq_save(flags);
+	lockdep_recursion_inc();
+
+	if (!class)
+		class = register_lock_class(lock, 0, 0);
+	if (class)
+		class->no_check_recursion = true;
+	lockdep_recursion_finish();
+	raw_local_irq_restore(flags);
+}
+EXPORT_SYMBOL_GPL(lockdep_set_no_check_recursion);
+#endif
-- 
2.42.0



* [PATCH 3/6] bcachefs: Assert that btree node locks aren't being leaked
  2023-11-22 23:51 [PATCH 0/6] lockdep enhancements for bcachefs Kent Overstreet
  2023-11-22 23:51 ` [PATCH 1/6] locking/lockdep: lock_class_is_held() Kent Overstreet
  2023-11-22 23:51 ` [PATCH 2/6] locking/lockdep: lockdep_set_no_check_recursion() Kent Overstreet
@ 2023-11-22 23:51 ` Kent Overstreet
  2023-11-22 23:51 ` [PATCH 4/6] bcachefs: Use lock_class_is_held() for btree locking asserts Kent Overstreet
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 9+ messages in thread
From: Kent Overstreet @ 2023-11-22 23:51 UTC (permalink / raw)
  To: linux-kernel, linux-bcachefs, peterz
  Cc: boqun.feng, longman, will, mingo, Kent Overstreet, Kent Overstreet

This asserts (when lockdep is enabled) that btree locks aren't held when
exiting a btree_trans.

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
---
 fs/bcachefs/btree_gc.c      | 3 +++
 fs/bcachefs/btree_locking.c | 7 +++++++
 fs/bcachefs/btree_types.h   | 1 +
 3 files changed, 11 insertions(+)

diff --git a/fs/bcachefs/btree_gc.c b/fs/bcachefs/btree_gc.c
index 136df94e9f84..7e5d52f8ffd7 100644
--- a/fs/bcachefs/btree_gc.c
+++ b/fs/bcachefs/btree_gc.c
@@ -1087,6 +1087,9 @@ static int bch2_gc_btrees(struct bch_fs *c, bool initial, bool metadata_only)
 	unsigned i;
 	int ret = 0;
 
+	if (initial)
+		trans->is_initial_gc = true;
+
 	for (i = 0; i < BTREE_ID_NR; i++)
 		ids[i] = i;
 	bubble_sort(ids, BTREE_ID_NR, btree_id_gc_phase_cmp);
diff --git a/fs/bcachefs/btree_locking.c b/fs/bcachefs/btree_locking.c
index 6039278121dc..308c891ad3ca 100644
--- a/fs/bcachefs/btree_locking.c
+++ b/fs/bcachefs/btree_locking.c
@@ -751,6 +751,13 @@ void bch2_trans_unlock(struct btree_trans *trans)
 
 	trans_for_each_path(trans, path)
 		__bch2_btree_path_unlock(trans, path);
+
+	/*
+	 * bch2_gc_btree_init_recurse() doesn't use btree iterators for walking
+	 * btree nodes, it implements its own walking:
+	 */
+	if (!trans->is_initial_gc)
+		bch2_assert_btree_nodes_not_locked();
 }
 
 void bch2_trans_unlock_long(struct btree_trans *trans)
diff --git a/fs/bcachefs/btree_types.h b/fs/bcachefs/btree_types.h
index ca7526603d06..2326bceb34f8 100644
--- a/fs/bcachefs/btree_types.h
+++ b/fs/bcachefs/btree_types.h
@@ -399,6 +399,7 @@ struct btree_trans {
 	bool			memory_allocation_failure:1;
 	bool			journal_transaction_names:1;
 	bool			journal_replay_not_finished:1;
+	bool			is_initial_gc:1;
 	bool			notrace_relock_fail:1;
 	bool			write_locked:1;
 	enum bch_errcode	restarted:16;
-- 
2.42.0



* [PATCH 4/6] bcachefs: Use lock_class_is_held() for btree locking asserts
  2023-11-22 23:51 [PATCH 0/6] lockdep enhancements for bcachefs Kent Overstreet
                   ` (2 preceding siblings ...)
  2023-11-22 23:51 ` [PATCH 3/6] bcachefs: Assert that btree node locks aren't being leaked Kent Overstreet
@ 2023-11-22 23:51 ` Kent Overstreet
  2023-11-22 23:51 ` [PATCH 5/6] bcachefs: Check for btree locks held on transaction init Kent Overstreet
  2023-11-22 23:51 ` [PATCH 6/6] bcachefs: Switch to lockdep_set_no_check_recursion() Kent Overstreet
  5 siblings, 0 replies; 9+ messages in thread
From: Kent Overstreet @ 2023-11-22 23:51 UTC (permalink / raw)
  To: linux-kernel, linux-bcachefs, peterz
  Cc: boqun.feng, longman, will, mingo, Kent Overstreet

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
---
 fs/bcachefs/btree_locking.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/fs/bcachefs/btree_locking.c b/fs/bcachefs/btree_locking.c
index 308c891ad3ca..ef84b0423deb 100644
--- a/fs/bcachefs/btree_locking.c
+++ b/fs/bcachefs/btree_locking.c
@@ -16,10 +16,7 @@ void bch2_btree_lock_init(struct btree_bkey_cached_common *b,
 #ifdef CONFIG_LOCKDEP
 void bch2_assert_btree_nodes_not_locked(void)
 {
-#if 0
-	//Re-enable when lock_class_is_held() is merged:
 	BUG_ON(lock_class_is_held(&bch2_btree_node_lock_key));
-#endif
 }
 #endif
 
-- 
2.42.0



* [PATCH 5/6] bcachefs: Check for btree locks held on transaction init
  2023-11-22 23:51 [PATCH 0/6] lockdep enhancements for bcachefs Kent Overstreet
                   ` (3 preceding siblings ...)
  2023-11-22 23:51 ` [PATCH 4/6] bcachefs: Use lock_class_is_held() for btree locking asserts Kent Overstreet
@ 2023-11-22 23:51 ` Kent Overstreet
  2023-11-22 23:51 ` [PATCH 6/6] bcachefs: Switch to lockdep_set_no_check_recursion() Kent Overstreet
  5 siblings, 0 replies; 9+ messages in thread
From: Kent Overstreet @ 2023-11-22 23:51 UTC (permalink / raw)
  To: linux-kernel, linux-bcachefs, peterz
  Cc: boqun.feng, longman, will, mingo, Kent Overstreet, Kent Overstreet

From: Kent Overstreet <kent.overstreet@gmail.com>

Ideally we would disallow multiple btree_trans being initialized within
the same process (and hopefully we will at some point; the stack usage
is excessive), but for now there are a couple of places where we do
this:

 - transaction commit error path -> journal reclaim -> btree key cache
   flush
 - move data path -> do_pending_writes -> write path -> bucket
   allocation (in the near future, when bucket allocation switches to
   using a freespace btree)

To avoid deadlock, the first btree_trans must have been unlocked with
bch2_trans_unlock() before the second btree_trans is used; this patch
adds an assertion to bch2_trans_init() that verifies this has been done
when lockdep is enabled.
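The hazard this guards against can be shown with a toy single-task model
(not bcachefs code; the names are illustrative): if the outer btree_trans
still holds a node lock that the inner btree_trans needs, the task waits
on itself forever, whereas unlocking the outer trans first lets the inner
acquisition succeed.

```c
struct node_lock {
	int held_by_this_task;
};

/* Try to take @l for the inner trans: 0 on success, -1 on self-deadlock. */
static int inner_trans_lock(struct node_lock *l)
{
	if (l->held_by_this_task)
		return -1;	/* only we could release it, but we'd be blocked */
	l->held_by_this_task = 1;
	return 0;
}

static void outer_trans_unlock(struct node_lock *l)
{
	l->held_by_this_task = 0;
}
```

The assertion on transaction init catches the first case before it can
turn into a real hang.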

Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
---
 fs/bcachefs/btree_iter.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/bcachefs/btree_iter.c b/fs/bcachefs/btree_iter.c
index 1d734e297eb4..a52fd206f822 100644
--- a/fs/bcachefs/btree_iter.c
+++ b/fs/bcachefs/btree_iter.c
@@ -2854,6 +2854,8 @@ struct btree_trans *__bch2_trans_get(struct bch_fs *c, unsigned fn_idx)
 	struct btree_trans *trans;
 	struct btree_transaction_stats *s;
 
+	bch2_assert_btree_nodes_not_locked();
+
 	trans = bch2_trans_alloc(c);
 
 	memset(trans, 0, sizeof(*trans));
-- 
2.42.0



* [PATCH 6/6] bcachefs: Switch to lockdep_set_no_check_recursion()
  2023-11-22 23:51 [PATCH 0/6] lockdep enhancements for bcachefs Kent Overstreet
                   ` (4 preceding siblings ...)
  2023-11-22 23:51 ` [PATCH 5/6] bcachefs: Check for btree locks held on transaction init Kent Overstreet
@ 2023-11-22 23:51 ` Kent Overstreet
  5 siblings, 0 replies; 9+ messages in thread
From: Kent Overstreet @ 2023-11-22 23:51 UTC (permalink / raw)
  To: linux-kernel, linux-bcachefs, peterz
  Cc: boqun.feng, longman, will, mingo, Kent Overstreet

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
---
 fs/bcachefs/btree_locking.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/bcachefs/btree_locking.c b/fs/bcachefs/btree_locking.c
index ef84b0423deb..59c57c585a4c 100644
--- a/fs/bcachefs/btree_locking.c
+++ b/fs/bcachefs/btree_locking.c
@@ -10,7 +10,9 @@ void bch2_btree_lock_init(struct btree_bkey_cached_common *b,
 			  enum six_lock_init_flags flags)
 {
 	__six_lock_init(&b->lock, "b->c.lock", &bch2_btree_node_lock_key, flags);
-	lockdep_set_novalidate_class(&b->lock);
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	lockdep_set_no_check_recursion(&b->lock.dep_map);
+#endif
 }
 
 #ifdef CONFIG_LOCKDEP
-- 
2.42.0



* Re: [PATCH 2/6] locking/lockdep: lockdep_set_no_check_recursion()
  2023-11-22 23:51 ` [PATCH 2/6] locking/lockdep: lockdep_set_no_check_recursion() Kent Overstreet
@ 2023-11-24  9:58   ` Peter Zijlstra
  2023-11-24 23:29     ` Kent Overstreet
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2023-11-24  9:58 UTC (permalink / raw)
  To: Kent Overstreet
  Cc: linux-kernel, linux-bcachefs, boqun.feng, longman, will, mingo

On Wed, Nov 22, 2023 at 06:51:09PM -0500, Kent Overstreet wrote:
> This adds a method to tell lockdep not to check lock ordering within a
> lock class - but to still check lock ordering w.r.t. other lock types.
> 
> This is for bcachefs, where for btree node locks we have our own
> deadlock avoidance strategy w.r.t. other btree node locks (cycle
> detection), but we still want lockdep to check lock ordering w.r.t.
> other lock types.

So earlier you added custom sort order.

Additionally there is the wound-wait mutexes that also have semantics
similar to what you describe.

Explain why you can't use either your own added feature or the existing
infrastructure to solve this?


* Re: [PATCH 2/6] locking/lockdep: lockdep_set_no_check_recursion()
  2023-11-24  9:58   ` Peter Zijlstra
@ 2023-11-24 23:29     ` Kent Overstreet
  0 siblings, 0 replies; 9+ messages in thread
From: Kent Overstreet @ 2023-11-24 23:29 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, linux-bcachefs, boqun.feng, longman, will, mingo

On Fri, Nov 24, 2023 at 10:58:04AM +0100, Peter Zijlstra wrote:
> On Wed, Nov 22, 2023 at 06:51:09PM -0500, Kent Overstreet wrote:
> > This adds a method to tell lockdep not to check lock ordering within a
> > lock class - but to still check lock ordering w.r.t. other lock types.
> > 
> > This is for bcachefs, where for btree node locks we have our own
> > deadlock avoidance strategy w.r.t. other btree node locks (cycle
> > detection), but we still want lockdep to check lock ordering w.r.t.
> > other lock types.
> 
> So earlier you added custom sort order.

That was never merged? I've been meaning to revisit that, though, and
convert some other things to it; the original patchset just converted
bcache, not bcachefs.

But it's not applicable here: bcachefs doesn't have a lock ordering, it
has cycle detection.

> Additionally there is the wound-wait mutexes that also have semantics
> similar to what you describe.

Yeah, we talked about that as well. Wound-wait is similar in principle
but aborts much more frequently, since it just compares transaction
start times rather than doing full cycle detection.
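The difference can be sketched concretely. The start times, the wait-for
graph, and the function names below are illustrative stand-ins, not
ww_mutex or bcachefs code: wound-wait decides from start times alone (an
older waiter wounds a younger holder even when no cycle exists), while a
cycle detector only aborts when a real wait-for cycle forms.

```c
#define NT 4	/* transactions in the toy model */

/* Wound-wait: the older (smaller start time) waiter wounds the holder. */
static int wound_wait_aborts_holder(unsigned long waiter_start,
				    unsigned long holder_start)
{
	return waiter_start < holder_start;
}

/* waits_for[i][j] = 1 if transaction i waits on a lock held by j. */
static int reaches(int waits_for[NT][NT], int start, int cur, int depth)
{
	if (depth > 0 && cur == start)
		return 1;
	if (depth >= NT)
		return 0;
	for (int j = 0; j < NT; j++)
		if (waits_for[cur][j] &&
		    reaches(waits_for, start, j, depth + 1))
			return 1;
	return 0;
}

/* A cycle detector aborts only if @trans sits on a real wait-for cycle. */
static int cycle_detector_aborts(int waits_for[NT][NT], int trans)
{
	return reaches(waits_for, trans, trans, 0);
}
```

In the first model, merely waiting on a younger transaction triggers an
abort; in the second, a plain wait with no cycle proceeds untouched.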

