From: Boqun Feng <boqun.feng@gmail.com>
To: Qian Cai <cai@redhat.com>
Cc: linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>, Will Deacon <will@kernel.org>,
Jonathan Corbet <corbet@lwn.net>,
Waiman Long <longman@redhat.com>
Subject: Re: [RFC v7 11/19] lockdep: Fix recursive read lock related safe->unsafe detection
Date: Thu, 17 Sep 2020 00:14:04 +0800 [thread overview]
Message-ID: <20200916161404.GD127490@debian-boqun.qqnc3lrjykvubdpftowmye0fmh.lx.internal.cloudapp.net> (raw)
In-Reply-To: <20200916081046.GC127490@debian-boqun.qqnc3lrjykvubdpftowmye0fmh.lx.internal.cloudapp.net>
On Wed, Sep 16, 2020 at 04:10:46PM +0800, Boqun Feng wrote:
> On Tue, Sep 15, 2020 at 02:32:51PM -0400, Qian Cai wrote:
> > On Fri, 2020-08-07 at 15:42 +0800, Boqun Feng wrote:
> > > Currently, in safe->unsafe detection, lockdep misses the fact that a
> > > LOCK_ENABLED_IRQ_*_READ usage and a LOCK_USED_IN_IRQ_*_READ usage may
> > > cause deadlock too, for example:
> > >
> > >     P1                    P2
> > >     <irq disabled>
> > >     write_lock(l1);       <irq enabled>
> > >                           read_lock(l2);
> > >     write_lock(l2);
> > >                           <in irq>
> > >                           read_lock(l1);
> > >
> > > Actually, all of the following cases may cause deadlocks:
> > >
> > > LOCK_USED_IN_IRQ_* -> LOCK_ENABLED_IRQ_*
> > > LOCK_USED_IN_IRQ_*_READ -> LOCK_ENABLED_IRQ_*
> > > LOCK_USED_IN_IRQ_* -> LOCK_ENABLED_IRQ_*_READ
> > > LOCK_USED_IN_IRQ_*_READ -> LOCK_ENABLED_IRQ_*_READ
> > >
> > > To fix this, we need to 1) change the calculation of exclusive_mask() so
> > > that READ bits are not dropped and 2) always call usage() in
> > > mark_lock_irq() to check usage deadlocks, even when the new usage of the
> > > lock is READ.
> > >
> > > Besides, adjust usage_match() and usage_accumulate() for the recursive
> > > read lock changes.
> > >
> > > Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> >
> > So our daily CI has started to trigger the warning (graph corruption?)
> > below. From the call traces, this recent patchset changed a few related
> > things here and there. Does it ring any bells?
> >
> > [14828.805563][T145183] lockdep bfs error:-1
>
> -1 is BFS_EQUEUEFULL, which means we hit the size limit of the lockdep
> search queue. That is possible, since recursive read deadlock detection
> tries to search all the edges (dependencies). So maybe we should switch
> to DFS instead of BFS; I will look into this. In the meanwhile, could
> you try the following to see if it helps with the warnings you got?
>
Found a way to resolve this while still keeping the BFS. Every time we
want to enqueue a lock_list, we basically enqueue the whole dep list of
entries from the previous lock_list, so we can use a trick here: instead
of enqueueing all the entries, we only enqueue the first one, and we can
fetch the other sibling entries with list_next_or_null_rcu(). Patch as
below; I also took the chance to clean the code up and add more
comments. I could see this number (in /proc/lockdep_stats):

	max bfs queue depth: 201

go down to (after applying this patch):

	max bfs queue depth: 61

with x86_64_defconfig plus the lockdep and selftest configs.
Qian, could you give it a try?
Regards,
Boqun
---------------------->8
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 454355c033d2..1cc1302bf319 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -1640,35 +1640,22 @@ static enum bfs_result __bfs(struct lock_list *source_entry,
 			     int offset)
 {
 	struct lock_list *entry;
-	struct lock_list *lock;
+	struct lock_list *lock = NULL;
 	struct list_head *head;
 	struct circular_queue *cq = &lock_cq;
-	enum bfs_result ret = BFS_RNOMATCH;
 
 	lockdep_assert_locked();
 
-	if (match(source_entry, data)) {
-		*target_entry = source_entry;
-		ret = BFS_RMATCH;
-		goto exit;
-	}
-
-	head = get_dep_list(source_entry, offset);
-	if (list_empty(head))
-		goto exit;
-
 	__cq_init(cq);
 	__cq_enqueue(cq, source_entry);
 
-	while ((lock = __cq_dequeue(cq))) {
-		bool prev_only_xr;
-
-		if (!lock->class) {
-			ret = BFS_EINVALIDNODE;
-			goto exit;
-		}
+	while (lock || (lock = __cq_dequeue(cq))) {
+		if (!lock->class)
+			return BFS_EINVALIDNODE;
 
 		/*
+		 * Step 1: check whether we have already finished with this one.
+		 *
 		 * If we have visited all the dependencies from this @lock to
 		 * others (iow, if we have visited all lock_list entries in
 		 * @lock->class->locks_{after,before}) we skip, otherwise go
@@ -1676,17 +1663,17 @@ static enum bfs_result __bfs(struct lock_list *source_entry,
 		 * list accessed.
 		 */
 		if (lock_accessed(lock))
-			continue;
+			goto next;
 		else
 			mark_lock_accessed(lock);
 
-		head = get_dep_list(lock, offset);
-
-		prev_only_xr = lock->only_xr;
-
-		list_for_each_entry_rcu(entry, head, entry) {
-			unsigned int cq_depth;
-			u8 dep = entry->dep;
+		/*
+		 * Step 2: check whether the prev dependency and this one form
+		 *         a strong dependency path.
+		 */
+		if (lock->parent) { /* Parent exists, check prev dependency */
+			u8 dep = lock->dep;
+			bool prev_only_xr = lock->parent->only_xr;
 
 			/*
 			 * Mask out all -(S*)-> if we only have *R in previous
@@ -1698,29 +1685,68 @@ static enum bfs_result __bfs(struct lock_list *source_entry,
 
 			/* If nothing left, we skip */
 			if (!dep)
-				continue;
+				goto next;
 
 			/* If there are only -(*R)-> left, set that for the next step */
-			entry->only_xr = !(dep & (DEP_SN_MASK | DEP_EN_MASK));
+			lock->only_xr = !(dep & (DEP_SN_MASK | DEP_EN_MASK));
+		}
 
-			visit_lock_entry(entry, lock);
-			if (match(entry, data)) {
-				*target_entry = entry;
-				ret = BFS_RMATCH;
-				goto exit;
-			}
+		/*
+		 * Step 3: we haven't visited this and there is a strong
+		 *         dependency path to this, so check with @match.
+		 */
+		if (match(lock, data)) {
+			*target_entry = lock;
+			return BFS_RMATCH;
+		}
+
+		/*
+		 * Step 4: if not match, expand the path by adding the
+		 *         afterwards or backwards dependencies in the search.
+		 *
+		 * Note we only enqueue the first of the list into the queue,
+		 * because we can always find a sibling dependency from one
+		 * (see label 'next'); as a result the space of the queue is
+		 * saved.
+		 */
+		head = get_dep_list(lock, offset);
+		entry = list_first_or_null_rcu(head, struct lock_list, entry);
+		if (entry) {
+			unsigned int cq_depth;
+
+			if (__cq_enqueue(cq, entry))
+				return BFS_EQUEUEFULL;
 
-			if (__cq_enqueue(cq, entry)) {
-				ret = BFS_EQUEUEFULL;
-				goto exit;
-			}
 			cq_depth = __cq_get_elem_count(cq);
 			if (max_bfs_queue_depth < cq_depth)
 				max_bfs_queue_depth = cq_depth;
 		}
+
+		/*
+		 * Update the ->parent, so when @entry is iterated, we know the
+		 * previous dependency.
+		 */
+		list_for_each_entry_rcu(entry, head, entry)
+			visit_lock_entry(entry, lock);
+next:
+		/*
+		 * Step 5: fetch the next dependency to process.
+		 *
+		 * If there is a previous dependency, we fetch the sibling
+		 * dependency in the dep list of the previous dependency.
+		 *
+		 * Otherwise set @lock to NULL to fetch the next entry from
+		 * the queue.
+		 */
+		if (lock->parent) {
+			head = get_dep_list(lock->parent, offset);
+			lock = list_next_or_null_rcu(head, &lock->entry,
						     struct lock_list, entry);
		} else {
+			lock = NULL;
+		}
 	}
-exit:
-	return ret;
+
+	return BFS_RNOMATCH;
 }
 
 static inline enum bfs_result