linux-kernel.vger.kernel.org archive mirror
From: Peter Zijlstra <peterz@infradead.org>
To: mingo@kernel.org
Cc: byungchul.park@lge.com, tj@kernel.org, boqun.feng@gmail.com,
	david@fromorbit.com, johannes@sipsolutions.net, oleg@redhat.com,
	linux-kernel@vger.kernel.org, peterz@infradead.org
Subject: [PATCH 4/4] lockdep: Fix workqueue crossrelease annotation
Date: Wed, 23 Aug 2017 13:58:47 +0200
Message-ID: <20170823121432.990701317@infradead.org>
In-Reply-To: 20170823115843.662056844@infradead.org

[-- Attachment #1: peterz-lockdep-cross-fix.patch --]
[-- Type: text/plain, Size: 7685 bytes --]

The new completion/crossrelease annotations interact unfavourably with
the extant flush_work()/flush_workqueue() annotations.

The problem is that when a single work class does:

  wait_for_completion(&C)

and

  complete(&C)

in different executions, we'll build dependencies like:

  lock_map_acquire(W)
  complete_acquire(C)

and

  lock_map_acquire(W)
  complete_release(C)

which results in the dependency chain: W->C->W, which lockdep thinks
spells deadlock, even though there is no deadlock potential since
works are run concurrently.
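
To make the false positive concrete, here is a minimal user-space sketch of
the dependency chain described above. It is not lockdep code: the names
(dep_graph, dep_add, dep_has_cycle, workqueue_scenario_reports_cycle) are
hypothetical, and it assumes that hiding W from the crossrelease history
removes only the C->W edge while the held-lock edge W->C remains.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy lock classes: W is the work 'lock', C the completion (illustrative). */
enum { CLASS_W, CLASS_C, NR_CLASSES };

struct dep_graph {
	/* edge[a][b] means "b was acquired/waited on after a" */
	bool edge[NR_CLASSES][NR_CLASSES];
};

static void dep_graph_init(struct dep_graph *g)
{
	memset(g, 0, sizeof(*g));
}

static void dep_add(struct dep_graph *g, int from, int to)
{
	g->edge[from][to] = true;
}

/* Depth-first search: is 'target' reachable from 'node' via one or more edges? */
static bool dep_reaches(const struct dep_graph *g, int node, int target,
			bool *seen)
{
	if (seen[node])
		return false;
	seen[node] = true;

	for (int next = 0; next < NR_CLASSES; next++) {
		if (!g->edge[node][next])
			continue;
		if (next == target || dep_reaches(g, next, target, seen))
			return true;
	}
	return false;
}

/* A cycle exists if any class can reach itself. */
static bool dep_has_cycle(const struct dep_graph *g)
{
	for (int start = 0; start < NR_CLASSES; start++) {
		bool seen[NR_CLASSES] = { false };

		if (dep_reaches(g, start, start, seen))
			return true;
	}
	return false;
}

/*
 * Two executions of the same work class:
 *   execution 1: lock_map_acquire(W); wait_for_completion(&C)  => W -> C
 *   execution 2: lock_map_acquire(W); complete(&C)             => C -> W
 * With hide_work_lock set, W is assumed invisible to the crossrelease
 * history, so the C -> W edge is never recorded.
 */
static bool workqueue_scenario_reports_cycle(bool hide_work_lock)
{
	struct dep_graph g;

	dep_graph_init(&g);
	dep_add(&g, CLASS_W, CLASS_C);
	if (!hide_work_lock)
		dep_add(&g, CLASS_C, CLASS_W);

	return dep_has_cycle(&g);
}
```

Calling workqueue_scenario_reports_cycle(false) models the report before this
patch; passing true models the effect of disregarding the work 'locks'.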

One possibility would be to change the work 'lock' to recursive-read,
but that would mean hitting a lockdep limitation on recursive locks.
Also, unconditionally switching to recursive-read here would fail to
detect the actual deadlock on single-threaded workqueues, which do
have a problem with this.

For now, forcefully disregard these locks for crossrelease.


Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/linux/irqflags.h |    4 +--
 include/linux/lockdep.h  |    8 +++---
 kernel/locking/lockdep.c |   60 +++++++++++++++++++++++++++++------------------
 kernel/workqueue.c       |   23 +++++++++++++++++-
 4 files changed, 66 insertions(+), 29 deletions(-)

--- a/include/linux/irqflags.h
+++ b/include/linux/irqflags.h
@@ -26,7 +26,7 @@
 # define trace_hardirq_enter()			\
 do {						\
 	current->hardirq_context++;		\
-	crossrelease_hist_start(XHLOCK_HARD);	\
+	crossrelease_hist_start(XHLOCK_HARD, 0);\
 } while (0)
 # define trace_hardirq_exit()			\
 do {						\
@@ -36,7 +36,7 @@ do {						\
 # define lockdep_softirq_enter()		\
 do {						\
 	current->softirq_context++;		\
-	crossrelease_hist_start(XHLOCK_SOFT);	\
+	crossrelease_hist_start(XHLOCK_SOFT, 0);\
 } while (0)
 # define lockdep_softirq_exit()			\
 do {						\
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -578,11 +578,11 @@ extern void lock_commit_crosslock(struct
 #define STATIC_LOCKDEP_MAP_INIT(_name, _key) \
 	{ .name = (_name), .key = (void *)(_key), .cross = 0, }
 
-extern void crossrelease_hist_start(enum xhlock_context_t c);
+extern void crossrelease_hist_start(enum xhlock_context_t c, bool force);
 extern void crossrelease_hist_end(enum xhlock_context_t c);
 extern void lockdep_init_task(struct task_struct *task);
 extern void lockdep_free_task(struct task_struct *task);
-#else
+#else /* !CROSSRELEASE */
 #define lockdep_init_map_crosslock(m, n, k, s) do {} while (0)
 /*
  * To initialize a lockdep_map statically use this macro.
@@ -591,11 +591,11 @@ extern void lockdep_free_task(struct tas
 #define STATIC_LOCKDEP_MAP_INIT(_name, _key) \
 	{ .name = (_name), .key = (void *)(_key), }
 
-static inline void crossrelease_hist_start(enum xhlock_context_t c) {}
+static inline void crossrelease_hist_start(enum xhlock_context_t c, bool force) {}
 static inline void crossrelease_hist_end(enum xhlock_context_t c) {}
 static inline void lockdep_init_task(struct task_struct *task) {}
 static inline void lockdep_free_task(struct task_struct *task) {}
-#endif
+#endif /* CROSSRELEASE */
 
 #ifdef CONFIG_LOCK_STAT
 
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -4629,7 +4629,7 @@ asmlinkage __visible void lockdep_sys_ex
 	 * the index to point to the last entry, which is already invalid.
 	 */
 	crossrelease_hist_end(XHLOCK_PROC);
-	crossrelease_hist_start(XHLOCK_PROC);
+	crossrelease_hist_start(XHLOCK_PROC, false);
 }
 
 void lockdep_rcu_suspicious(const char *file, const int line, const char *s)
@@ -4725,25 +4725,25 @@ static inline void invalidate_xhlock(str
 /*
  * Lock history stacks; we have 3 nested lock history stacks:
  *
- *   Hard IRQ
- *   Soft IRQ
- *   History / Task
- *
- * The thing is that once we complete a (Hard/Soft) IRQ the future task locks
- * should not depend on any of the locks observed while running the IRQ.
- *
- * So what we do is rewind the history buffer and erase all our knowledge of
- * that temporal event.
- */
-
-/*
- * We need this to annotate lock history boundaries. Take for instance
- * workqueues; each work is independent of the last. The completion of a future
- * work does not depend on the completion of a past work (in general).
- * Therefore we must not carry that (lock) dependency across works.
+ *   HARD(IRQ)
+ *   SOFT(IRQ)
+ *   PROC(ess)
+ *
+ * The thing is that once we complete a HARD/SOFT IRQ the future task locks
+ * should not depend on any of the locks observed while running the IRQ.  So
+ * what we do is rewind the history buffer and erase all our knowledge of that
+ * temporal event.
+ *
+ * The PROCess one is special though; it is used to annotate independence
+ * inside a task.
+ *
+ * Take for instance workqueues; each work is independent of the last. The
+ * completion of a future work does not depend on the completion of a past work
+ * (in general). Therefore we must not carry that (lock) dependency across
+ * works.
  *
  * This is true for many things; pretty much all kthreads fall into this
- * pattern, where they have an 'idle' state and future completions do not
+ * pattern, where they have an invariant state and future completions do not
  * depend on past completions. Its just that since they all have the 'same'
  * form -- the kthread does the same over and over -- it doesn't typically
  * matter.
@@ -4751,15 +4751,31 @@ static inline void invalidate_xhlock(str
  * The same is true for system-calls, once a system call is completed (we've
  * returned to userspace) the next system call does not depend on the lock
  * history of the previous system call.
+ *
+ * The key property for independence, this invariant state, is that it must be
+ * a point where we hold no locks and have no history. Because if we were to
+ * hold locks, the restore at _end() would not necessarily recover its history
+ * entry. Similarly, independence by definition means it does not depend on
+ * prior state.
  */
-void crossrelease_hist_start(enum xhlock_context_t c)
+void crossrelease_hist_start(enum xhlock_context_t c, bool force)
 {
 	struct task_struct *cur = current;
 
-	if (cur->xhlocks) {
-		cur->xhlock_idx_hist[c] = cur->xhlock_idx;
-		cur->hist_id_save[c] = cur->hist_id;
+	if (!cur->xhlocks)
+		return;
+
+	/*
+	 * We call this at an invariant point, no current state, no history.
+	 */
+	if (c == XHLOCK_PROC) {
+		/* verified the former, ensure the latter */
+		WARN_ON_ONCE(!force && cur->lockdep_depth);
+		invalidate_xhlock(&xhlock(cur->xhlock_idx));
 	}
+
+	cur->xhlock_idx_hist[c] = cur->xhlock_idx;
+	cur->hist_id_save[c]    = cur->hist_id;
 }
 
 void crossrelease_hist_end(enum xhlock_context_t c)
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -2093,7 +2093,28 @@ __acquires(&pool->lock)
 
 	lock_map_acquire(&pwq->wq->lockdep_map);
 	lock_map_acquire(&lockdep_map);
-	crossrelease_hist_start(XHLOCK_PROC);
+	/*
+	 * Strictly speaking we should do start(PROC) without holding any
+	 * locks, that is, before these two lock_map_acquire()'s.
+	 *
+	 * However, that would result in:
+	 *
+	 *   A(W1)
+	 *   WFC(C)
+	 *		A(W1)
+	 *		C(C)
+	 *
+	 * Which would create W1->C->W1 dependencies, even though there is no
+	 * actual deadlock possible. There are two solutions: use a
+	 * read-recursive acquire on the work(queue) 'locks', which would then
+	 * hit the lockdep limitation on recursive locks, or simply discard
+	 * these locks.
+	 *
+	 * AFAICT there is no possible deadlock scenario between the
+	 * flush_work() and complete() primitives (except for single-threaded
+	 * workqueues), so hiding them isn't a problem.
+	 */
+	crossrelease_hist_start(XHLOCK_PROC, true);
 	trace_workqueue_execute_start(work);
 	worker->current_func(work);
 	/*

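For readers unfamiliar with the history stacks, the following is a toy
user-space model of the save/rewind behaviour that the lockdep.c comment
describes, including the invalidate step this patch adds for the PROC
context. It is only a sketch of my reading of that comment: struct hist,
hist_record(), hist_start(), hist_end() and hist_visible() are hypothetical
names, the ring size is arbitrary, and the model ignores the overwrite
detection that the real code does with hist_id.

```c
#include <assert.h>
#include <stdbool.h>

#define HIST_SIZE 8	/* toy ring size, chosen arbitrarily */

enum ctx { CTX_HARD, CTX_SOFT, CTX_PROC, NR_CTX };

struct hist {
	int entry[HIST_SIZE];		/* recorded lock class ids */
	bool valid[HIST_SIZE];		/* false: entry ends the visible history */
	unsigned int idx;		/* index of the most recent entry */
	unsigned int saved_idx[NR_CTX];	/* one saved index per context */
};

/* Record one acquired lock class in the ring. */
static void hist_record(struct hist *h, int lock_class)
{
	h->idx++;
	h->entry[h->idx % HIST_SIZE] = lock_class;
	h->valid[h->idx % HIST_SIZE] = true;
}

/*
 * Open a context. For CTX_PROC the current entry is invalidated first, so
 * nothing recorded before the boundary is visible from inside it.
 */
static void hist_start(struct hist *h, enum ctx c)
{
	if (c == CTX_PROC)
		h->valid[h->idx % HIST_SIZE] = false;

	h->saved_idx[c] = h->idx;
}

/* Close a context: rewind, forgetting everything recorded inside it. */
static void hist_end(struct hist *h, enum ctx c)
{
	h->idx = h->saved_idx[c];
}

/* Count entries visible from idx, walking back to the first invalid one. */
static int hist_visible(const struct hist *h)
{
	int n = 0;

	for (unsigned int i = 0; i < HIST_SIZE; i++) {
		unsigned int slot = (h->idx - i) % HIST_SIZE;

		if (!h->valid[slot])
			break;
		n++;
	}
	return n;
}
```

In this model an IRQ context only rewinds on exit, while a PROC boundary also
cuts off whatever was recorded before it, which is how the work 'locks'
acquired just before start(PROC) end up disregarded.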