LKML Archive on lore.kernel.org
 help / color / Atom feed
From: Bart Van Assche <bvanassche@acm.org>
To: peterz@infradead.org
Cc: mingo@redhat.com, will.deacon@arm.com, tj@kernel.org,
	longman@redhat.com, johannes.berg@intel.com,
	linux-kernel@vger.kernel.org,
	Bart Van Assche <bvanassche@acm.org>
Subject: [PATCH v7 19/23] kernel/workqueue: Use dynamic lockdep keys for workqueues
Date: Thu, 14 Feb 2019 15:00:54 -0800
Message-ID: <20190214230058.196511-20-bvanassche@acm.org> (raw)
In-Reply-To: <20190214230058.196511-1-bvanassche@acm.org>

Commit 87915adc3f0a ("workqueue: re-add lockdep dependencies for flushing")
improved deadlock checking in the workqueue implementation. Unfortunately
that patch also introduced a few false positive lockdep complaints. This
patch suppresses these false positives by allocating the workqueue mutex
lockdep key dynamically. An example of a false positive lockdep complaint
suppressed by this report can be found below. The root cause of the
lockdep complaint shown below is that the direct I/O code can call
alloc_workqueue() from inside a work item created by another
alloc_workqueue() call and that both workqueues share the same lockdep
key. This patch avoids that that lockdep complaint is triggered by
allocating the work queue lockdep keys dynamically. In other words, this
patch guarantees that a unique lockdep key is associated with each work
queue mutex.

======================================================
WARNING: possible circular locking dependency detected
4.19.0-dbg+ #1 Not tainted
------------------------------------------------------
fio/4129 is trying to acquire lock:
00000000a01cfe1a ((wq_completion)"dio/%s"sb->s_id){+.+.}, at: flush_workqueue+0xd0/0x970

but task is already holding lock:
00000000a0acecf9 (&sb->s_type->i_mutex_key#14){+.+.}, at: ext4_file_write_iter+0x154/0x710

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&sb->s_type->i_mutex_key#14){+.+.}:
       down_write+0x3d/0x80
       __generic_file_fsync+0x77/0xf0
       ext4_sync_file+0x3c9/0x780
       vfs_fsync_range+0x66/0x100
       dio_complete+0x2f5/0x360
       dio_aio_complete_work+0x1c/0x20
       process_one_work+0x481/0x9f0
       worker_thread+0x63/0x5a0
       kthread+0x1cf/0x1f0
       ret_from_fork+0x24/0x30

-> #1 ((work_completion)(&dio->complete_work)){+.+.}:
       process_one_work+0x447/0x9f0
       worker_thread+0x63/0x5a0
       kthread+0x1cf/0x1f0
       ret_from_fork+0x24/0x30

-> #0 ((wq_completion)"dio/%s"sb->s_id){+.+.}:
       lock_acquire+0xc5/0x200
       flush_workqueue+0xf3/0x970
       drain_workqueue+0xec/0x220
       destroy_workqueue+0x23/0x350
       sb_init_dio_done_wq+0x6a/0x80
       do_blockdev_direct_IO+0x1f33/0x4be0
       __blockdev_direct_IO+0x79/0x86
       ext4_direct_IO+0x5df/0xbb0
       generic_file_direct_write+0x119/0x220
       __generic_file_write_iter+0x131/0x2d0
       ext4_file_write_iter+0x3fa/0x710
       aio_write+0x235/0x330
       io_submit_one+0x510/0xeb0
       __x64_sys_io_submit+0x122/0x340
       do_syscall_64+0x71/0x220
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

other info that might help us debug this:

Chain exists of:
  (wq_completion)"dio/%s"sb->s_id --> (work_completion)(&dio->complete_work) --> &sb->s_type->i_mutex_key#14

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&sb->s_type->i_mutex_key#14);
                               lock((work_completion)(&dio->complete_work));
                               lock(&sb->s_type->i_mutex_key#14);
  lock((wq_completion)"dio/%s"sb->s_id);

 *** DEADLOCK ***

1 lock held by fio/4129:
 #0: 00000000a0acecf9 (&sb->s_type->i_mutex_key#14){+.+.}, at: ext4_file_write_iter+0x154/0x710

stack backtrace:
CPU: 3 PID: 4129 Comm: fio Not tainted 4.19.0-dbg+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
Call Trace:
 dump_stack+0x86/0xc5
 print_circular_bug.isra.32+0x20a/0x218
 __lock_acquire+0x1c68/0x1cf0
 lock_acquire+0xc5/0x200
 flush_workqueue+0xf3/0x970
 drain_workqueue+0xec/0x220
 destroy_workqueue+0x23/0x350
 sb_init_dio_done_wq+0x6a/0x80
 do_blockdev_direct_IO+0x1f33/0x4be0
 __blockdev_direct_IO+0x79/0x86
 ext4_direct_IO+0x5df/0xbb0
 generic_file_direct_write+0x119/0x220
 __generic_file_write_iter+0x131/0x2d0
 ext4_file_write_iter+0x3fa/0x710
 aio_write+0x235/0x330
 io_submit_one+0x510/0xeb0
 __x64_sys_io_submit+0x122/0x340
 do_syscall_64+0x71/0x220
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 include/linux/workqueue.h | 28 +++----------------
 kernel/workqueue.c        | 59 +++++++++++++++++++++++++++++++++------
 2 files changed, 54 insertions(+), 33 deletions(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 60d673e15632..d9a1a480e920 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -390,43 +390,23 @@ extern struct workqueue_struct *system_freezable_wq;
 extern struct workqueue_struct *system_power_efficient_wq;
 extern struct workqueue_struct *system_freezable_power_efficient_wq;
 
-extern struct workqueue_struct *
-__alloc_workqueue_key(const char *fmt, unsigned int flags, int max_active,
-	struct lock_class_key *key, const char *lock_name, ...) __printf(1, 6);
-
 /**
  * alloc_workqueue - allocate a workqueue
  * @fmt: printf format for the name of the workqueue
  * @flags: WQ_* flags
  * @max_active: max in-flight work items, 0 for default
- * @args...: args for @fmt
+ * remaining args: args for @fmt
  *
  * Allocate a workqueue with the specified parameters.  For detailed
  * information on WQ_* flags, please refer to
  * Documentation/core-api/workqueue.rst.
  *
- * The __lock_name macro dance is to guarantee that single lock_class_key
- * doesn't end up with different namesm, which isn't allowed by lockdep.
- *
  * RETURNS:
  * Pointer to the allocated workqueue on success, %NULL on failure.
  */
-#ifdef CONFIG_LOCKDEP
-#define alloc_workqueue(fmt, flags, max_active, args...)		\
-({									\
-	static struct lock_class_key __key;				\
-	const char *__lock_name;					\
-									\
-	__lock_name = "(wq_completion)"#fmt#args;			\
-									\
-	__alloc_workqueue_key((fmt), (flags), (max_active),		\
-			      &__key, __lock_name, ##args);		\
-})
-#else
-#define alloc_workqueue(fmt, flags, max_active, args...)		\
-	__alloc_workqueue_key((fmt), (flags), (max_active),		\
-			      NULL, NULL, ##args)
-#endif
+struct workqueue_struct *alloc_workqueue(const char *fmt,
+					 unsigned int flags,
+					 int max_active, ...);
 
 /**
  * alloc_ordered_workqueue - allocate an ordered workqueue
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index fc5d23d752a5..e163e7a7f5e5 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -259,6 +259,8 @@ struct workqueue_struct {
 	struct wq_device	*wq_dev;	/* I: for sysfs interface */
 #endif
 #ifdef CONFIG_LOCKDEP
+	char			*lock_name;
+	struct lock_class_key	key;
 	struct lockdep_map	lockdep_map;
 #endif
 	char			name[WQ_NAME_LEN]; /* I: workqueue name */
@@ -3337,11 +3339,49 @@ static int init_worker_pool(struct worker_pool *pool)
 	return 0;
 }
 
+#ifdef CONFIG_LOCKDEP
+static void wq_init_lockdep(struct workqueue_struct *wq)
+{
+	char *lock_name;
+
+	lockdep_register_key(&wq->key);
+	lock_name = kasprintf(GFP_KERNEL, "%s%s", "(wq_completion)", wq->name);
+	if (!lock_name)
+		lock_name = wq->name;
+	lockdep_init_map(&wq->lockdep_map, lock_name, &wq->key, 0);
+}
+
+static void wq_unregister_lockdep(struct workqueue_struct *wq)
+{
+	lockdep_unregister_key(&wq->key);
+}
+
+static void wq_free_lockdep(struct workqueue_struct *wq)
+{
+	if (wq->lock_name != wq->name)
+		kfree(wq->lock_name);
+}
+#else
+static void wq_init_lockdep(struct workqueue_struct *wq)
+{
+}
+
+static void wq_unregister_lockdep(struct workqueue_struct *wq)
+{
+}
+
+static void wq_free_lockdep(struct workqueue_struct *wq)
+{
+}
+#endif
+
 static void rcu_free_wq(struct rcu_head *rcu)
 {
 	struct workqueue_struct *wq =
 		container_of(rcu, struct workqueue_struct, rcu);
 
+	wq_free_lockdep(wq);
+
 	if (!(wq->flags & WQ_UNBOUND))
 		free_percpu(wq->cpu_pwqs);
 	else
@@ -3532,8 +3572,10 @@ static void pwq_unbound_release_workfn(struct work_struct *work)
 	 * If we're the last pwq going away, @wq is already dead and no one
 	 * is gonna access it anymore.  Schedule RCU free.
 	 */
-	if (is_last)
+	if (is_last) {
+		wq_unregister_lockdep(wq);
 		call_rcu(&wq->rcu, rcu_free_wq);
+	}
 }
 
 /**
@@ -4067,11 +4109,9 @@ static int init_rescuer(struct workqueue_struct *wq)
 	return 0;
 }
 
-struct workqueue_struct *__alloc_workqueue_key(const char *fmt,
-					       unsigned int flags,
-					       int max_active,
-					       struct lock_class_key *key,
-					       const char *lock_name, ...)
+struct workqueue_struct *alloc_workqueue(const char *fmt,
+					 unsigned int flags,
+					 int max_active, ...)
 {
 	size_t tbl_size = 0;
 	va_list args;
@@ -4106,7 +4146,7 @@ struct workqueue_struct *__alloc_workqueue_key(const char *fmt,
 			goto err_free_wq;
 	}
 
-	va_start(args, lock_name);
+	va_start(args, max_active);
 	vsnprintf(wq->name, sizeof(wq->name), fmt, args);
 	va_end(args);
 
@@ -4123,7 +4163,7 @@ struct workqueue_struct *__alloc_workqueue_key(const char *fmt,
 	INIT_LIST_HEAD(&wq->flusher_overflow);
 	INIT_LIST_HEAD(&wq->maydays);
 
-	lockdep_init_map(&wq->lockdep_map, lock_name, key, 0);
+	wq_init_lockdep(wq);
 	INIT_LIST_HEAD(&wq->list);
 
 	if (alloc_and_link_pwqs(wq) < 0)
@@ -4161,7 +4201,7 @@ struct workqueue_struct *__alloc_workqueue_key(const char *fmt,
 	destroy_workqueue(wq);
 	return NULL;
 }
-EXPORT_SYMBOL_GPL(__alloc_workqueue_key);
+EXPORT_SYMBOL_GPL(alloc_workqueue);
 
 /**
  * destroy_workqueue - safely terminate a workqueue
@@ -4214,6 +4254,7 @@ void destroy_workqueue(struct workqueue_struct *wq)
 		kthread_stop(wq->rescuer->task);
 
 	if (!(wq->flags & WQ_UNBOUND)) {
+		wq_unregister_lockdep(wq);
 		/*
 		 * The base ref is never dropped on per-cpu pwqs.  Directly
 		 * schedule RCU free.
-- 
2.21.0.rc0.258.g878e2cd30e-goog


  parent reply index

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-14 23:00 [PATCH v7 00/23] locking/lockdep: Add support for dynamic keys Bart Van Assche
2019-02-14 23:00 ` [PATCH v7 01/23] locking/lockdep: Fix two 32-bit compiler warnings Bart Van Assche
2019-02-28  7:02   ` [tip:locking/core] " tip-bot for Bart Van Assche
2019-02-14 23:00 ` [PATCH v7 02/23] locking/lockdep: Fix reported required memory size (1/2) Bart Van Assche
2019-02-28  7:03   ` [tip:locking/core] " tip-bot for Bart Van Assche
2019-02-14 23:00 ` [PATCH v7 03/23] locking/lockdep: Fix reported required memory size (2/2) Bart Van Assche
2019-02-28  7:03   ` [tip:locking/core] " tip-bot for Bart Van Assche
2019-02-14 23:00 ` [PATCH v7 04/23] locking/lockdep: Avoid that add_chain_cache() adds an invalid chain to the cache Bart Van Assche
2019-02-28  7:04   ` [tip:locking/core] " tip-bot for Bart Van Assche
2019-02-14 23:00 ` [PATCH v7 05/23] locking/lockdep: Reorder struct lock_class members Bart Van Assche
2019-02-28  7:05   ` [tip:locking/core] " tip-bot for Bart Van Assche
2019-02-14 23:00 ` [PATCH v7 06/23] locking/lockdep: Make zap_class() remove all matching lock order entries Bart Van Assche
2019-02-28  7:05   ` [tip:locking/core] " tip-bot for Bart Van Assche
2019-02-14 23:00 ` [PATCH v7 07/23] locking/lockdep: Initialize the locks_before and locks_after lists earlier Bart Van Assche
2019-02-28  7:06   ` [tip:locking/core] " tip-bot for Bart Van Assche
2019-02-14 23:00 ` [PATCH v7 08/23] locking/lockdep: Split lockdep_free_key_range() and lockdep_reset_lock() Bart Van Assche
2019-02-28  7:07   ` [tip:locking/core] " tip-bot for Bart Van Assche
2019-02-14 23:00 ` [PATCH v7 09/23] locking/lockdep: Make it easy to detect whether or not inside a selftest Bart Van Assche
2019-02-28  7:07   ` [tip:locking/core] " tip-bot for Bart Van Assche
2019-02-14 23:00 ` [PATCH v7 10/23] locking/lockdep: Update two outdated comments Bart Van Assche
2019-02-28  7:08   ` [tip:locking/core] " tip-bot for Bart Van Assche
2019-02-14 23:00 ` [PATCH v7 11/23] locking/lockdep: Free lock classes that are no longer in use Bart Van Assche
2019-02-28  7:09   ` [tip:locking/core] " tip-bot for Bart Van Assche
2019-02-14 23:00 ` [PATCH v7 12/23] locking/lockdep: Reuse list entries " Bart Van Assche
2019-02-28  7:09   ` [tip:locking/core] " tip-bot for Bart Van Assche
2019-02-14 23:00 ` [PATCH v7 13/23] locking/lockdep: Introduce lockdep_next_lockchain() and lock_chain_count() Bart Van Assche
2019-02-28  7:10   ` [tip:locking/core] " tip-bot for Bart Van Assche
2019-02-14 23:00 ` [PATCH v7 14/23] locking/lockdep: Fix a comment in add_chain_cache() Bart Van Assche
2019-02-28  7:11   ` [tip:locking/core] " tip-bot for Bart Van Assche
2019-02-14 23:00 ` [PATCH v7 15/23] locking/lockdep: Reuse lock chains that have been freed Bart Van Assche
2019-02-28  7:11   ` [tip:locking/core] " tip-bot for Bart Van Assche
2019-02-14 23:00 ` [PATCH v7 16/23] locking/lockdep: Check data structure consistency Bart Van Assche
2019-02-28  7:12   ` [tip:locking/core] " tip-bot for Bart Van Assche
2019-02-14 23:00 ` [PATCH v7 17/23] locking/lockdep: Verify whether lock objects are small enough to be used as class keys Bart Van Assche
2019-02-28  7:13   ` [tip:locking/core] " tip-bot for Bart Van Assche
2019-02-14 23:00 ` [PATCH v7 18/23] locking/lockdep: Add support for dynamic keys Bart Van Assche
2019-02-26 17:17   ` Peter Zijlstra
2019-02-28  7:13   ` [tip:locking/core] " tip-bot for Bart Van Assche
2019-02-14 23:00 ` Bart Van Assche [this message]
2019-02-28  7:14   ` [tip:locking/core] kernel/workqueue: Use dynamic lockdep keys for workqueues tip-bot for Bart Van Assche
2019-02-14 23:00 ` [PATCH v7 20/23] locking/spinlock: Introduce spin_lock_init_key() Bart Van Assche
2019-02-14 23:00 ` [PATCH v7 21/23] block: Avoid that flushing triggers a lockdep complaint Bart Van Assche
2019-02-15  2:26   ` Ming Lei
2019-02-15 16:08     ` Bart Van Assche
2019-02-17 13:23       ` Ming Lei
2019-02-26 18:08     ` Peter Zijlstra
2019-02-27  1:35       ` Ming Lei
2019-02-27 14:24         ` Peter Zijlstra
2019-02-27 15:53           ` Ming Lei
2019-02-26 17:24   ` Peter Zijlstra
2019-02-26 17:48     ` Bart Van Assche
2019-02-14 23:00 ` [PATCH v7 22/23] lockdep tests: Fix run_tests.sh Bart Van Assche
2019-02-28  7:15   ` [tip:locking/core] lockdep/lib/tests: " tip-bot for Bart Van Assche
2019-02-14 23:00 ` [PATCH v7 23/23] lockdep tests: Test dynamic key registration Bart Van Assche
2019-02-28  7:15   ` [tip:locking/core] lockdep/lib/tests: " tip-bot for Bart Van Assche
2019-02-21 22:02 ` [PATCH v7 00/23] locking/lockdep: Add support for dynamic keys Bart Van Assche
2019-02-22 16:26   ` Peter Zijlstra
2019-02-22 17:20     ` Bart Van Assche
2019-02-22 22:13       ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190214230058.196511-20-bvanassche@acm.org \
    --to=bvanassche@acm.org \
    --cc=johannes.berg@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=longman@redhat.com \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tj@kernel.org \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git
	git clone --mirror https://lore.kernel.org/lkml/9 lkml/git/9.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git