From: Bart Van Assche <bvanassche@acm.org>
To: Tejun Heo <tj@kernel.org>
Cc: linux-kernel@vger.kernel.org, Johannes Berg <johannes.berg@intel.com>, Christoph Hellwig <hch@lst.de>, Sagi Grimberg <sagi@grimberg.me>, tytso@mit.edu, bvanassche@acm.org
Subject: [PATCH 3/3] kernel/workqueue: Suppress a false positive lockdep complaint
Date: Thu, 25 Oct 2018 08:05:40 -0700
Message-ID: <20181025150540.259281-4-bvanassche@acm.org>
In-Reply-To: <20181025150540.259281-1-bvanassche@acm.org>

It can happen that the direct I/O queue creates and destroys an empty
workqueue from inside a work function. Avoid that this triggers the
false positive lockdep complaint shown below.

======================================================
WARNING: possible circular locking dependency detected
4.19.0-dbg+ #1 Not tainted
------------------------------------------------------
fio/4129 is trying to acquire lock:
00000000a01cfe1a ((wq_completion)"dio/%s"sb->s_id){+.+.}, at: flush_workqueue+0xd0/0x970

but task is already holding lock:
00000000a0acecf9 (&sb->s_type->i_mutex_key#14){+.+.}, at: ext4_file_write_iter+0x154/0x710

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&sb->s_type->i_mutex_key#14){+.+.}:
       down_write+0x3d/0x80
       __generic_file_fsync+0x77/0xf0
       ext4_sync_file+0x3c9/0x780
       vfs_fsync_range+0x66/0x100
       dio_complete+0x2f5/0x360
       dio_aio_complete_work+0x1c/0x20
       process_one_work+0x481/0x9f0
       worker_thread+0x63/0x5a0
       kthread+0x1cf/0x1f0
       ret_from_fork+0x24/0x30

-> #1 ((work_completion)(&dio->complete_work)){+.+.}:
       process_one_work+0x447/0x9f0
       worker_thread+0x63/0x5a0
       kthread+0x1cf/0x1f0
       ret_from_fork+0x24/0x30

-> #0 ((wq_completion)"dio/%s"sb->s_id){+.+.}:
       lock_acquire+0xc5/0x200
       flush_workqueue+0xf3/0x970
       drain_workqueue+0xec/0x220
       destroy_workqueue+0x23/0x350
       sb_init_dio_done_wq+0x6a/0x80
       do_blockdev_direct_IO+0x1f33/0x4be0
       __blockdev_direct_IO+0x79/0x86
       ext4_direct_IO+0x5df/0xbb0
       generic_file_direct_write+0x119/0x220
       __generic_file_write_iter+0x131/0x2d0
       ext4_file_write_iter+0x3fa/0x710
       aio_write+0x235/0x330
       io_submit_one+0x510/0xeb0
       __x64_sys_io_submit+0x122/0x340
       do_syscall_64+0x71/0x220
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

other info that might help us debug this:

Chain exists of:
  (wq_completion)"dio/%s"sb->s_id --> (work_completion)(&dio->complete_work) --> &sb->s_type->i_mutex_key#14

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&sb->s_type->i_mutex_key#14);
                               lock((work_completion)(&dio->complete_work));
                               lock(&sb->s_type->i_mutex_key#14);
  lock((wq_completion)"dio/%s"sb->s_id);

 *** DEADLOCK ***

1 lock held by fio/4129:
 #0: 00000000a0acecf9 (&sb->s_type->i_mutex_key#14){+.+.}, at: ext4_file_write_iter+0x154/0x710

stack backtrace:
CPU: 3 PID: 4129 Comm: fio Not tainted 4.19.0-dbg+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
Call Trace:
 dump_stack+0x86/0xc5
 print_circular_bug.isra.32+0x20a/0x218
 __lock_acquire+0x1c68/0x1cf0
 lock_acquire+0xc5/0x200
 flush_workqueue+0xf3/0x970
 drain_workqueue+0xec/0x220
 destroy_workqueue+0x23/0x350
 sb_init_dio_done_wq+0x6a/0x80
 do_blockdev_direct_IO+0x1f33/0x4be0
 __blockdev_direct_IO+0x79/0x86
 ext4_direct_IO+0x5df/0xbb0
 generic_file_direct_write+0x119/0x220
 __generic_file_write_iter+0x131/0x2d0
 ext4_file_write_iter+0x3fa/0x710
 aio_write+0x235/0x330
 io_submit_one+0x510/0xeb0
 __x64_sys_io_submit+0x122/0x340
 do_syscall_64+0x71/0x220
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Cc: Johannes Berg <johannes.berg@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 include/linux/workqueue.h | 1 +
 kernel/workqueue.c        | 6 +++++-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 60d673e15632..375ec764f148 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -344,6 +344,7 @@ enum {
 	__WQ_ORDERED		= 1 << 17, /* internal: workqueue is ordered */
 	__WQ_LEGACY		= 1 << 18, /* internal: create*_workqueue() */
 	__WQ_ORDERED_EXPLICIT	= 1 << 19, /* internal: alloc_ordered_workqueue() */
+	__WQ_HAS_BEEN_USED	= 1 << 20, /* internal: work has been queued */
 
 	WQ_MAX_ACTIVE		= 512,	  /* I like 512, better ideas? */
 	WQ_MAX_UNBOUND_PER_CPU	= 4,	  /* 4 * #cpus for unbound wq */
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index fc9129d5909e..0ef275fe526c 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1383,6 +1383,10 @@ static void __queue_work(int cpu, struct workqueue_struct *wq,
 	if (unlikely(wq->flags & __WQ_DRAINING) &&
 	    WARN_ON_ONCE(!is_chained_work(wq)))
 		return;
+
+	if (!(wq->flags & __WQ_HAS_BEEN_USED))
+		wq->flags |= __WQ_HAS_BEEN_USED;
+
 retry:
 	if (req_cpu == WORK_CPU_UNBOUND)
 		cpu = wq_select_unbound_cpu(raw_smp_processor_id());
@@ -2889,7 +2893,7 @@ static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr,
 	 * workqueues the deadlock happens when the rescuer stalls, blocking
 	 * forward progress.
 	 */
-	if (!from_cancel &&
+	if (!from_cancel && (pwq->wq->flags & __WQ_HAS_BEEN_USED) &&
 	    (pwq->wq->saved_max_active == 1 || pwq->wq->rescuer)) {
 		lock_acquire_exclusive(&pwq->wq->lockdep_map, 0, 0, NULL,
 				       _THIS_IP_);
-- 
2.19.1.568.g152ad8e336-goog
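
For readers who want to see the effect of the two workqueue.c hunks in
isolation, here is a minimal user-space sketch of the flag logic. It is
not kernel code and makes several assumptions: struct model_wq, the
MODEL_WQ_HAS_BEEN_USED constant and every model_* function are
hypothetical stand-ins invented for illustration; only the bit position
(1 << 20) and the shape of the two conditions are taken from the patch.
Locking, pool workqueues and lockdep itself are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for __WQ_HAS_BEEN_USED; same bit as in the patch. */
enum { MODEL_WQ_HAS_BEEN_USED = 1 << 20 };

/* Hypothetical, heavily reduced stand-in for struct workqueue_struct. */
struct model_wq {
	unsigned int flags;
	int saved_max_active;
	bool has_rescuer;
};

/* Models the __queue_work() hunk: record that work was queued at least once. */
static void model_queue_work(struct model_wq *wq)
{
	if (!(wq->flags & MODEL_WQ_HAS_BEEN_USED))
		wq->flags |= MODEL_WQ_HAS_BEEN_USED;
}

/*
 * Models the patched start_flush_work() condition: the lockdep map is only
 * acquired when the flush is not part of a cancel, the workqueue has been
 * used, and it is either ordered (max_active == 1) or has a rescuer.
 */
static bool model_flush_takes_lockdep_map(const struct model_wq *wq,
					  bool from_cancel)
{
	return !from_cancel && (wq->flags & MODEL_WQ_HAS_BEEN_USED) &&
	       (wq->saved_max_active == 1 || wq->has_rescuer);
}

/* Walks one never-used workqueue through queue and flush; returns 0 on success. */
static int model_demo(void)
{
	struct model_wq wq = { .flags = 0, .saved_max_active = 1,
			       .has_rescuer = false };

	/* Never used: the annotation is skipped. */
	assert(!model_flush_takes_lockdep_map(&wq, false));

	model_queue_work(&wq);

	/* Used once: the annotation applies again, except on the cancel path. */
	assert(model_flush_takes_lockdep_map(&wq, false));
	assert(!model_flush_takes_lockdep_map(&wq, true));
	return 0;
}
```

Calling model_demo() from a small main() exercises both states. The
sketch only shows that a workqueue on which nothing was ever queued no
longer contributes a lock acquisition to the dependency graph under the
patched condition; it says nothing about whether that is the right
criterion.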