From: Mike Christie <mchristi@redhat.com>
To: Sun Ke <sunke32@huawei.com>, josef@toxicpanda.com
Cc: axboe@kernel.dk, linux-block@vger.kernel.org,
nbd@other.debian.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] nbd: fix potential deadlock in nbd_config_put()
Date: Sun, 1 Dec 2019 11:11:32 -0600 [thread overview]
Message-ID: <5DE3F444.9080706@redhat.com> (raw)
In-Reply-To: <1574937951-92828-1-git-send-email-sunke32@huawei.com>
[-- Attachment #1: Type: text/plain, Size: 3404 bytes --]
On 11/28/2019 04:45 AM, Sun Ke wrote:
> I got a deadlock report from syzkaller:
>
> [ 234.427696] ============================================
> [ 234.428327] WARNING: possible recursive locking detected
> [ 234.429011] 5.4.0-rc4+ #1 Not tainted
> [ 234.429528] --------------------------------------------
> [ 234.430162] kworker/u9:0/894 is trying to acquire lock:
> [ 234.430911] ffff0000d3aca128 ((wq_completion)knbd0-recv){+.+.}, at: flush_workqueue+0xbc/0xfe8
> [ 234.432330]
> [ 234.432330] but task is already holding lock:
> [ 234.432927] ffff0000d3aca128 ((wq_completion)knbd0-recv){+.+.}, at: process_one_work+0x6a0/0x17e8
> [ 234.433983]
> [ 234.433983] other info that might help us debug this:
> [ 234.434615] Possible unsafe locking scenario:
> [ 234.434615]
> [ 234.435263] CPU0
> [ 234.435613] ----
> [ 234.436019] lock((wq_completion)knbd0-recv);
> [ 234.436521] lock((wq_completion)knbd0-recv);
> [ 234.437166]
> [ 234.437166] *** DEADLOCK ***
> [ 234.437166]
> [ 234.437763] May be due to missing lock nesting notation
> [ 234.437763]
> [ 234.438559] 3 locks held by kworker/u9:0/894:
> [ 234.439040] #0: ffff0000d3aca128 ((wq_completion)knbd0-recv){+.+.}, at: process_one_work+0x6a0/0x17e8
> [ 234.440185] #1: ffff0000d344fd50 ((work_completion)(&args->work)){+.+.}, at: process_one_work+0x6a0/0x17e8
> [ 234.442209] #2: ffff0000d723cd78 (&nbd->config_lock){+.+.}, at: refcount_dec_and_mutex_lock+0x5c/0x128
> [ 234.443380]
> [ 234.443380] stack backtrace:
> [ 234.444271] CPU: 3 PID: 894 Comm: kworker/u9:0 Not tainted 5.4.0-rc4+ #1
> [ 234.444989] Hardware name: linux,dummy-virt (DT)
> [ 234.446077] Workqueue: knbd0-recv recv_work
> [ 234.446909] Call trace:
> [ 234.447372] dump_backtrace+0x0/0x358
> [ 234.447877] show_stack+0x28/0x38
> [ 234.448347] dump_stack+0x15c/0x1ec
> [ 234.448838] __lock_acquire+0x12ec/0x2f78
> [ 234.449474] lock_acquire+0x180/0x590
> [ 234.450075] flush_workqueue+0x104/0xfe8
> [ 234.450587] drain_workqueue+0x164/0x390
> [ 234.451090] destroy_workqueue+0x30/0x560
> [ 234.451598] nbd_config_put+0x308/0x700
> [ 234.452093] recv_work+0x198/0x1f0
> [ 234.452556] process_one_work+0x7ac/0x17e8
> [ 234.453189] worker_thread+0x36c/0xb70
> [ 234.453788] kthread+0x2f4/0x378
> [ 234.454257] ret_from_fork+0x10/0x18
>
> The root cause is recv_work() is the last one to drop the config
> ref and try to destroy the workqueue from inside the work queue.
>
> There are two ways to fix the bug. The first way is flushing the
> workqueue before dropping the initial refcount and making sure
> recv_work() will not be the last owner of nbd_config. However it
> is hard for ioctl interface. Because nbd_clear_sock_ioctl() may
> not be invoked, so we need to flush the workqueue in nbd_release()
> and that will lead to another deadlock because recv_work can not
> exit from nbd_read_stat() loop.
>
> The second way is using another work to put nbd_config asynchronously
> for recv_work().
>
Can we also increment the refcount before we do wait_event_interruptible
in nbd_start_device_ioctl, always flush the workqueue then drop the
refcount after the flush? See compile tested only patch.
We will then also know the recv side has stopped after
nbd_start_device_ioctl has returned, so new NBD_DO_IT calls for the same
device do not race with shutdowns of previous uses.
[-- Attachment #2: nbd-move-flush.patch --]
[-- Type: text/x-patch, Size: 933 bytes --]
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 57532465fb83..f8597d2fb365 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1293,13 +1293,15 @@ static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *b
if (max_part)
bdev->bd_invalidated = 1;
+
+ refcount_inc(&nbd->config_refs);
mutex_unlock(&nbd->config_lock);
ret = wait_event_interruptible(config->recv_wq,
atomic_read(&config->recv_threads) == 0);
- if (ret) {
+ if (ret)
sock_shutdown(nbd);
- flush_workqueue(nbd->recv_workq);
- }
+ flush_workqueue(nbd->recv_workq);
+
mutex_lock(&nbd->config_lock);
nbd_bdev_reset(bdev);
/* user requested, ignore socket errors */
@@ -1307,6 +1309,7 @@ static int nbd_start_device_ioctl(struct nbd_device *nbd, struct block_device *b
ret = 0;
if (test_bit(NBD_RT_TIMEDOUT, &config->runtime_flags))
ret = -ETIMEDOUT;
+ nbd_config_put(nbd);
return ret;
}
next prev parent reply other threads:[~2019-12-01 17:11 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-11-28 10:45 [PATCH] nbd: fix potential deadlock in nbd_config_put() Sun Ke
2019-12-01 17:11 ` Mike Christie [this message]
2019-12-03 12:22 ` sunke (E)
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5DE3F444.9080706@redhat.com \
--to=mchristi@redhat.com \
--cc=axboe@kernel.dk \
--cc=josef@toxicpanda.com \
--cc=linux-block@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=nbd@other.debian.org \
--cc=sunke32@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).