All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
To: Tyler Hicks <tyhicks@linux.microsoft.com>,
	Pavel Tatashin <pasha.tatashin@soleen.com>,
	Petr Vorel <pvorel@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>, Jens Axboe <axboe@kernel.dk>,
	Stefan Haberland <sth@linux.ibm.com>,
	Jan Hoeppner <hoeppner@linux.ibm.com>,
	linux-block <linux-block@vger.kernel.org>,
	syzbot <syzbot+61e04e51b7ac86930589@syzkaller.appspotmail.com>,
	Tejun Heo <tj@kernel.org>
Subject: Re: [syzbot] possible deadlock in del_gendisk
Date: Thu, 10 Jun 2021 22:37:48 +0900	[thread overview]
Message-ID: <718b44b5-a230-1907-e1e1-9f609cb67322@i-love.sakura.ne.jp> (raw)
In-Reply-To: <20210609164639.GM4910@sequoia>

On 2021/06/10 1:46, Tyler Hicks wrote:
> On 2021-06-10 01:31:17, Tetsuo Handa wrote:
>> Hello, Christoph.
>>
>> I'm currently trying full bisection.

I reached commit c76f48eb5c084b1e ("block: take bd_mutex around delete_partitions in del_gendisk").

  # bad: [fc0586062816559defb14c947319ef8c4c326fb3] Merge tag 'for-5.13/drivers-2021-04-27' of git://git.kernel.dk/linux-block
  # good: [42dec9a936e7696bea1f27d3c5a0068cd9aa95fd] Merge tag 'perf-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
  # good: [68a32ba14177d4a21c4a9a941cf1d7aea86d436f] Merge tag 'drm-next-2021-04-28' of git://anongit.freedesktop.org/drm/drm
  # good: [a800abd3ecb9acc55821f7ac9bba6c956b36a595] net: enetc: move skb creation into enetc_build_skb
  # good: [6cc8e7430801fa238bd7d3acae1eb406c6e02fe1] loop: scale loop device by introducing per device lock
  git bisect start 'fc0586062816559defb14c947319ef8c4c326fb3' '42dec9a936e7696bea1f27d3c5a0068cd9aa95fd' '68a32ba14177d4a21c4a9a941cf1d7aea86d436f' 'a800abd3ecb9acc55821f7ac9bba6c956b36a595' '6cc8e7430801fa238bd7d3acae1eb406c6e02fe1'
  # good: [2958a995edc94654df690318df7b9b49e5a3ef88] block/rnbd-clt: Support polling mode for IO latency optimization
  git bisect good 2958a995edc94654df690318df7b9b49e5a3ef88
  # good: [16b3d0cf5bad844daaf436ad2e9061de0fe36e5c] Merge tag 'sched-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
  git bisect good 16b3d0cf5bad844daaf436ad2e9061de0fe36e5c
  # bad: [cbb749cf377aa8aa32a036ebe9dd9f2d89037bf0] block: remove an incorrect check from blk_rq_append_bio
  git bisect bad cbb749cf377aa8aa32a036ebe9dd9f2d89037bf0
  # good: [9bb33f24abbd0fa2fadad01ec75438d7cc239189] block: refactor the bounce buffering code
  git bisect good 9bb33f24abbd0fa2fadad01ec75438d7cc239189
  # bad: [c76f48eb5c084b1e15c931ae8cc1826cd771d70d] block: take bd_mutex around delete_partitions in del_gendisk
  git bisect bad c76f48eb5c084b1e15c931ae8cc1826cd771d70d
  # good: [b896fa85e0ee4f09ba4be48a3f405fc82c38afb4] dasd: use bdev_disk_changed instead of blk_drop_partitions
  git bisect good b896fa85e0ee4f09ba4be48a3f405fc82c38afb4
  # good: [473338be3aaea117a7133720305f240eb7f68951] block: move more syncing and invalidation to delete_partition
  git bisect good 473338be3aaea117a7133720305f240eb7f68951
  # good: [d3c4a43d9291279c28b26757351a6ab72c110753] block: refactor blk_drop_partitions
  git bisect good d3c4a43d9291279c28b26757351a6ab72c110753
  # first bad commit: [c76f48eb5c084b1e15c931ae8cc1826cd771d70d] block: take bd_mutex around delete_partitions in del_gendisk

> 
> Thanks for doing this. I haven't had a chance to retry this commit with
> lockdep but I did re-review it and didn't think that it would be the
> cause of this lockdep report since it strictly reduced the usage of the
> loop_ctl_mutex.

OK. It seems that a patch shown below avoids this warning.

Since I think loop_remove() does not access resources in
loop module, I think we can drop loop_ctl_mutex as soon as
idr_remove(&loop_index_idr, lo->lo_number) completes.

Hmm, what protects lo_open() from loop_exit() destroying everything?
Is module's refcount taken before calling lo_open() ? If yes, does
module_exiting and/or loop_ctl_mutex in loop_exit() help protecting
something? If no, nothing protects and just crashes the kernel?

Is my understanding correct? Do you think this approach is OK?

 drivers/block/loop.c |   32 ++++++++++++++++++++++++++------
 1 file changed, 26 insertions(+), 6 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index d58d68f3c7cd..5baa0207512e 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -80,18 +80,19 @@
 #include <linux/ioprio.h>
 #include <linux/blk-cgroup.h>
 
 #include "loop.h"
 
 #include <linux/uaccess.h>
 
 static DEFINE_IDR(loop_index_idr);
 static DEFINE_MUTEX(loop_ctl_mutex);
+static bool module_exiting;
 
 static int max_part;
 static int part_shift;
 
 static int transfer_xor(struct loop_device *lo, int cmd,
 			struct page *raw_page, unsigned raw_off,
 			struct page *loop_page, unsigned loop_off,
 			int size, sector_t real_block)
 {
@@ -1877,25 +1878,29 @@ static int lo_compat_ioctl(struct block_device *bdev, fmode_t mode,
 }
 #endif
 
 static int lo_open(struct block_device *bdev, fmode_t mode)
 {
 	struct loop_device *lo;
 	int err;
 
 	/*
-	 * take loop_ctl_mutex to protect lo pointer from race with
-	 * loop_control_ioctl(LOOP_CTL_REMOVE), however, to reduce contention
-	 * release it prior to updating lo->lo_refcnt.
+	 * Take loop_ctl_mutex to protect lo pointer from race with
+	 * loop_control_ioctl(LOOP_CTL_REMOVE) and loop_exit(). However,
+	 * to reduce contention, release it prior to updating lo->lo_refcnt.
 	 */
 	err = mutex_lock_killable(&loop_ctl_mutex);
 	if (err)
 		return err;
+	if (unlikely(module_exiting)) {
+		mutex_unlock(&loop_ctl_mutex);
+		return -EINVAL;
+	}
 	lo = bdev->bd_disk->private_data;
 	if (!lo) {
 		mutex_unlock(&loop_ctl_mutex);
 		return -ENXIO;
 	}
 	err = mutex_lock_killable(&lo->lo_mutex);
 	mutex_unlock(&loop_ctl_mutex);
 	if (err)
 		return err;
@@ -2073,18 +2078,19 @@ static int loop_init_request(struct blk_mq_tag_set *set, struct request *rq,
 	return 0;
 }
 
 static const struct blk_mq_ops loop_mq_ops = {
 	.queue_rq       = loop_queue_rq,
 	.init_request	= loop_init_request,
 	.complete	= lo_complete_rq,
 };
 
+/* Must be called with loop_ctl_mutex held. */
 static int loop_add(struct loop_device **l, int i)
 {
 	struct loop_device *lo;
 	struct gendisk *disk;
 	int err;
 
 	err = -ENOMEM;
 	lo = kzalloc(sizeof(*lo), GFP_KERNEL);
 	if (!lo)
@@ -2180,18 +2186,19 @@ static int loop_add(struct loop_device **l, int i)
 	blk_mq_free_tag_set(&lo->tag_set);
 out_free_idr:
 	idr_remove(&loop_index_idr, i);
 out_free_dev:
 	kfree(lo);
 out:
 	return err;
 }
 
+/* Must not be called with loop_ctl_mutex or lo->lo_mutex held. */
 static void loop_remove(struct loop_device *lo)
 {
 	del_gendisk(lo->lo_disk);
 	blk_cleanup_queue(lo->lo_queue);
 	blk_mq_free_tag_set(&lo->tag_set);
 	put_disk(lo->lo_disk);
 	mutex_destroy(&lo->lo_mutex);
 	kfree(lo);
 }
@@ -2251,18 +2258,22 @@ static void loop_probe(dev_t dev)
 static long loop_control_ioctl(struct file *file, unsigned int cmd,
 			       unsigned long parm)
 {
 	struct loop_device *lo;
 	int ret;
 
 	ret = mutex_lock_killable(&loop_ctl_mutex);
 	if (ret)
 		return ret;
+	if (unlikely(module_exiting)) {
+		mutex_unlock(&loop_ctl_mutex);
+		return -EINVAL;
+	}
 
 	ret = -ENOSYS;
 	switch (cmd) {
 	case LOOP_CTL_ADD:
 		ret = loop_lookup(&lo, parm);
 		if (ret >= 0) {
 			ret = -EEXIST;
 			break;
 		}
@@ -2282,19 +2293,21 @@ static long loop_control_ioctl(struct file *file, unsigned int cmd,
 		}
 		if (atomic_read(&lo->lo_refcnt) > 0) {
 			ret = -EBUSY;
 			mutex_unlock(&lo->lo_mutex);
 			break;
 		}
 		lo->lo_disk->private_data = NULL;
 		mutex_unlock(&lo->lo_mutex);
 		idr_remove(&loop_index_idr, lo->lo_number);
+		mutex_unlock(&loop_ctl_mutex);
 		loop_remove(lo);
+		return ret;
 		break;
 	case LOOP_CTL_GET_FREE:
 		ret = loop_lookup(&lo, -1);
 		if (ret >= 0)
 			break;
 		ret = loop_add(&lo, -1);
 	}
 	mutex_unlock(&loop_ctl_mutex);
 
@@ -2391,28 +2404,35 @@ static int loop_exit_cb(int id, void *ptr, void *data)
 {
 	struct loop_device *lo = ptr;
 
 	loop_remove(lo);
 	return 0;
 }
 
 static void __exit loop_exit(void)
 {
-	mutex_lock(&loop_ctl_mutex);
+	module_exiting = true;
 
+	/* Block future lo_open()/loop_control_ioctl() callers. */
+	mutex_lock(&loop_ctl_mutex);
+	mutex_unlock(&loop_ctl_mutex);
+	/*
+	 * Since loop_exit_cb() calls loop_remove(), we can not take
+	 * loop_ctl_mutex here. But since module_exiting == true is blocking
+	 * lo_open() and loop_control_ioctl(), it is safe to access
+	 * loop_index_idr without taking loop_ctl_mutex here.
+	 */
 	idr_for_each(&loop_index_idr, &loop_exit_cb, NULL);
 	idr_destroy(&loop_index_idr);
 
 	unregister_blkdev(LOOP_MAJOR, "loop");
 
 	misc_deregister(&loop_misc);
-
-	mutex_unlock(&loop_ctl_mutex);
 }
 
 module_init(loop_init);
 module_exit(loop_exit);
 
 #ifndef MODULE
 static int __init max_loop_setup(char *str)
 {
 	max_loop = simple_strtol(str, NULL, 0);



  reply	other threads:[~2021-06-10 13:37 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-13 17:33 [syzbot] possible deadlock in del_gendisk syzbot
2021-04-13 17:41 ` Steven Rostedt
2021-04-13 17:43   ` Steven Rostedt
2021-04-13 18:24     ` Dmitry Vyukov
2021-04-13 18:40       ` Steven Rostedt
2021-04-13 18:43         ` Steven Rostedt
2021-04-16  7:51           ` Bisections with different bug manifestations Dmitry Vyukov
2021-04-16 13:13             ` Steven Rostedt
2021-04-16 13:26               ` Dmitry Vyukov
2021-04-16 13:48                 ` Dmitry Vyukov
2021-06-07 10:56 ` [syzbot] possible deadlock in del_gendisk Tetsuo Handa
2021-06-09 16:31   ` Tetsuo Handa
2021-06-09 16:46     ` Tyler Hicks
2021-06-10 13:37       ` Tetsuo Handa [this message]
2021-06-11  6:46         ` Tetsuo Handa
2021-06-11 15:14           ` [PATCH] loop: drop loop_ctl_mutex around del_gendisk() in loop_remove() Tetsuo Handa
2021-06-15  5:30             ` Tetsuo Handa
2021-06-15  5:31               ` syzbot
2021-06-15  9:04             ` Jan Kara
2021-06-11 14:47       ` [syzbot] possible deadlock in del_gendisk Tetsuo Handa
2021-06-11 15:11         ` Pavel Tatashin
2021-06-11 15:18           ` Pavel Tatashin
2021-06-11 15:49             ` Tetsuo Handa
2021-06-12  2:35               ` Tetsuo Handa
2021-06-13 11:01                 ` Tetsuo Handa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=718b44b5-a230-1907-e1e1-9f609cb67322@i-love.sakura.ne.jp \
    --to=penguin-kernel@i-love.sakura.ne.jp \
    --cc=axboe@kernel.dk \
    --cc=hch@lst.de \
    --cc=hoeppner@linux.ibm.com \
    --cc=linux-block@vger.kernel.org \
    --cc=pasha.tatashin@soleen.com \
    --cc=pvorel@suse.cz \
    --cc=sth@linux.ibm.com \
    --cc=syzbot+61e04e51b7ac86930589@syzkaller.appspotmail.com \
    --cc=tj@kernel.org \
    --cc=tyhicks@linux.microsoft.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.