All of lore.kernel.org
 help / color / mirror / Atom feed
From: Christoph Hellwig <hch@lst.de>
To: Jens Axboe <axboe@kernel.dk>, Josef Bacik <josef@toxicpanda.com>,
	Minchan Kim <minchan@kernel.org>, Nitin Gupta <ngupta@vflare.org>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	Jan Kara <jack@suse.cz>, "Darrick J . Wong" <djwong@kernel.org>,
	Ming Lei <ming.lei@redhat.com>,
	Matteo Croce <mcroce@microsoft.com>,
	linux-block@vger.kernel.org, nbd@other.debian.org,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Subject: [PATCH 13/15] loop: avoid loop_validate_mutex/lo_mutex in ->release
Date: Wed, 30 Mar 2022 07:29:15 +0200	[thread overview]
Message-ID: <20220330052917.2566582-14-hch@lst.de> (raw)
In-Reply-To: <20220330052917.2566582-1-hch@lst.de>

From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>

Since ->release is called with disk->open_mutex held, and __loop_clr_fd()
 from lo_release() is called via ->release when disk_openers() == 0, we are
guaranteed that "struct file" which will be passed to loop_validate_file()
via fget() cannot be the loop device __loop_clr_fd(lo, true) will clear.
Thus, there is no need to hold loop_validate_mutex from __loop_clr_fd()
if release == true.

When I made commit 3ce6e1f662a91097 ("loop: reintroduce global lock for
safe loop_validate_file() traversal"), I wrote "It is acceptable for
loop_validate_file() to succeed, for actual clear operation has not started
yet.". But now I came to feel why it is acceptable to succeed.

It seems that the loop driver was added in Linux 1.3.68, and

  if (lo->lo_refcnt > 1)
    return -EBUSY;

check in loop_clr_fd() was there from the beginning. The intent of this
check was unclear. But now I think that current

  disk_openers(lo->lo_disk) > 1

form is there for three reasons.

(1) Avoid I/O errors when some process which opens and reads from this
    loop device in response to uevent notification (e.g. systemd-udevd),
    as described in commit a1ecac3b0656a682 ("loop: Make explicit loop
    device destruction lazy"). This opener is short-lived because it is
    likely that the file descriptor used by that process is closed soon.

(2) Avoid I/O errors caused by underlying layer of stacked loop devices
    (i.e. ioctl(some_loop_fd, LOOP_SET_FD, other_loop_fd)) being suddenly
    disappeared. This opener is long-lived because this reference is
    associated with not a file descriptor but lo->lo_backing_file.

(3) Avoid I/O errors caused by underlying layer of mounted loop device
    (i.e. mount(some_loop_device, some_mount_point)) being suddenly
    disappeared. This opener is long-lived because this reference is
    associated with not a file descriptor but mount.

While race in (1) might be acceptable, (2) and (3) should be checked
racelessly. That is, make sure that __loop_clr_fd() will not run if
loop_validate_file() succeeds, by doing refcount check with global lock
held when explicit loop device destruction is requested.

As a result of no longer waiting for lo->lo_mutex after setting Lo_rundown,
we can remove pointless BUG_ON(lo->lo_state != Lo_rundown) check.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/block/loop.c | 38 +++++++++++++-------------------------
 1 file changed, 13 insertions(+), 25 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index bfd21af7aa38b..ea3b0e055657f 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1137,27 +1137,6 @@ static void __loop_clr_fd(struct loop_device *lo, bool release)
 	struct file *filp;
 	gfp_t gfp = lo->old_gfp_mask;
 
-	/*
-	 * Flush loop_configure() and loop_change_fd(). It is acceptable for
-	 * loop_validate_file() to succeed, for actual clear operation has not
-	 * started yet.
-	 */
-	mutex_lock(&loop_validate_mutex);
-	mutex_unlock(&loop_validate_mutex);
-	/*
-	 * loop_validate_file() now fails because l->lo_state != Lo_bound
-	 * became visible.
-	 */
-
-	/*
-	 * Since this function is called upon "ioctl(LOOP_CLR_FD)" xor "close()
-	 * after ioctl(LOOP_CLR_FD)", it is a sign of something going wrong if
-	 * lo->lo_state has changed while waiting for lo->lo_mutex.
-	 */
-	mutex_lock(&lo->lo_mutex);
-	BUG_ON(lo->lo_state != Lo_rundown);
-	mutex_unlock(&lo->lo_mutex);
-
 	if (test_bit(QUEUE_FLAG_WC, &lo->lo_queue->queue_flags))
 		blk_queue_write_cache(lo->lo_queue, false, false);
 
@@ -1244,11 +1223,20 @@ static int loop_clr_fd(struct loop_device *lo)
 {
 	int err;
 
-	err = mutex_lock_killable(&lo->lo_mutex);
+	/*
+	 * Since lo_ioctl() is called without locks held, it is possible that
+	 * loop_configure()/loop_change_fd() and loop_clr_fd() run in parallel.
+	 *
+	 * Therefore, use global lock when setting Lo_rundown state in order to
+	 * make sure that loop_validate_file() will fail if the "struct file"
+	 * which loop_configure()/loop_change_fd() found via fget() was this
+	 * loop device.
+	 */
+	err = loop_global_lock_killable(lo, true);
 	if (err)
 		return err;
 	if (lo->lo_state != Lo_bound) {
-		mutex_unlock(&lo->lo_mutex);
+		loop_global_unlock(lo, true);
 		return -ENXIO;
 	}
 	/*
@@ -1263,11 +1251,11 @@ static int loop_clr_fd(struct loop_device *lo)
 	 */
 	if (atomic_read(&lo->lo_refcnt) > 1) {
 		lo->lo_flags |= LO_FLAGS_AUTOCLEAR;
-		mutex_unlock(&lo->lo_mutex);
+		loop_global_unlock(lo, true);
 		return 0;
 	}
 	lo->lo_state = Lo_rundown;
-	mutex_unlock(&lo->lo_mutex);
+	loop_global_unlock(lo, true);
 
 	__loop_clr_fd(lo, false);
 	return 0;
-- 
2.30.2


  parent reply	other threads:[~2022-03-30  5:30 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-30  5:29 yet another approach to fix the loop lock order inversions v6 Christoph Hellwig
2022-03-30  5:29 ` [PATCH 01/15] nbd: use the correct block_device in nbd_bdev_reset Christoph Hellwig
2022-04-18 12:54   ` Jens Axboe
2022-03-30  5:29 ` [PATCH 02/15] zram: cleanup reset_store Christoph Hellwig
2022-03-30  5:29 ` [PATCH 03/15] zram: cleanup zram_remove Christoph Hellwig
2022-03-30  5:29 ` [PATCH 04/15] block: add a disk_openers helper Christoph Hellwig
2022-03-30  5:29 ` [PATCH 05/15] block: turn bdev->bd_openers into an atomic_t Christoph Hellwig
2022-03-30  5:29 ` [PATCH 06/15] loop: de-duplicate the idle worker freeing code Christoph Hellwig
2022-03-30  5:29 ` [PATCH 07/15] loop: initialize the worker tracking fields once Christoph Hellwig
2022-03-30  5:29 ` [PATCH 08/15] loop: remove the racy bd_inode->i_mapping->nrpages asserts Christoph Hellwig
2022-03-30  5:29 ` [PATCH 09/15] loop: don't freeze the queue in lo_release Christoph Hellwig
2022-03-30  5:29 ` [PATCH 10/15] loop: only freeze the queue in __loop_clr_fd when needed Christoph Hellwig
2022-03-30  5:29 ` [PATCH 11/15] loop: implement ->free_disk Christoph Hellwig
2022-03-30  5:29 ` [PATCH 12/15] loop: suppress uevents while reconfiguring the device Christoph Hellwig
2022-03-30  5:29 ` Christoph Hellwig [this message]
2022-03-30  5:29 ` [PATCH 14/15] loop: remove lo_refcount and avoid lo_mutex in ->open / ->release Christoph Hellwig
2022-03-30  5:29 ` [PATCH 15/15] loop: don't destroy lo->workqueue in __loop_clr_fd Christoph Hellwig
2022-04-04  7:42 ` yet another approach to fix the loop lock order inversions v6 Christoph Hellwig
2022-04-04  9:39   ` Tetsuo Handa
2022-04-05  6:28     ` Christoph Hellwig
2022-04-05  6:38       ` Tetsuo Handa
2022-04-05  9:10       ` Jan Kara
2022-04-18  9:39     ` Tetsuo Handa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220330052917.2566582-14-hch@lst.de \
    --to=hch@lst.de \
    --cc=axboe@kernel.dk \
    --cc=djwong@kernel.org \
    --cc=jack@suse.cz \
    --cc=josef@toxicpanda.com \
    --cc=linux-block@vger.kernel.org \
    --cc=mcroce@microsoft.com \
    --cc=minchan@kernel.org \
    --cc=ming.lei@redhat.com \
    --cc=nbd@other.debian.org \
    --cc=ngupta@vflare.org \
    --cc=penguin-kernel@i-love.sakura.ne.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.