From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: Christoph Hellwig <hch@lst.de>, Jens Axboe <axboe@kernel.dk>
Cc: Jan Kara <jack@suse.cz>,
	linux-block@vger.kernel.org, Ming Lei <ming.lei@redhat.com>
Subject: Re: yet another approach to fix loop autoclear for xfstets xfs/049
Date: Thu, 27 Jan 2022 10:05:45 +0900	[thread overview]
Message-ID: <7bebf860-2415-7eb6-55a1-47dc4439d9e9@I-love.SAKURA.ne.jp> (raw)
In-Reply-To: <20220126155040.1190842-1-hch@lst.de>

On 2022/01/27 0:50, Christoph Hellwig wrote:
> Hi Jens, hi Tetsuo,
> 
> this series uses the approach from Tetsuo to delay the destroy_workqueue
> call, extended by a delayed teardown of the workers to fix a potential
> race window where the workqueue can still be around after finishing the
> commands.  It then also removes the queue freeing in release that is
> not needed, to fix the dependency chain for that (which can't be
> reported by lockdep) as well.

I want to remove the disk->open_mutex => lo->lo_mutex => I/O completion chain itself
from /proc/lockdep. Even if a real deadlock cannot happen thanks to the lo->lo_refcnt
exclusion, I consider it a bad sign that the disk->open_mutex => lo->lo_mutex dependency
gets recorded at all: it makes it difficult to spot real deadlock possibilities when
things change in the future. I think that lo_release() is doing too many things under
disk->open_mutex.
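
To make the ordering concrete, here is a minimal user-space sketch (pthreads standing
in for the kernel mutexes; open_mutex/lo_mutex/release_path are hypothetical names for
illustration, not the kernel API) of the point in the release path where lockdep would
record the open_mutex => lo_mutex dependency:

```c
#include <pthread.h>

/* Stand-ins for disk->open_mutex and lo->lo_mutex. */
static pthread_mutex_t open_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lo_mutex = PTHREAD_MUTEX_INITIALIZER;

/*
 * Models the current lo_release() path: lo_mutex is acquired while
 * open_mutex is already held, which is exactly the acquisition that
 * makes lockdep record the open_mutex => lo_mutex dependency.
 * Returns 0 on success.
 */
static int release_path(void)
{
	if (pthread_mutex_lock(&open_mutex))
		return -1;
	if (pthread_mutex_lock(&lo_mutex)) {	/* dependency recorded here */
		pthread_mutex_unlock(&open_mutex);
		return -1;
	}
	pthread_mutex_unlock(&lo_mutex);
	pthread_mutex_unlock(&open_mutex);
	return 0;
}
```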

I tried to defer taking lo->lo_mutex in lo_release() as much as possible, but this
chain still remains.

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index cf7830a68113..a9abd213b38d 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1706,12 +1706,19 @@ static int lo_open(struct block_device *bdev, fmode_t mode)
 	struct loop_device *lo = bdev->bd_disk->private_data;
 	int err;
 
+	/*
+	 * Since lo_open() and lo_release() are serialized by disk->open_mutex,
+	 * we can racelessly increment/decrement lo->lo_refcnt without holding
+	 * lo->lo_mutex.
+	 */
 	if (atomic_inc_return(&lo->lo_refcnt) > 1)
 		return 0;
 
 	err = mutex_lock_killable(&lo->lo_mutex);
-	if (err)
+	if (err) {
+		atomic_dec(&lo->lo_refcnt);
 		return err;
+	}
 	if (lo->lo_state == Lo_deleting) {
 		atomic_dec(&lo->lo_refcnt);
 		err = -ENXIO;
@@ -1727,16 +1734,18 @@ static void lo_release(struct gendisk *disk, fmode_t mode)
 	if (!atomic_dec_and_test(&lo->lo_refcnt))
 		return;
 
-	mutex_lock(&lo->lo_mutex);
-	if (lo->lo_state == Lo_bound &&
-	    (lo->lo_flags & LO_FLAGS_AUTOCLEAR)) {
-		lo->lo_state = Lo_rundown;
-		mutex_unlock(&lo->lo_mutex);
-		__loop_clr_fd(lo, true);
+	/*
+	 * Since lo_open() and lo_release() are serialized by disk->open_mutex,
+	 * and lo->lo_refcnt == 0 means nobody is using this device, we can read
+	 * lo->lo_state and lo->lo_flags without holding lo->lo_mutex.
+	 */
+	if (lo->lo_state != Lo_bound || !(lo->lo_flags & LO_FLAGS_AUTOCLEAR))
 		return;
-	}
-
+	mutex_lock(&lo->lo_mutex);
+	lo->lo_state = Lo_rundown;
 	mutex_unlock(&lo->lo_mutex);
+	__loop_clr_fd(lo, true);
+	return;
 }
 
 static const struct block_device_operations lo_fops = {

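The refcount transitions this relies on can be sketched in user space (C11 atomics;
sketch_open()/sketch_release() are hypothetical names, not the kernel API). Since
lo_open() and lo_release() are serialized by disk->open_mutex, the 0 -> 1 and 1 -> 0
transitions never race, so the last closer may inspect state without lo->lo_mutex:

```c
#include <stdatomic.h>

/*
 * Mirrors the patch's use of lo->lo_refcnt: the outer disk->open_mutex
 * serializes open/release, so these transitions are never concurrent.
 */
static atomic_int lo_refcnt;

/*
 * Returns 1 for the first opener (who must then take lo_mutex),
 * 0 for every later opener (who may return immediately).
 */
static int sketch_open(void)
{
	return atomic_fetch_add(&lo_refcnt, 1) == 0;
}

/*
 * Returns 1 for the last closer (who may then read lo_state and
 * lo_flags locklessly), 0 otherwise.
 */
static int sketch_release(void)
{
	return atomic_fetch_sub(&lo_refcnt, 1) == 1;
}
```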
In __loop_clr_fd(), we wait for loop_validate_mutex. loop_validate_mutex can be
held while loop_change_fd() calls blk_mq_freeze_queue(). That is, loop recursion
interacts with other loop devices.

While each loop device takes care of only a single backing file, we can use multiple
loop devices for multiple backing files within the same mount point (e.g. /dev/loop0
attached to /mnt/file0 and /dev/loop1 attached to /mnt/file1), can't we? But fsfreeze
operates per mount point (e.g. /mnt/). That is, fsfreeze also interacts with other
loop devices, doesn't it?

disk->open_mutex is per-loop-device serialization, whereas loop_validate_mutex and
fsfreeze are global serialization. I'm worried about nesting global serialization
under per-device serialization when we already know that the combination of sysfs +
fsfreeze can cause deadlocks. I would prefer that lo_open()/lo_release() run without
holding disk->open_mutex.



Thread overview: 31+ messages
2022-01-26 15:50 yet another approach to fix loop autoclear for xfstets xfs/049 Christoph Hellwig
2022-01-26 15:50 ` [PATCH 1/8] loop: de-duplicate the idle worker freeing code Christoph Hellwig
2022-01-27  9:36   ` Jan Kara
2022-01-26 15:50 ` [PATCH 2/8] loop: initialize the worker tracking fields once Christoph Hellwig
2022-01-27  9:45   ` Jan Kara
2022-01-26 15:50 ` [PATCH 3/8] block: remove the racy bd_inode->i_mapping->nrpages asserts Christoph Hellwig
2022-01-27  9:47   ` Jan Kara
2022-01-27  9:49     ` Christoph Hellwig
2022-01-27 12:23       ` Jan Kara
2022-01-28  7:26         ` Christoph Hellwig
2022-01-28 11:45           ` Jan Kara
2022-01-26 15:50 ` [PATCH 4/8] loop: only take lo_mutex for the last reference in lo_release Christoph Hellwig
2022-01-27  9:48   ` Jan Kara
2022-01-27 10:19     ` Jan Kara
2022-01-27 10:28       ` Christoph Hellwig
2022-01-26 15:50 ` [PATCH 5/8] loop: only take lo_mutex for the first reference in lo_open Christoph Hellwig
2022-01-27 10:28   ` Jan Kara
2022-01-27 10:31     ` Tetsuo Handa
2022-01-26 15:50 ` [PATCH 6/8] loop: don't freeze the queue in lo_release Christoph Hellwig
2022-01-27 10:42   ` Jan Kara
2022-01-28  6:46     ` Christoph Hellwig
2022-01-26 15:50 ` [PATCH 7/8] loop: only freeze the queue in __loop_clr_fd when needed Christoph Hellwig
2022-01-27 11:01   ` Jan Kara
2022-01-28  6:48     ` Christoph Hellwig
2022-01-26 15:50 ` [PATCH 8/8] loop: make autoclear operation synchronous again Christoph Hellwig
2022-01-27 11:04   ` Jan Kara
2022-01-26 19:38 ` yet another approach to fix loop autoclear for xfstets xfs/049 Darrick J. Wong
2022-01-27 16:50   ` Darrick J. Wong
2022-01-27  1:05 ` Tetsuo Handa [this message]
2022-01-28  7:08   ` Christoph Hellwig
2022-01-28  9:52     ` Tetsuo Handa
