All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ming Lei <ming.lei@canonical.com>
To: Jens Axboe <axboe@kernel.dk>,
	linux-kernel@vger.kernel.org,
	Christoph Hellwig <hch@infradead.org>, Tejun Heo <tj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Jarod Wilson <jarod@redhat.com>,
	David Herrmann <dh.herrmann@gmail.com>,
	Markus Pargmann <mpa@pengutronix.de>,
	nbd-general@lists.sourceforge.net,
	Stefan Haberland <stefan.haberland@de.ibm.com>,
	Sebastian Ott <sebott@linux.vnet.ibm.com>,
	Fabian Frederick <fabf@skynet.be>,
	Peter Zijlstra <peterz@infradead.org>,
	linux-s390@vger.kernel.org, Ming Lei <ming.lei@canonical.com>
Subject: [PATCH v3 2/7] block: loop: don't hold lo_ctl_mutex in lo_open
Date: Wed,  6 May 2015 12:26:23 +0800	[thread overview]
Message-ID: <1430886389-26878-3-git-send-email-ming.lei@canonical.com> (raw)
In-Reply-To: <1430886389-26878-1-git-send-email-ming.lei@canonical.com>

The lo_ctl_mutex is held for running all ioctl handlers, and
in some ioctl handlers, ioctl_by_bdev(BLKRRPART) is called for
rereading partitions, which requires bd_mutex.

So it is easy to cause failure because trylock(bd_mutex) may
fail inside blkdev_reread_part(), and follows the lock context:

blkid or other application:
	->open()
		->mutex_lock(bd_mutex)
		->lo_open()
			->mutex_lock(lo_ctl_mutex)

losetup(set fd ioctl):
	->mutex_lock(lo_ctl_mutex)
	->ioctl_by_bdev(BLKRRPART)
		->trylock(bd_mutex)

This patch trys to eliminate the ABBA lock dependency by removing
lo_ctl_mutext in lo_open() with the following approach:

1) make lo_refcnt as atomic_t and avoid acquiring lo_ctl_mutex in lo_open():
	- for open vs. add/del loop, no any problem because of loop_index_mutex
	- freeze request queue during clr_fd, so I/O can't come until
	  clearing fd is completed, like the effect of holding lo_ctl_mutex
	  in lo_open
	- both open() and release() have been serialized by bd_mutex already

2) don't hold lo_ctl_mutex for decreasing/checking lo_refcnt in
lo_release(), then lo_ctl_mutex is only required for the last release.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jarod Wilson <jarod@redhat.com>
Acked-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 drivers/block/loop.c | 21 ++++++++++++---------
 drivers/block/loop.h |  2 +-
 2 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 1bee523..b3e294e 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -831,7 +831,7 @@ static int loop_clr_fd(struct loop_device *lo)
 	 * <dev>/do something like mkfs/losetup -d <dev> causing the losetup -d
 	 * command to fail with EBUSY.
 	 */
-	if (lo->lo_refcnt > 1) {
+	if (atomic_read(&lo->lo_refcnt) > 1) {
 		lo->lo_flags |= LO_FLAGS_AUTOCLEAR;
 		mutex_unlock(&lo->lo_ctl_mutex);
 		return 0;
@@ -840,6 +840,9 @@ static int loop_clr_fd(struct loop_device *lo)
 	if (filp == NULL)
 		return -EINVAL;
 
+	/* freeze request queue during the transition */
+	blk_mq_freeze_queue(lo->lo_queue);
+
 	spin_lock_irq(&lo->lo_lock);
 	lo->lo_state = Lo_rundown;
 	lo->lo_backing_file = NULL;
@@ -871,6 +874,8 @@ static int loop_clr_fd(struct loop_device *lo)
 	lo->lo_state = Lo_unbound;
 	/* This is safe: open() is still holding a reference. */
 	module_put(THIS_MODULE);
+	blk_mq_unfreeze_queue(lo->lo_queue);
+
 	if (lo->lo_flags & LO_FLAGS_PARTSCAN && bdev)
 		ioctl_by_bdev(bdev, BLKRRPART, 0);
 	lo->lo_flags = 0;
@@ -1330,9 +1335,7 @@ static int lo_open(struct block_device *bdev, fmode_t mode)
 		goto out;
 	}
 
-	mutex_lock(&lo->lo_ctl_mutex);
-	lo->lo_refcnt++;
-	mutex_unlock(&lo->lo_ctl_mutex);
+	atomic_inc(&lo->lo_refcnt);
 out:
 	mutex_unlock(&loop_index_mutex);
 	return err;
@@ -1343,11 +1346,10 @@ static void lo_release(struct gendisk *disk, fmode_t mode)
 	struct loop_device *lo = disk->private_data;
 	int err;
 
-	mutex_lock(&lo->lo_ctl_mutex);
-
-	if (--lo->lo_refcnt)
-		goto out;
+	if (atomic_dec_return(&lo->lo_refcnt))
+		return;
 
+	mutex_lock(&lo->lo_ctl_mutex);
 	if (lo->lo_flags & LO_FLAGS_AUTOCLEAR) {
 		/*
 		 * In autoclear mode, stop the loop thread
@@ -1601,6 +1603,7 @@ static int loop_add(struct loop_device **l, int i)
 		disk->flags |= GENHD_FL_NO_PART_SCAN;
 	disk->flags |= GENHD_FL_EXT_DEVT;
 	mutex_init(&lo->lo_ctl_mutex);
+	atomic_set(&lo->lo_refcnt, 0);
 	lo->lo_number		= i;
 	spin_lock_init(&lo->lo_lock);
 	disk->major		= LOOP_MAJOR;
@@ -1718,7 +1721,7 @@ static long loop_control_ioctl(struct file *file, unsigned int cmd,
 			mutex_unlock(&lo->lo_ctl_mutex);
 			break;
 		}
-		if (lo->lo_refcnt > 0) {
+		if (atomic_read(&lo->lo_refcnt) > 0) {
 			ret = -EBUSY;
 			mutex_unlock(&lo->lo_ctl_mutex);
 			break;
diff --git a/drivers/block/loop.h b/drivers/block/loop.h
index 49564ed..25e8997 100644
--- a/drivers/block/loop.h
+++ b/drivers/block/loop.h
@@ -28,7 +28,7 @@ struct loop_func_table;
 
 struct loop_device {
 	int		lo_number;
-	int		lo_refcnt;
+	atomic_t	lo_refcnt;
 	loff_t		lo_offset;
 	loff_t		lo_sizelimit;
 	int		lo_flags;
-- 
1.9.1


  parent reply	other threads:[~2015-05-06  4:26 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-06  4:26 [PATCH v3 0/7] block: reread partitions changes and fix for loop Ming Lei
2015-05-06  4:26 ` [PATCH v3 1/7] block: export blkdev_reread_part() and __blkdev_reread_part() Ming Lei
2015-05-06  4:26 ` Ming Lei [this message]
2015-05-06  4:26 ` [PATCH v3 3/7] block: loop: fix another reread part failure Ming Lei
2015-05-06  4:26 ` [PATCH v3 4/7] block: nbd: convert to blkdev_reread_part() Ming Lei
2015-05-06  4:26 ` [PATCH v3 5/7] block: dasd_genhd: convert to blkdev_reread_part Ming Lei
2015-05-06  4:26 ` [PATCH v3 6/7] block: replace trylock with mutex_lock in blkdev_reread_part() Ming Lei
2015-05-06  4:26 ` [PATCH v3 7/7] s390/block/dasd: remove obsolete while -EBUSY loop Ming Lei
2015-05-18 16:11 ` [PATCH v3 0/7] block: reread partitions changes and fix for loop Christoph Hellwig
2015-05-20  9:17   ` Ming Lei
2015-05-20  9:17     ` Ming Lei
2015-05-20 15:03     ` Jens Axboe
2015-05-20 15:03       ` Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1430886389-26878-3-git-send-email-ming.lei@canonical.com \
    --to=ming.lei@canonical.com \
    --cc=akpm@linux-foundation.org \
    --cc=axboe@kernel.dk \
    --cc=dh.herrmann@gmail.com \
    --cc=fabf@skynet.be \
    --cc=hch@infradead.org \
    --cc=jarod@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=mpa@pengutronix.de \
    --cc=nbd-general@lists.sourceforge.net \
    --cc=peterz@infradead.org \
    --cc=sebott@linux.vnet.ibm.com \
    --cc=stefan.haberland@de.ibm.com \
    --cc=tj@kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.