* [PATCH 1/7] block: export blkdev_reread_part() and __blkdev_reread_part()
2015-04-08 6:23 ` [PATCH 0/7] block: reread partitions improvements Jarod Wilson
@ 2015-04-08 6:23 ` Jarod Wilson
2015-04-08 14:50 ` Ming Lei
2015-04-08 6:23 ` [PATCH 2/7] block: loop: don't hold lo_ctl_mutex in lo_open Jarod Wilson
` (5 subsequent siblings)
6 siblings, 1 reply; 37+ messages in thread
From: Jarod Wilson @ 2015-04-08 6:23 UTC (permalink / raw)
To: linux-kernel
Cc: Jarod Wilson, Christoph Hellwig, Jens Axboe, Tejun Heo,
Alexander Viro, Markus Pargmann, Stefan Weinhuber,
Stefan Haberland, Sebastian Ott, Fabian Frederick, Ming Lei,
David Herrmann, Mike Galbraith, Andrew Morton, Peter Zijlstra,
nbd-general, linux-s390
This patch exports blkdev_reread_part() for block drivers, also
introduce __blkdev_reread_part(), a lockless version.
For some drivers, such as loop, a reread of partitions can be run
from the release path, and bd_mutex may already be held prior to
calling ioctl_by_bdev(bdev, BLKRRPART, 0), so introduce a lockless
path for use in such cases.
CC: Christoph Hellwig <hch@infradead.org>
CC: Jens Axboe <axboe@kernel.dk>
CC: Tejun Heo <tj@kernel.org>
CC: Alexander Viro <viro@zeniv.linux.org.uk>
CC: Markus Pargmann <mpa@pengutronix.de>
CC: Stefan Weinhuber <wein@de.ibm.com>
CC: Stefan Haberland <stefan.haberland@de.ibm.com>
CC: Sebastian Ott <sebott@linux.vnet.ibm.com>
CC: Fabian Frederick <fabf@skynet.be>
CC: Ming Lei <ming.lei@canonical.com>
CC: David Herrmann <dh.herrmann@gmail.com>
CC: Mike Galbraith <bitbucket@online.de>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Peter Zijlstra <peterz@infradead.org>
CC: nbd-general@lists.sourceforge.net
CC: linux-s390@vger.kernel.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
---
block/ioctl.c | 26 +++++++++++++++++++++++---
include/linux/fs.h | 3 +++
2 files changed, 26 insertions(+), 3 deletions(-)
diff --git a/block/ioctl.c b/block/ioctl.c
index 7d8befd..64a4fcb 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -150,21 +150,41 @@ static int blkpg_ioctl(struct block_device *bdev, struct blkpg_ioctl_arg __user
}
}
-static int blkdev_reread_part(struct block_device *bdev)
+/*
+ * This is an exported API for the block driver, and will not
+ * acquire bd_mutex, leaving it up to the caller to handle
+ * any necessary locking.
+ */
+int __blkdev_reread_part(struct block_device *bdev)
{
struct gendisk *disk = bdev->bd_disk;
- int res;
if (!disk_part_scan_enabled(disk) || bdev != bdev->bd_contains)
return -EINVAL;
if (!capable(CAP_SYS_ADMIN))
return -EACCES;
+
+ return rescan_partitions(disk, bdev);
+}
+EXPORT_SYMBOL(__blkdev_reread_part);
+
+/*
+ * This is an exported API for the block driver, and will
+ * acquire bd_mutex. Make sure you aren't calling it with
+ * bd_mutex already held, or we'll return -EBUSY.
+ */
+int blkdev_reread_part(struct block_device *bdev)
+{
+ int res;
+
if (!mutex_trylock(&bdev->bd_mutex))
return -EBUSY;
- res = rescan_partitions(disk, bdev);
+ res = __blkdev_reread_part(bdev);
mutex_unlock(&bdev->bd_mutex);
+
return res;
}
+EXPORT_SYMBOL(blkdev_reread_part);
static int blk_ioctl_discard(struct block_device *bdev, uint64_t start,
uint64_t len, int secure)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index f4131e8..11398e3 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2245,6 +2245,9 @@ extern struct block_device *blkdev_get_by_path(const char *path, fmode_t mode,
extern struct block_device *blkdev_get_by_dev(dev_t dev, fmode_t mode,
void *holder);
extern void blkdev_put(struct block_device *bdev, fmode_t mode);
+extern int __blkdev_reread_part(struct block_device *bdev);
+extern int blkdev_reread_part(struct block_device *bdev);
+
#ifdef CONFIG_SYSFS
extern int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk);
extern void bd_unlink_disk_holder(struct block_device *bdev,
--
1.8.3.1
^ permalink raw reply related [flat|nested] 37+ messages in thread
* Re: [PATCH 1/7] block: export blkdev_reread_part() and __blkdev_reread_part()
2015-04-08 6:23 ` [PATCH 1/7] block: export blkdev_reread_part() and __blkdev_reread_part() Jarod Wilson
@ 2015-04-08 14:50 ` Ming Lei
2015-04-08 15:03 ` Peter Zijlstra
0 siblings, 1 reply; 37+ messages in thread
From: Ming Lei @ 2015-04-08 14:50 UTC (permalink / raw)
To: Jarod Wilson
Cc: Linux Kernel Mailing List, Christoph Hellwig, Jens Axboe,
Tejun Heo, Alexander Viro, Markus Pargmann, Stefan Weinhuber,
Stefan Haberland, Sebastian Ott, Fabian Frederick,
David Herrmann, Mike Galbraith, Andrew Morton, Peter Zijlstra,
nbd-general, linux-s390
On Wed, Apr 8, 2015 at 2:23 PM, Jarod Wilson <jarod@redhat.com> wrote:
> This patch exports blkdev_reread_part() for block drivers, also
> introduce __blkdev_reread_part(), a lockless version.
Generally the term of lockless is used when lock is avoided,
and that isn't the case of __blkdev_reread_part(), because
bd_mutex is definitely required here.
>
> For some drivers, such as loop, a reread of partitions can be run
> from the release path, and bd_mutex may already be held prior to
> calling ioctl_by_bdev(bdev, BLKRRPART, 0), so introduce a lockless
> path for use in such cases.
>
> CC: Christoph Hellwig <hch@infradead.org>
> CC: Jens Axboe <axboe@kernel.dk>
> CC: Tejun Heo <tj@kernel.org>
> CC: Alexander Viro <viro@zeniv.linux.org.uk>
> CC: Markus Pargmann <mpa@pengutronix.de>
> CC: Stefan Weinhuber <wein@de.ibm.com>
> CC: Stefan Haberland <stefan.haberland@de.ibm.com>
> CC: Sebastian Ott <sebott@linux.vnet.ibm.com>
> CC: Fabian Frederick <fabf@skynet.be>
> CC: Ming Lei <ming.lei@canonical.com>
> CC: David Herrmann <dh.herrmann@gmail.com>
> CC: Mike Galbraith <bitbucket@online.de>
> CC: Andrew Morton <akpm@linux-foundation.org>
> CC: Peter Zijlstra <peterz@infradead.org>
> CC: nbd-general@lists.sourceforge.net
> CC: linux-s390@vger.kernel.org
> Signed-off-by: Jarod Wilson <jarod@redhat.com>
> ---
> block/ioctl.c | 26 +++++++++++++++++++++++---
> include/linux/fs.h | 3 +++
> 2 files changed, 26 insertions(+), 3 deletions(-)
>
> diff --git a/block/ioctl.c b/block/ioctl.c
> index 7d8befd..64a4fcb 100644
> --- a/block/ioctl.c
> +++ b/block/ioctl.c
> @@ -150,21 +150,41 @@ static int blkpg_ioctl(struct block_device *bdev, struct blkpg_ioctl_arg __user
> }
> }
>
> -static int blkdev_reread_part(struct block_device *bdev)
> +/*
> + * This is an exported API for the block driver, and will not
> + * acquire bd_mutex, leaving it up to the caller to handle
> + * any necessary locking.
Actually, the function is introduced and should be used in case
that bd_mutex has been held already, such as clearing fd in
loop release().
> + */
> +int __blkdev_reread_part(struct block_device *bdev)
> {
> struct gendisk *disk = bdev->bd_disk;
> - int res;
>
> if (!disk_part_scan_enabled(disk) || bdev != bdev->bd_contains)
> return -EINVAL;
> if (!capable(CAP_SYS_ADMIN))
> return -EACCES;
> +
> + return rescan_partitions(disk, bdev);
> +}
> +EXPORT_SYMBOL(__blkdev_reread_part);
> +
> +/*
> + * This is an exported API for the block driver, and will
> + * acquire bd_mutex. Make sure you aren't calling it with
> + * bd_mutex already held, or we'll return -EBUSY.
Strictly speaking, it should be "Make sure you aren't calling it
with bd_mutex already held in current context".
> + */
> +int blkdev_reread_part(struct block_device *bdev)
> +{
> + int res;
> +
> if (!mutex_trylock(&bdev->bd_mutex))
> return -EBUSY;
> - res = rescan_partitions(disk, bdev);
> + res = __blkdev_reread_part(bdev);
> mutex_unlock(&bdev->bd_mutex);
> +
> return res;
> }
> +EXPORT_SYMBOL(blkdev_reread_part);
>
> static int blk_ioctl_discard(struct block_device *bdev, uint64_t start,
> uint64_t len, int secure)
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index f4131e8..11398e3 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2245,6 +2245,9 @@ extern struct block_device *blkdev_get_by_path(const char *path, fmode_t mode,
> extern struct block_device *blkdev_get_by_dev(dev_t dev, fmode_t mode,
> void *holder);
> extern void blkdev_put(struct block_device *bdev, fmode_t mode);
> +extern int __blkdev_reread_part(struct block_device *bdev);
> +extern int blkdev_reread_part(struct block_device *bdev);
> +
> #ifdef CONFIG_SYSFS
> extern int bd_link_disk_holder(struct block_device *bdev, struct gendisk *disk);
> extern void bd_unlink_disk_holder(struct block_device *bdev,
> --
> 1.8.3.1
>
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 1/7] block: export blkdev_reread_part() and __blkdev_reread_part()
2015-04-08 14:50 ` Ming Lei
@ 2015-04-08 15:03 ` Peter Zijlstra
2015-04-08 15:27 ` Jarod Wilson
2015-04-08 15:28 ` Ming Lei
0 siblings, 2 replies; 37+ messages in thread
From: Peter Zijlstra @ 2015-04-08 15:03 UTC (permalink / raw)
To: Ming Lei
Cc: Jarod Wilson, Linux Kernel Mailing List, Christoph Hellwig,
Jens Axboe, Tejun Heo, Alexander Viro, Markus Pargmann,
Stefan Weinhuber, Stefan Haberland, Sebastian Ott,
Fabian Frederick, David Herrmann, Mike Galbraith, Andrew Morton,
nbd-general, linux-s390
On Wed, Apr 08, 2015 at 10:50:56PM +0800, Ming Lei wrote:
> > +/*
> > + * This is an exported API for the block driver, and will not
> > + * acquire bd_mutex, leaving it up to the caller to handle
> > + * any necessary locking.
>
> Actually, the function is introduced and should be used in case
> that bd_mutex has been held already, such as clearing fd in
> loop release().
>
> > + */
> > +int __blkdev_reread_part(struct block_device *bdev)
> > {
> > struct gendisk *disk = bdev->bd_disk;
> >
lockdep_assert_held(&bdev->bd_mutex);
is an excellent means of avoiding that comment and verifying its
actually true :-)
> > if (!disk_part_scan_enabled(disk) || bdev != bdev->bd_contains)
> > return -EINVAL;
> > if (!capable(CAP_SYS_ADMIN))
> > return -EACCES;
> > +
> > + return rescan_partitions(disk, bdev);
> > +}
> > +EXPORT_SYMBOL(__blkdev_reread_part);
> > +
> > +/*
> > + * This is an exported API for the block driver, and will
> > + * acquire bd_mutex. Make sure you aren't calling it with
> > + * bd_mutex already held, or we'll return -EBUSY.
>
> Strictly speaking, it should be "Make sure you aren't calling it
> with bd_mutex already held in current context".
>
> > + */
> > +int blkdev_reread_part(struct block_device *bdev)
> > +{
> > + int res;
> > +
> > if (!mutex_trylock(&bdev->bd_mutex))
> > return -EBUSY;
Is that really needed? It seems rather poor form.
> > - res = rescan_partitions(disk, bdev);
> > + res = __blkdev_reread_part(bdev);
> > mutex_unlock(&bdev->bd_mutex);
> > +
> > return res;
> > }
> > +EXPORT_SYMBOL(blkdev_reread_part);
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 1/7] block: export blkdev_reread_part() and __blkdev_reread_part()
2015-04-08 15:03 ` Peter Zijlstra
@ 2015-04-08 15:27 ` Jarod Wilson
2015-04-08 15:28 ` Ming Lei
1 sibling, 0 replies; 37+ messages in thread
From: Jarod Wilson @ 2015-04-08 15:27 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ming Lei, Linux Kernel Mailing List, Christoph Hellwig,
Jens Axboe, Tejun Heo, Alexander Viro, Markus Pargmann,
Stefan Weinhuber, Stefan Haberland, Sebastian Ott,
Fabian Frederick, David Herrmann, Mike Galbraith, Andrew Morton,
nbd-general, linux-s390
On Wed, Apr 08, 2015 at 05:03:25PM +0200, Peter Zijlstra wrote:
> On Wed, Apr 08, 2015 at 10:50:56PM +0800, Ming Lei wrote:
> > > +/*
> > > + * This is an exported API for the block driver, and will not
> > > + * acquire bd_mutex, leaving it up to the caller to handle
> > > + * any necessary locking.
> >
> > Actually, the function is introduced and should be used in case
> > that bd_mutex has been held already, such as clearing fd in
> > loop release().
> >
> > > + */
> > > +int __blkdev_reread_part(struct block_device *bdev)
> > > {
> > > struct gendisk *disk = bdev->bd_disk;
> > >
>
> lockdep_assert_held(&bdev->bd_mutex);
>
> is an excellent means of avoiding that comment and verifying its
> actually true :-)
Ah, yes, that was actually suggested by Christoph as well, I was too hasty
shoving something back out the door on multiple counts.
> > > + */
> > > +int blkdev_reread_part(struct block_device *bdev)
> > > +{
> > > + int res;
> > > +
> > > if (!mutex_trylock(&bdev->bd_mutex))
> > > return -EBUSY;
>
> Is that really needed? It seems rather poor form.
It goes away later in the series and gets converted to a straight
mutex_lock().
--
Jarod Wilson
jarod@redhat.com
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 1/7] block: export blkdev_reread_part() and __blkdev_reread_part()
2015-04-08 15:03 ` Peter Zijlstra
2015-04-08 15:27 ` Jarod Wilson
@ 2015-04-08 15:28 ` Ming Lei
1 sibling, 0 replies; 37+ messages in thread
From: Ming Lei @ 2015-04-08 15:28 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Jarod Wilson, Linux Kernel Mailing List, Christoph Hellwig,
Jens Axboe, Tejun Heo, Alexander Viro, Markus Pargmann,
Stefan Weinhuber, Stefan Haberland, Sebastian Ott,
Fabian Frederick, David Herrmann, Mike Galbraith, Andrew Morton,
nbd-general, linux-s390
On Wed, Apr 8, 2015 at 11:03 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Wed, Apr 08, 2015 at 10:50:56PM +0800, Ming Lei wrote:
>> > +/*
>> > + * This is an exported API for the block driver, and will not
>> > + * acquire bd_mutex, leaving it up to the caller to handle
>> > + * any necessary locking.
>>
>> Actually, the function is introduced and should be used in case
>> that bd_mutex has been held already, such as clearing fd in
>> loop release().
>>
>> > + */
>> > +int __blkdev_reread_part(struct block_device *bdev)
>> > {
>> > struct gendisk *disk = bdev->bd_disk;
>> >
>
> lockdep_assert_held(&bdev->bd_mutex);
>
> is an excellent means of avoiding that comment and verifying its
> actually true :-)
Exactly, and it has been added in my local tree, :-)
>
>> > if (!disk_part_scan_enabled(disk) || bdev != bdev->bd_contains)
>> > return -EINVAL;
>> > if (!capable(CAP_SYS_ADMIN))
>> > return -EACCES;
>> > +
>> > + return rescan_partitions(disk, bdev);
>> > +}
>> > +EXPORT_SYMBOL(__blkdev_reread_part);
>> > +
>> > +/*
>> > + * This is an exported API for the block driver, and will
>> > + * acquire bd_mutex. Make sure you aren't calling it with
>> > + * bd_mutex already held, or we'll return -EBUSY.
>>
>> Strictly speaking, it should be "Make sure you aren't calling it
>> with bd_mutex already held in current context".
>>
>> > + */
>> > +int blkdev_reread_part(struct block_device *bdev)
>> > +{
>> > + int res;
>> > +
>> > if (!mutex_trylock(&bdev->bd_mutex))
>> > return -EBUSY;
>
> Is that really needed? It seems rather poor form.
The trylock will be replaced with mutex_lock in patch 6.
>
>> > - res = rescan_partitions(disk, bdev);
>> > + res = __blkdev_reread_part(bdev);
>> > mutex_unlock(&bdev->bd_mutex);
>> > +
>> > return res;
>> > }
>> > +EXPORT_SYMBOL(blkdev_reread_part);
^ permalink raw reply [flat|nested] 37+ messages in thread
* [PATCH 2/7] block: loop: don't hold lo_ctl_mutex in lo_open
2015-04-08 6:23 ` [PATCH 0/7] block: reread partitions improvements Jarod Wilson
2015-04-08 6:23 ` [PATCH 1/7] block: export blkdev_reread_part() and __blkdev_reread_part() Jarod Wilson
@ 2015-04-08 6:23 ` Jarod Wilson
2015-04-08 6:50 ` Ming Lei
2015-04-08 6:23 ` [PATCH 3/7] block: loop: fix another reread part failure Jarod Wilson
` (4 subsequent siblings)
6 siblings, 1 reply; 37+ messages in thread
From: Jarod Wilson @ 2015-04-08 6:23 UTC (permalink / raw)
To: linux-kernel
Cc: Ming Lei, Christoph Hellwig, Jens Axboe, Tejun Heo,
Alexander Viro, Markus Pargmann, Stefan Weinhuber,
Stefan Haberland, Sebastian Ott, Fabian Frederick,
David Herrmann, Mike Galbraith, Andrew Morton, Peter Zijlstra,
nbd-general, linux-s390, Jarod Wilson
From: Ming Lei <ming.lei@canonical.com>
The lo_ctl_mutex is held for running all ioctl handlers, and
in some ioctl handlers, ioctl_by_bdev(BLKRRPART) is called for
rereading partitions, which requires bd_mutex.
So it is easy to cause failure because trylock(bd_mutex) may
fail inside blkdev_reread_part(), and follows the lock context:
blkid or other application:
->open()
->mutex_lock(bd_mutex)
->lo_open()
->mutex_lock(lo_ctl_mutex)
losetup(set fd ioctl):
->mutex_lock(lo_ctl_mutex)
->ioctl_by_bdev(BLKRRPART)
->trylock(bd_mutex)
This patch trys to eliminate the ABBA lock dependency by removing
lo_ctl_mutext in lo_open() with the following approach:
1) introduce lo_open_mutex to protect lo_refcnt and avoid acquiring
lo_ctl_mutex in lo_open():
- for open vs. add/del loop, no any problem because of loop_index_mutex
- lo_open_mutex is used for syncing open() and loop_clr_fd()
- both open() and release() have been serialized by bd_mutex already
2) don't hold lo_ctl_mutex for decreasing/checking lo_refcnt in
lo_release(), then lo_ctl_mutex is only required for the last release.
CC: Christoph Hellwig <hch@infradead.org>
CC: Jens Axboe <axboe@kernel.dk>
CC: Tejun Heo <tj@kernel.org>
CC: Alexander Viro <viro@zeniv.linux.org.uk>
CC: Markus Pargmann <mpa@pengutronix.de>
CC: Stefan Weinhuber <wein@de.ibm.com>
CC: Stefan Haberland <stefan.haberland@de.ibm.com>
CC: Sebastian Ott <sebott@linux.vnet.ibm.com>
CC: Fabian Frederick <fabf@skynet.be>
CC: Ming Lei <ming.lei@canonical.com>
CC: David Herrmann <dh.herrmann@gmail.com>
CC: Mike Galbraith <bitbucket@online.de>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Peter Zijlstra <peterz@infradead.org>
CC: nbd-general@lists.sourceforge.net
CC: linux-s390@vger.kernel.org
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
---
drivers/block/loop.c | 32 ++++++++++++++++++++++++++------
drivers/block/loop.h | 1 +
2 files changed, 27 insertions(+), 6 deletions(-)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index d1f168b..81a6bc1 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -879,14 +879,18 @@ static int loop_clr_fd(struct loop_device *lo)
* <dev>/do something like mkfs/losetup -d <dev> causing the losetup -d
* command to fail with EBUSY.
*/
+ mutex_lock(&lo->lo_open_mutex);
if (lo->lo_refcnt > 1) {
+ mutex_unlock(&lo->lo_open_mutex);
lo->lo_flags |= LO_FLAGS_AUTOCLEAR;
mutex_unlock(&lo->lo_ctl_mutex);
return 0;
}
- if (filp == NULL)
+ if (filp == NULL) {
+ mutex_unlock(&lo->lo_open_mutex);
return -EINVAL;
+ }
spin_lock_irq(&lo->lo_lock);
lo->lo_state = Lo_rundown;
@@ -919,6 +923,15 @@ static int loop_clr_fd(struct loop_device *lo)
lo->lo_state = Lo_unbound;
/* This is safe: open() is still holding a reference. */
module_put(THIS_MODULE);
+
+ /*
+ * Unlock open_mutex for avoiding -EBUSY of rereading part:
+ * - try to acquire bd_mutex from reread part
+ * - another task is opening the loop with holding bd_mutex
+ * and trys to acquire open_mutex
+ */
+ mutex_unlock(&lo->lo_open_mutex);
+
if (lo->lo_flags & LO_FLAGS_PARTSCAN && bdev)
ioctl_by_bdev(bdev, BLKRRPART, 0);
lo->lo_flags = 0;
@@ -1376,9 +1389,9 @@ static int lo_open(struct block_device *bdev, fmode_t mode)
goto out;
}
- mutex_lock(&lo->lo_ctl_mutex);
+ mutex_lock(&lo->lo_open_mutex);
lo->lo_refcnt++;
- mutex_unlock(&lo->lo_ctl_mutex);
+ mutex_unlock(&lo->lo_open_mutex);
out:
mutex_unlock(&loop_index_mutex);
return err;
@@ -1387,13 +1400,16 @@ out:
static void lo_release(struct gendisk *disk, fmode_t mode)
{
struct loop_device *lo = disk->private_data;
- int err;
+ int err, ref;
- mutex_lock(&lo->lo_ctl_mutex);
+ mutex_lock(&lo->lo_open_mutex);
+ ref = --lo->lo_refcnt;
+ mutex_unlock(&lo->lo_open_mutex);
- if (--lo->lo_refcnt)
+ if (ref)
goto out;
+ mutex_lock(&lo->lo_ctl_mutex);
if (lo->lo_flags & LO_FLAGS_AUTOCLEAR) {
/*
* In autoclear mode, stop the loop thread
@@ -1646,6 +1662,7 @@ static int loop_add(struct loop_device **l, int i)
disk->flags |= GENHD_FL_NO_PART_SCAN;
disk->flags |= GENHD_FL_EXT_DEVT;
mutex_init(&lo->lo_ctl_mutex);
+ mutex_init(&lo->lo_open_mutex);
lo->lo_number = i;
spin_lock_init(&lo->lo_lock);
disk->major = LOOP_MAJOR;
@@ -1763,11 +1780,14 @@ static long loop_control_ioctl(struct file *file, unsigned int cmd,
mutex_unlock(&lo->lo_ctl_mutex);
break;
}
+ mutex_lock(&lo->lo_open_mutex);
if (lo->lo_refcnt > 0) {
ret = -EBUSY;
+ mutex_unlock(&lo->lo_open_mutex);
mutex_unlock(&lo->lo_ctl_mutex);
break;
}
+ mutex_unlock(&lo->lo_open_mutex);
lo->lo_disk->private_data = NULL;
mutex_unlock(&lo->lo_ctl_mutex);
idr_remove(&loop_index_idr, lo->lo_number);
diff --git a/drivers/block/loop.h b/drivers/block/loop.h
index 301c27f..1b4acf2 100644
--- a/drivers/block/loop.h
+++ b/drivers/block/loop.h
@@ -59,6 +59,7 @@ struct loop_device {
bool write_started;
int lo_state;
struct mutex lo_ctl_mutex;
+ struct mutex lo_open_mutex;
struct request_queue *lo_queue;
struct blk_mq_tag_set tag_set;
--
1.8.3.1
^ permalink raw reply related [flat|nested] 37+ messages in thread
* Re: [PATCH 2/7] block: loop: don't hold lo_ctl_mutex in lo_open
2015-04-08 6:23 ` [PATCH 2/7] block: loop: don't hold lo_ctl_mutex in lo_open Jarod Wilson
@ 2015-04-08 6:50 ` Ming Lei
2015-04-08 13:40 ` Jarod Wilson
0 siblings, 1 reply; 37+ messages in thread
From: Ming Lei @ 2015-04-08 6:50 UTC (permalink / raw)
To: Jarod Wilson
Cc: Linux Kernel Mailing List, Christoph Hellwig, Jens Axboe,
Tejun Heo, Alexander Viro, Markus Pargmann, Stefan Weinhuber,
Stefan Haberland, Sebastian Ott, Fabian Frederick,
David Herrmann, Mike Galbraith, Andrew Morton, Peter Zijlstra,
nbd-general, linux-s390
Hi Jarod,
On Wed, Apr 8, 2015 at 2:23 PM, Jarod Wilson <jarod@redhat.com> wrote:
> From: Ming Lei <ming.lei@canonical.com>
>
> The lo_ctl_mutex is held for running all ioctl handlers, and
> in some ioctl handlers, ioctl_by_bdev(BLKRRPART) is called for
> rereading partitions, which requires bd_mutex.
>
> So it is easy to cause failure because trylock(bd_mutex) may
> fail inside blkdev_reread_part(), and follows the lock context:
>
> blkid or other application:
> ->open()
> ->mutex_lock(bd_mutex)
> ->lo_open()
> ->mutex_lock(lo_ctl_mutex)
>
> losetup(set fd ioctl):
> ->mutex_lock(lo_ctl_mutex)
> ->ioctl_by_bdev(BLKRRPART)
> ->trylock(bd_mutex)
>
> This patch trys to eliminate the ABBA lock dependency by removing
> lo_ctl_mutext in lo_open() with the following approach:
>
> 1) introduce lo_open_mutex to protect lo_refcnt and avoid acquiring
> lo_ctl_mutex in lo_open():
It is a bit quick since I said the lo_open_mutex can be removed,
and Christoph agreed that too.
So looks we still need to post another version, :-)
> - for open vs. add/del loop, no any problem because of loop_index_mutex
> - lo_open_mutex is used for syncing open() and loop_clr_fd()
> - both open() and release() have been serialized by bd_mutex already
>
> 2) don't hold lo_ctl_mutex for decreasing/checking lo_refcnt in
> lo_release(), then lo_ctl_mutex is only required for the last release.
>
> CC: Christoph Hellwig <hch@infradead.org>
> CC: Jens Axboe <axboe@kernel.dk>
> CC: Tejun Heo <tj@kernel.org>
> CC: Alexander Viro <viro@zeniv.linux.org.uk>
> CC: Markus Pargmann <mpa@pengutronix.de>
> CC: Stefan Weinhuber <wein@de.ibm.com>
> CC: Stefan Haberland <stefan.haberland@de.ibm.com>
> CC: Sebastian Ott <sebott@linux.vnet.ibm.com>
> CC: Fabian Frederick <fabf@skynet.be>
> CC: Ming Lei <ming.lei@canonical.com>
> CC: David Herrmann <dh.herrmann@gmail.com>
> CC: Mike Galbraith <bitbucket@online.de>
> CC: Andrew Morton <akpm@linux-foundation.org>
> CC: Peter Zijlstra <peterz@infradead.org>
> CC: nbd-general@lists.sourceforge.net
> CC: linux-s390@vger.kernel.org
> Signed-off-by: Ming Lei <ming.lei@canonical.com>
> Signed-off-by: Jarod Wilson <jarod@redhat.com>
> ---
> drivers/block/loop.c | 32 ++++++++++++++++++++++++++------
> drivers/block/loop.h | 1 +
> 2 files changed, 27 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index d1f168b..81a6bc1 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -879,14 +879,18 @@ static int loop_clr_fd(struct loop_device *lo)
> * <dev>/do something like mkfs/losetup -d <dev> causing the losetup -d
> * command to fail with EBUSY.
> */
> + mutex_lock(&lo->lo_open_mutex);
> if (lo->lo_refcnt > 1) {
> + mutex_unlock(&lo->lo_open_mutex);
> lo->lo_flags |= LO_FLAGS_AUTOCLEAR;
> mutex_unlock(&lo->lo_ctl_mutex);
> return 0;
> }
>
> - if (filp == NULL)
> + if (filp == NULL) {
> + mutex_unlock(&lo->lo_open_mutex);
> return -EINVAL;
> + }
>
> spin_lock_irq(&lo->lo_lock);
> lo->lo_state = Lo_rundown;
> @@ -919,6 +923,15 @@ static int loop_clr_fd(struct loop_device *lo)
> lo->lo_state = Lo_unbound;
> /* This is safe: open() is still holding a reference. */
> module_put(THIS_MODULE);
> +
> + /*
> + * Unlock open_mutex for avoiding -EBUSY of rereading part:
> + * - try to acquire bd_mutex from reread part
> + * - another task is opening the loop with holding bd_mutex
> + * and trys to acquire open_mutex
> + */
> + mutex_unlock(&lo->lo_open_mutex);
> +
> if (lo->lo_flags & LO_FLAGS_PARTSCAN && bdev)
> ioctl_by_bdev(bdev, BLKRRPART, 0);
> lo->lo_flags = 0;
> @@ -1376,9 +1389,9 @@ static int lo_open(struct block_device *bdev, fmode_t mode)
> goto out;
> }
>
> - mutex_lock(&lo->lo_ctl_mutex);
> + mutex_lock(&lo->lo_open_mutex);
> lo->lo_refcnt++;
> - mutex_unlock(&lo->lo_ctl_mutex);
> + mutex_unlock(&lo->lo_open_mutex);
> out:
> mutex_unlock(&loop_index_mutex);
> return err;
> @@ -1387,13 +1400,16 @@ out:
> static void lo_release(struct gendisk *disk, fmode_t mode)
> {
> struct loop_device *lo = disk->private_data;
> - int err;
> + int err, ref;
>
> - mutex_lock(&lo->lo_ctl_mutex);
> + mutex_lock(&lo->lo_open_mutex);
> + ref = --lo->lo_refcnt;
> + mutex_unlock(&lo->lo_open_mutex);
>
> - if (--lo->lo_refcnt)
> + if (ref)
> goto out;
>
> + mutex_lock(&lo->lo_ctl_mutex);
> if (lo->lo_flags & LO_FLAGS_AUTOCLEAR) {
> /*
> * In autoclear mode, stop the loop thread
> @@ -1646,6 +1662,7 @@ static int loop_add(struct loop_device **l, int i)
> disk->flags |= GENHD_FL_NO_PART_SCAN;
> disk->flags |= GENHD_FL_EXT_DEVT;
> mutex_init(&lo->lo_ctl_mutex);
> + mutex_init(&lo->lo_open_mutex);
> lo->lo_number = i;
> spin_lock_init(&lo->lo_lock);
> disk->major = LOOP_MAJOR;
> @@ -1763,11 +1780,14 @@ static long loop_control_ioctl(struct file *file, unsigned int cmd,
> mutex_unlock(&lo->lo_ctl_mutex);
> break;
> }
> + mutex_lock(&lo->lo_open_mutex);
> if (lo->lo_refcnt > 0) {
> ret = -EBUSY;
> + mutex_unlock(&lo->lo_open_mutex);
> mutex_unlock(&lo->lo_ctl_mutex);
> break;
> }
> + mutex_unlock(&lo->lo_open_mutex);
> lo->lo_disk->private_data = NULL;
> mutex_unlock(&lo->lo_ctl_mutex);
> idr_remove(&loop_index_idr, lo->lo_number);
> diff --git a/drivers/block/loop.h b/drivers/block/loop.h
> index 301c27f..1b4acf2 100644
> --- a/drivers/block/loop.h
> +++ b/drivers/block/loop.h
> @@ -59,6 +59,7 @@ struct loop_device {
> bool write_started;
> int lo_state;
> struct mutex lo_ctl_mutex;
> + struct mutex lo_open_mutex;
>
> struct request_queue *lo_queue;
> struct blk_mq_tag_set tag_set;
> --
> 1.8.3.1
>
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 2/7] block: loop: don't hold lo_ctl_mutex in lo_open
2015-04-08 6:50 ` Ming Lei
@ 2015-04-08 13:40 ` Jarod Wilson
2015-04-08 14:00 ` Jarod Wilson
0 siblings, 1 reply; 37+ messages in thread
From: Jarod Wilson @ 2015-04-08 13:40 UTC (permalink / raw)
To: Ming Lei
Cc: Linux Kernel Mailing List, Christoph Hellwig, Jens Axboe,
Tejun Heo, Alexander Viro, Markus Pargmann, Stefan Weinhuber,
Stefan Haberland, Sebastian Ott, Fabian Frederick,
David Herrmann, Mike Galbraith, Andrew Morton, Peter Zijlstra,
nbd-general, linux-s390
On Wed, Apr 08, 2015 at 02:50:59PM +0800, Ming Lei wrote:
> Hi Jarod,
>
> On Wed, Apr 8, 2015 at 2:23 PM, Jarod Wilson <jarod@redhat.com> wrote:
> > From: Ming Lei <ming.lei@canonical.com>
> >
> > The lo_ctl_mutex is held for running all ioctl handlers, and
> > in some ioctl handlers, ioctl_by_bdev(BLKRRPART) is called for
> > rereading partitions, which requires bd_mutex.
> >
> > So it is easy to cause failure because trylock(bd_mutex) may
> > fail inside blkdev_reread_part(), and follows the lock context:
> >
> > blkid or other application:
> > ->open()
> > ->mutex_lock(bd_mutex)
> > ->lo_open()
> > ->mutex_lock(lo_ctl_mutex)
> >
> > losetup(set fd ioctl):
> > ->mutex_lock(lo_ctl_mutex)
> > ->ioctl_by_bdev(BLKRRPART)
> > ->trylock(bd_mutex)
> >
> > This patch trys to eliminate the ABBA lock dependency by removing
> > lo_ctl_mutext in lo_open() with the following approach:
> >
> > 1) introduce lo_open_mutex to protect lo_refcnt and avoid acquiring
> > lo_ctl_mutex in lo_open():
>
> It is a bit quick since I said the lo_open_mutex can be removed,
> and Christoph agreed that too.
>
> So looks we still need to post another version, :-)
Ah. I missed that bit. Just trying to keep up momentum.
--
Jarod Wilson
jarod@redhat.com
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 2/7] block: loop: don't hold lo_ctl_mutex in lo_open
2015-04-08 13:40 ` Jarod Wilson
@ 2015-04-08 14:00 ` Jarod Wilson
2015-04-08 14:20 ` Ming Lei
0 siblings, 1 reply; 37+ messages in thread
From: Jarod Wilson @ 2015-04-08 14:00 UTC (permalink / raw)
To: Ming Lei
Cc: Linux Kernel Mailing List, Christoph Hellwig, Jens Axboe,
Tejun Heo, Alexander Viro, Markus Pargmann, Stefan Weinhuber,
Stefan Haberland, Sebastian Ott, Fabian Frederick,
David Herrmann, Mike Galbraith, Andrew Morton, Peter Zijlstra,
nbd-general, linux-s390
On Wed, Apr 08, 2015 at 09:40:34AM -0400, Jarod Wilson wrote:
> On Wed, Apr 08, 2015 at 02:50:59PM +0800, Ming Lei wrote:
> > Hi Jarod,
> >
> > On Wed, Apr 8, 2015 at 2:23 PM, Jarod Wilson <jarod@redhat.com> wrote:
> > > From: Ming Lei <ming.lei@canonical.com>
> > >
> > > The lo_ctl_mutex is held for running all ioctl handlers, and
> > > in some ioctl handlers, ioctl_by_bdev(BLKRRPART) is called for
> > > rereading partitions, which requires bd_mutex.
> > >
> > > So it is easy to cause failure because trylock(bd_mutex) may
> > > fail inside blkdev_reread_part(), and follows the lock context:
> > >
> > > blkid or other application:
> > > ->open()
> > > ->mutex_lock(bd_mutex)
> > > ->lo_open()
> > > ->mutex_lock(lo_ctl_mutex)
> > >
> > > losetup(set fd ioctl):
> > > ->mutex_lock(lo_ctl_mutex)
> > > ->ioctl_by_bdev(BLKRRPART)
> > > ->trylock(bd_mutex)
> > >
> > > This patch trys to eliminate the ABBA lock dependency by removing
> > > lo_ctl_mutext in lo_open() with the following approach:
> > >
> > > 1) introduce lo_open_mutex to protect lo_refcnt and avoid acquiring
> > > lo_ctl_mutex in lo_open():
> >
> > It is a bit quick since I said the lo_open_mutex can be removed,
> > and Christoph agreed that too.
> >
> > So looks we still need to post another version, :-)
>
> Ah. I missed that bit. Just trying to keep up momentum.
Are you working on that atomic_t version then, or should I dive into it?
--
Jarod Wilson
jarod@redhat.com
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 2/7] block: loop: don't hold lo_ctl_mutex in lo_open
2015-04-08 14:00 ` Jarod Wilson
@ 2015-04-08 14:20 ` Ming Lei
2015-04-08 15:28 ` Jarod Wilson
0 siblings, 1 reply; 37+ messages in thread
From: Ming Lei @ 2015-04-08 14:20 UTC (permalink / raw)
To: Jarod Wilson
Cc: Linux Kernel Mailing List, Christoph Hellwig, Jens Axboe,
Tejun Heo, Alexander Viro, Markus Pargmann, Stefan Weinhuber,
Stefan Haberland, Sebastian Ott, Fabian Frederick,
David Herrmann, Mike Galbraith, Andrew Morton, Peter Zijlstra,
nbd-general, linux-s390
On Wed, Apr 8, 2015 at 10:00 PM, Jarod Wilson <jarod@redhat.com> wrote:
> On Wed, Apr 08, 2015 at 09:40:34AM -0400, Jarod Wilson wrote:
>> On Wed, Apr 08, 2015 at 02:50:59PM +0800, Ming Lei wrote:
>> > Hi Jarod,
>> >
>> > On Wed, Apr 8, 2015 at 2:23 PM, Jarod Wilson <jarod@redhat.com> wrote:
>> > > From: Ming Lei <ming.lei@canonical.com>
>> > >
>> > > The lo_ctl_mutex is held for running all ioctl handlers, and
>> > > in some ioctl handlers, ioctl_by_bdev(BLKRRPART) is called for
>> > > rereading partitions, which requires bd_mutex.
>> > >
>> > > So it is easy to cause failure because trylock(bd_mutex) may
>> > > fail inside blkdev_reread_part(), and follows the lock context:
>> > >
>> > > blkid or other application:
>> > > ->open()
>> > > ->mutex_lock(bd_mutex)
>> > > ->lo_open()
>> > > ->mutex_lock(lo_ctl_mutex)
>> > >
>> > > losetup(set fd ioctl):
>> > > ->mutex_lock(lo_ctl_mutex)
>> > > ->ioctl_by_bdev(BLKRRPART)
>> > > ->trylock(bd_mutex)
>> > >
>> > > This patch trys to eliminate the ABBA lock dependency by removing
>> > > lo_ctl_mutext in lo_open() with the following approach:
>> > >
>> > > 1) introduce lo_open_mutex to protect lo_refcnt and avoid acquiring
>> > > lo_ctl_mutex in lo_open():
>> >
>> > It is a bit quick since I said the lo_open_mutex can be removed,
>> > and Christoph agreed that too.
>> >
>> > So looks we still need to post another version, :-)
>>
>> Ah. I missed that bit. Just trying to keep up momentum.
>
> Are you working on that atomic_t version then, or should I dive into it?
It has been ready, and I will post them out once it passes my tests.
Thanks,
^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 2/7] block: loop: don't hold lo_ctl_mutex in lo_open
2015-04-08 14:20 ` Ming Lei
@ 2015-04-08 15:28 ` Jarod Wilson
0 siblings, 0 replies; 37+ messages in thread
From: Jarod Wilson @ 2015-04-08 15:28 UTC (permalink / raw)
To: Ming Lei
Cc: Linux Kernel Mailing List, Christoph Hellwig, Jens Axboe,
Tejun Heo, Alexander Viro, Markus Pargmann, Stefan Weinhuber,
Stefan Haberland, Sebastian Ott, Fabian Frederick,
David Herrmann, Mike Galbraith, Andrew Morton, Peter Zijlstra,
nbd-general, linux-s390
On Wed, Apr 08, 2015 at 10:20:22PM +0800, Ming Lei wrote:
> On Wed, Apr 8, 2015 at 10:00 PM, Jarod Wilson <jarod@redhat.com> wrote:
> > On Wed, Apr 08, 2015 at 09:40:34AM -0400, Jarod Wilson wrote:
> >> On Wed, Apr 08, 2015 at 02:50:59PM +0800, Ming Lei wrote:
> >> > Hi Jarod,
> >> >
> >> > On Wed, Apr 8, 2015 at 2:23 PM, Jarod Wilson <jarod@redhat.com> wrote:
> >> > > From: Ming Lei <ming.lei@canonical.com>
> >> > >
> >> > > The lo_ctl_mutex is held for running all ioctl handlers, and
> >> > > in some ioctl handlers, ioctl_by_bdev(BLKRRPART) is called for
> >> > > rereading partitions, which requires bd_mutex.
> >> > >
> >> > > So it is easy to cause failure because trylock(bd_mutex) may
> >> > > fail inside blkdev_reread_part(), and follows the lock context:
> >> > >
> >> > > blkid or other application:
> >> > > ->open()
> >> > > ->mutex_lock(bd_mutex)
> >> > > ->lo_open()
> >> > > ->mutex_lock(lo_ctl_mutex)
> >> > >
> >> > > losetup(set fd ioctl):
> >> > > ->mutex_lock(lo_ctl_mutex)
> >> > > ->ioctl_by_bdev(BLKRRPART)
> >> > > ->trylock(bd_mutex)
> >> > >
> >> > > This patch trys to eliminate the ABBA lock dependency by removing
> >> > > lo_ctl_mutext in lo_open() with the following approach:
> >> > >
> >> > > 1) introduce lo_open_mutex to protect lo_refcnt and avoid acquiring
> >> > > lo_ctl_mutex in lo_open():
> >> >
> >> > It is a bit quick since I said the lo_open_mutex can be removed,
> >> > and Christoph agreed that too.
> >> >
> >> > So looks we still need to post another version, :-)
> >>
> >> Ah. I missed that bit. Just trying to keep up momentum.
> >
> > Are you working on that atomic_t version then, or should I dive into it?
>
> It has been ready, and I will post them out once it passes my tests.
Okay, I'll hold tight then. Sorry for the noise. :p
--
Jarod Wilson
jarod@redhat.com
^ permalink raw reply [flat|nested] 37+ messages in thread
* [PATCH 3/7] block: loop: fix another reread part failure
2015-04-08 6:23 ` [PATCH 0/7] block: reread partitions improvements Jarod Wilson
2015-04-08 6:23 ` [PATCH 1/7] block: export blkdev_reread_part() and __blkdev_reread_part() Jarod Wilson
2015-04-08 6:23 ` [PATCH 2/7] block: loop: don't hold lo_ctl_mutex in lo_open Jarod Wilson
@ 2015-04-08 6:23 ` Jarod Wilson
2015-04-08 6:23 ` [PATCH 4/7] block: nbd: convert to blkdev_reread_part() Jarod Wilson
` (3 subsequent siblings)
6 siblings, 0 replies; 37+ messages in thread
From: Jarod Wilson @ 2015-04-08 6:23 UTC (permalink / raw)
To: linux-kernel
Cc: Jarod Wilson, Christoph Hellwig, Jens Axboe, Tejun Heo,
Alexander Viro, Markus Pargmann, Stefan Weinhuber,
Stefan Haberland, Sebastian Ott, Fabian Frederick, Ming Lei,
David Herrmann, Mike Galbraith, Andrew Morton, Peter Zijlstra,
nbd-general, linux-s390
loop_clr_fd() can be run piggyback with lo_release(), and
under this situation, reread partition may always fail because
of bd_mutex which has been held in release path.
This patch detects the situation by the reference count, and
call blkdev_reread_part_nolock() to avoid acquiring the lock again.
In the meantime, this patch switches to new kernel APIs
of blkdev_reread_part() and blkdev_reread_part_nolock().
[jarod: this is a merger of my original patch, Ming's patch, and some
reworking for my reworked first patch in the series.]
CC: Christoph Hellwig <hch@infradead.org>
CC: Jens Axboe <axboe@kernel.dk>
CC: Tejun Heo <tj@kernel.org>
CC: Alexander Viro <viro@zeniv.linux.org.uk>
CC: Markus Pargmann <mpa@pengutronix.de>
CC: Stefan Weinhuber <wein@de.ibm.com>
CC: Stefan Haberland <stefan.haberland@de.ibm.com>
CC: Sebastian Ott <sebott@linux.vnet.ibm.com>
CC: Fabian Frederick <fabf@skynet.be>
CC: Ming Lei <ming.lei@canonical.com>
CC: David Herrmann <dh.herrmann@gmail.com>
CC: Mike Galbraith <bitbucket@online.de>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Peter Zijlstra <peterz@infradead.org>
CC: nbd-general@lists.sourceforge.net
CC: linux-s390@vger.kernel.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
---
drivers/block/loop.c | 28 ++++++++++++++++++++++++----
1 file changed, 24 insertions(+), 4 deletions(-)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 81a6bc1..ab5c678 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -528,6 +528,26 @@ static int loop_flush(struct loop_device *lo)
return loop_switch(lo, NULL);
}
+static void loop_reread_partitions(struct loop_device *lo,
+ struct block_device *bdev)
+{
+ int rc;
+ bool in_release;
+
+ mutex_lock(&lo->lo_open_mutex);
+ in_release = lo->lo_refcnt == 0;
+ mutex_unlock(&lo->lo_open_mutex);
+
+ /* bd_mutex has been held already in release path */
+ if (in_release)
+ rc = __blkdev_reread_part(bdev);
+ else
+ rc = blkdev_reread_part(bdev);
+ if (rc)
+ pr_warn("%s: partition scan of loop%d (%s) failed (rc=%d)\n",
+ __func__, lo->lo_number, lo->lo_file_name, rc);
+}
+
/*
* loop_change_fd switched the backing store of a loopback device to
* a new file. This is useful for operating system installers to free up
@@ -576,7 +596,7 @@ static int loop_change_fd(struct loop_device *lo, struct block_device *bdev,
fput(old_file);
if (lo->lo_flags & LO_FLAGS_PARTSCAN)
- ioctl_by_bdev(bdev, BLKRRPART, 0);
+ loop_reread_partitions(lo, bdev);
return 0;
out_putf:
@@ -807,7 +827,7 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
if (part_shift)
lo->lo_flags |= LO_FLAGS_PARTSCAN;
if (lo->lo_flags & LO_FLAGS_PARTSCAN)
- ioctl_by_bdev(bdev, BLKRRPART, 0);
+ loop_reread_partitions(lo, bdev);
/* Grab the block_device to prevent its destruction after we
* put /dev/loopXX inode. Later in loop_clr_fd() we bdput(bdev).
@@ -933,7 +953,7 @@ static int loop_clr_fd(struct loop_device *lo)
mutex_unlock(&lo->lo_open_mutex);
if (lo->lo_flags & LO_FLAGS_PARTSCAN && bdev)
- ioctl_by_bdev(bdev, BLKRRPART, 0);
+ loop_reread_partitions(lo, bdev);
lo->lo_flags = 0;
if (!part_shift)
lo->lo_disk->flags |= GENHD_FL_NO_PART_SCAN;
@@ -1008,7 +1028,7 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
!(lo->lo_flags & LO_FLAGS_PARTSCAN)) {
lo->lo_flags |= LO_FLAGS_PARTSCAN;
lo->lo_disk->flags &= ~GENHD_FL_NO_PART_SCAN;
- ioctl_by_bdev(lo->lo_device, BLKRRPART, 0);
+ loop_reread_partitions(lo, lo->lo_device);
}
lo->lo_encrypt_key_size = info->lo_encrypt_key_size;
--
1.8.3.1
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PATCH 4/7] block: nbd: convert to blkdev_reread_part()
2015-04-08 6:23 ` [PATCH 0/7] block: reread partitions improvements Jarod Wilson
` (2 preceding siblings ...)
2015-04-08 6:23 ` [PATCH 3/7] block: loop: fix another reread part failure Jarod Wilson
@ 2015-04-08 6:23 ` Jarod Wilson
2015-04-08 6:23 ` [PATCH 5/7] block: dasd_genhd: convert to blkdev_reread_part Jarod Wilson
` (2 subsequent siblings)
6 siblings, 0 replies; 37+ messages in thread
From: Jarod Wilson @ 2015-04-08 6:23 UTC (permalink / raw)
To: linux-kernel
Cc: Ming Lei, Christoph Hellwig, Jens Axboe, Tejun Heo,
Alexander Viro, Markus Pargmann, Stefan Weinhuber,
Stefan Haberland, Sebastian Ott, Fabian Frederick,
David Herrmann, Mike Galbraith, Andrew Morton, Peter Zijlstra,
nbd-general, linux-s390, Jarod Wilson
From: Ming Lei <ming.lei@canonical.com>
CC: Christoph Hellwig <hch@infradead.org>
CC: Jens Axboe <axboe@kernel.dk>
CC: Tejun Heo <tj@kernel.org>
CC: Alexander Viro <viro@zeniv.linux.org.uk>
CC: Markus Pargmann <mpa@pengutronix.de>
CC: Stefan Weinhuber <wein@de.ibm.com>
CC: Stefan Haberland <stefan.haberland@de.ibm.com>
CC: Sebastian Ott <sebott@linux.vnet.ibm.com>
CC: Fabian Frederick <fabf@skynet.be>
CC: Ming Lei <ming.lei@canonical.com>
CC: David Herrmann <dh.herrmann@gmail.com>
CC: Mike Galbraith <bitbucket@online.de>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Peter Zijlstra <peterz@infradead.org>
CC: nbd-general@lists.sourceforge.net
CC: linux-s390@vger.kernel.org
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
---
drivers/block/nbd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index a98c41f..7e9d26e 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -734,7 +734,7 @@ static int __nbd_ioctl(struct block_device *bdev, struct nbd_device *nbd,
bdev->bd_inode->i_size = 0;
set_capacity(nbd->disk, 0);
if (max_part > 0)
- ioctl_by_bdev(bdev, BLKRRPART, 0);
+ blkdev_reread_part(bdev);
if (nbd->disconnect) /* user requested, ignore socket errors */
return 0;
return nbd->harderror;
--
1.8.3.1
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PATCH 5/7] block: dasd_genhd: convert to blkdev_reread_part
2015-04-08 6:23 ` [PATCH 0/7] block: reread partitions improvements Jarod Wilson
` (3 preceding siblings ...)
2015-04-08 6:23 ` [PATCH 4/7] block: nbd: convert to blkdev_reread_part() Jarod Wilson
@ 2015-04-08 6:23 ` Jarod Wilson
2015-04-08 6:23 ` [PATCH 6/7] block: replace trylock with mutex_lock in blkdev_reread_part() Jarod Wilson
2015-04-08 6:23 ` [PATCH 7/7] s390/block/dasd: remove obsolete while -EBUSY loop Jarod Wilson
6 siblings, 0 replies; 37+ messages in thread
From: Jarod Wilson @ 2015-04-08 6:23 UTC (permalink / raw)
To: linux-kernel
Cc: Ming Lei, Christoph Hellwig, Jens Axboe, Tejun Heo,
Alexander Viro, Markus Pargmann, Stefan Weinhuber,
Stefan Haberland, Sebastian Ott, Fabian Frederick,
David Herrmann, Mike Galbraith, Andrew Morton, Peter Zijlstra,
nbd-general, linux-s390, Jarod Wilson
From: Ming Lei <ming.lei@canonical.com>
Also remove the obsolete comment.
CC: Christoph Hellwig <hch@infradead.org>
CC: Jens Axboe <axboe@kernel.dk>
CC: Tejun Heo <tj@kernel.org>
CC: Alexander Viro <viro@zeniv.linux.org.uk>
CC: Markus Pargmann <mpa@pengutronix.de>
CC: Stefan Weinhuber <wein@de.ibm.com>
CC: Stefan Haberland <stefan.haberland@de.ibm.com>
CC: Sebastian Ott <sebott@linux.vnet.ibm.com>
CC: Fabian Frederick <fabf@skynet.be>
CC: Ming Lei <ming.lei@canonical.com>
CC: David Herrmann <dh.herrmann@gmail.com>
CC: Mike Galbraith <bitbucket@online.de>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Peter Zijlstra <peterz@infradead.org>
CC: nbd-general@lists.sourceforge.net
CC: linux-s390@vger.kernel.org
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
---
drivers/s390/block/dasd_genhd.c | 11 ++++-------
1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/drivers/s390/block/dasd_genhd.c b/drivers/s390/block/dasd_genhd.c
index 90f39f7..8026585 100644
--- a/drivers/s390/block/dasd_genhd.c
+++ b/drivers/s390/block/dasd_genhd.c
@@ -116,14 +116,11 @@ int dasd_scan_partitions(struct dasd_block *block)
rc);
return -ENODEV;
}
- /*
- * See fs/partition/check.c:register_disk,rescan_partitions
- * Can't call rescan_partitions directly. Use ioctl.
- */
- rc = ioctl_by_bdev(bdev, BLKRRPART, 0);
+
+ rc = blkdev_reread_part(bdev);
while (rc == -EBUSY && retry > 0) {
schedule();
- rc = ioctl_by_bdev(bdev, BLKRRPART, 0);
+ rc = blkdev_reread_part(bdev);
retry--;
DBF_DEV_EVENT(DBF_ERR, block->base,
"scan partitions error, retry %d rc %d",
@@ -138,7 +135,7 @@ int dasd_scan_partitions(struct dasd_block *block)
* dasd_generic_set_offline). As long as the partition
* detection is running no offline should be allowed. That
* is why the assignment to device->bdev is done AFTER
- * the BLKRRPART ioctl.
+ * the blkdev_reread_part() call.
*/
block->bdev = bdev;
return 0;
--
1.8.3.1
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PATCH 6/7] block: replace trylock with mutex_lock in blkdev_reread_part()
2015-04-08 6:23 ` [PATCH 0/7] block: reread partitions improvements Jarod Wilson
` (4 preceding siblings ...)
2015-04-08 6:23 ` [PATCH 5/7] block: dasd_genhd: convert to blkdev_reread_part Jarod Wilson
@ 2015-04-08 6:23 ` Jarod Wilson
2015-04-08 6:23 ` [PATCH 7/7] s390/block/dasd: remove obsolete while -EBUSY loop Jarod Wilson
6 siblings, 0 replies; 37+ messages in thread
From: Jarod Wilson @ 2015-04-08 6:23 UTC (permalink / raw)
To: linux-kernel
Cc: Ming Lei, Christoph Hellwig, Jens Axboe, Tejun Heo,
Alexander Viro, Markus Pargmann, Stefan Weinhuber,
Stefan Haberland, Sebastian Ott, Fabian Frederick,
David Herrmann, Mike Galbraith, Andrew Morton, Peter Zijlstra,
nbd-general, linux-s390, Jarod Wilson
From: Ming Lei <ming.lei@canonical.com>
The only possible problem of using mutex_lock() instead of trylock
is about deadlock.
If there aren't any locks held before calling blkdev_reread_part(lock),
deadlock can't be caused by this conversion.
If there are locks held before calling blkdev_reread_part(lock),
and if these locks arn't required in open, close handler and I/O
path, deadlock shouldn't be caused too.
Both user space's ioctl(BLKRRPART) and md_setup_drive() from
init/do_mounts_md.c belongs to the 1st case, so the conversion is safe
for the two cases.
For loop, the previous patches in this pathset has fixed the ABBA lock
dependency, so the conversion is OK.
For nbd, tx_lock is held when calling the function:
- both open and release won't hold the lock
- when blkdev_reread_part() is run, I/O thread has been stopped
already, so tx_lock won't be acquired in I/O path at that time.
- so the conversion won't cause deadlock for nbd
For dasd, both dasd_open(), dasd_release() and request function don't
acquire any mutex/semphone, so the conversion should be safe.
[jarod: update for modifications earlier in series.]
CC: Christoph Hellwig <hch@infradead.org>
CC: Jens Axboe <axboe@kernel.dk>
CC: Tejun Heo <tj@kernel.org>
CC: Alexander Viro <viro@zeniv.linux.org.uk>
CC: Markus Pargmann <mpa@pengutronix.de>
CC: Stefan Weinhuber <wein@de.ibm.com>
CC: Stefan Haberland <stefan.haberland@de.ibm.com>
CC: Sebastian Ott <sebott@linux.vnet.ibm.com>
CC: Fabian Frederick <fabf@skynet.be>
CC: Ming Lei <ming.lei@canonical.com>
CC: David Herrmann <dh.herrmann@gmail.com>
CC: Mike Galbraith <bitbucket@online.de>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Peter Zijlstra <peterz@infradead.org>
CC: nbd-general@lists.sourceforge.net
CC: linux-s390@vger.kernel.org
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
---
block/ioctl.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/block/ioctl.c b/block/ioctl.c
index 64a4fcb..47c8e6d 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -170,15 +170,19 @@ EXPORT_SYMBOL(__blkdev_reread_part);
/*
* This is an exported API for the block driver, and will
- * acquire bd_mutex. Make sure you aren't calling it with
- * bd_mutex already held, or we'll return -EBUSY.
+ * acquire bd_mutex.
+ *
+ * Make sure held locks aren't required in open()/close()
+ * handlers and I/O paths to avoid an ABBA deadlock:
+ * - bd_mutex is held before calling a block driver's open/close
+ * handler
+ * - reading a partition table may submit I/O to the block device
*/
int blkdev_reread_part(struct block_device *bdev)
{
int res;
- if (!mutex_trylock(&bdev->bd_mutex))
- return -EBUSY;
+ mutex_lock(&bdev->bd_mutex);
res = __blkdev_reread_part(bdev);
mutex_unlock(&bdev->bd_mutex);
--
1.8.3.1
^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PATCH 7/7] s390/block/dasd: remove obsolete while -EBUSY loop
2015-04-08 6:23 ` [PATCH 0/7] block: reread partitions improvements Jarod Wilson
` (5 preceding siblings ...)
2015-04-08 6:23 ` [PATCH 6/7] block: replace trylock with mutex_lock in blkdev_reread_part() Jarod Wilson
@ 2015-04-08 6:23 ` Jarod Wilson
6 siblings, 0 replies; 37+ messages in thread
From: Jarod Wilson @ 2015-04-08 6:23 UTC (permalink / raw)
To: linux-kernel
Cc: Jarod Wilson, Christoph Hellwig, Jens Axboe, Tejun Heo,
Alexander Viro, Markus Pargmann, Stefan Weinhuber,
Stefan Haberland, Sebastian Ott, Fabian Frederick, Ming Lei,
David Herrmann, Mike Galbraith, Andrew Morton, Peter Zijlstra,
nbd-general, linux-s390
With the mutex_trylock bit gone from blkdev_reread_part(), the retry logic
in dasd_scan_partitions() shouldn't be necessary.
CC: Christoph Hellwig <hch@infradead.org>
CC: Jens Axboe <axboe@kernel.dk>
CC: Tejun Heo <tj@kernel.org>
CC: Alexander Viro <viro@zeniv.linux.org.uk>
CC: Markus Pargmann <mpa@pengutronix.de>
CC: Stefan Weinhuber <wein@de.ibm.com>
CC: Stefan Haberland <stefan.haberland@de.ibm.com>
CC: Sebastian Ott <sebott@linux.vnet.ibm.com>
CC: Fabian Frederick <fabf@skynet.be>
CC: Ming Lei <ming.lei@canonical.com>
CC: David Herrmann <dh.herrmann@gmail.com>
CC: Mike Galbraith <bitbucket@online.de>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Peter Zijlstra <peterz@infradead.org>
CC: nbd-general@lists.sourceforge.net
CC: linux-s390@vger.kernel.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
---
drivers/s390/block/dasd_genhd.c | 13 +++----------
1 file changed, 3 insertions(+), 10 deletions(-)
diff --git a/drivers/s390/block/dasd_genhd.c b/drivers/s390/block/dasd_genhd.c
index 8026585..783d826 100644
--- a/drivers/s390/block/dasd_genhd.c
+++ b/drivers/s390/block/dasd_genhd.c
@@ -99,9 +99,8 @@ void dasd_gendisk_free(struct dasd_block *block)
int dasd_scan_partitions(struct dasd_block *block)
{
struct block_device *bdev;
- int retry, rc;
+ int rc;
- retry = 5;
bdev = bdget_disk(block->gdp, 0);
if (!bdev) {
DBF_DEV_EVENT(DBF_ERR, block->base, "%s",
@@ -118,14 +117,8 @@ int dasd_scan_partitions(struct dasd_block *block)
}
rc = blkdev_reread_part(bdev);
- while (rc == -EBUSY && retry > 0) {
- schedule();
- rc = blkdev_reread_part(bdev);
- retry--;
- DBF_DEV_EVENT(DBF_ERR, block->base,
- "scan partitions error, retry %d rc %d",
- retry, rc);
- }
+ DBF_DEV_EVENT(DBF_ERR, block->base,
+ "scan partitions error, rc %d", rc);
/*
* Since the matching blkdev_put call to the blkdev_get in
--
1.8.3.1
^ permalink raw reply related [flat|nested] 37+ messages in thread