* [PATCH v3 0/1] scale loop device lock
@ 2021-01-25 20:11 Pavel Tatashin
  2021-01-25 20:11 ` [PATCH v3 1/1] loop: scale loop device by introducing per " Pavel Tatashin
  0 siblings, 1 reply; 6+ messages in thread
From: Pavel Tatashin @ 2021-01-25 20:11 UTC
  To: pasha.tatashin, tyhicks, axboe, linux-block, linux-kernel,
	sashal, jmorris, lukas.bulwahn, hch, pvorel, ming.lei, mzxreary,
	mcgrof, zhengbin13, maco, colin.king, evgreen

Changelog
v3
  - Added Reviewed-by from Tyler Hicks
  - Synced with mainline
v2
  - Addressed Tyler Hicks' comments:
    - added mutex_destroy()
    - added a comment in lo_open()
    - added locking around lo_disk
===

In our environment we use systemd portable containers in squashfs
format: each image is attached to a loop device and mounted.

NAME                      MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
loop5                       7:5    0  76.4M  0 loop
`-BaseImageM1908          252:3    0  76.4M  1 crypt /BaseImageM1908
loop6                       7:6    0    20K  0 loop
`-test_launchperf20       252:17   0   1.3M  1 crypt /app/test_launchperf20
loop7                       7:7    0    20K  0 loop
`-test_launchperf18       252:4    0   1.5M  1 crypt /app/test_launchperf18
loop8                       7:8    0     8K  0 loop
`-test_launchperf8        252:25   0    28K  1 crypt app/test_launchperf8
loop9                       7:9    0   376K  0 loop
`-test_launchperf14       252:29   0  45.7M  1 crypt /app/test_launchperf14
loop10                      7:10   0    16K  0 loop
`-test_launchperf4        252:11   0   968K  1 crypt app/test_launchperf4
loop11                      7:11   0   1.2M  0 loop
`-test_launchperf17       252:26   0 150.4M  1 crypt /app/test_launchperf17
loop12                      7:12   0    36K  0 loop
`-test_launchperf19       252:13   0   3.3M  1 crypt /app/test_launchperf19
loop13                      7:13   0     8K  0 loop
...

We have over 50 loop devices that are mounted during boot.

We observed contention on loop_ctl_mutex.

Sample contention stacks:

Contention 1:
__blkdev_get()
   bdev->bd_disk->fops->open()
      lo_open()
         mutex_lock_killable(&loop_ctl_mutex); <- contention

Contention 2:
__blkdev_put()
   disk->fops->release()
      lo_release()
         mutex_lock(&loop_ctl_mutex); <- contention

The total time spent waiting for loop_ctl_mutex during boot is ~18.8s
across 8 CPUs on our machine (69 loop devices), i.e. ~2.35s per CPU.

Scaling this lock eliminates the contention entirely and improves boot
time by 2s on our machine.
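
The shape of the change, as a minimal sketch (lo_mutex, loop_ctl_mutex
and mutex_lock_killable() match the patch below; the struct layout and
the example_op() helper are illustrative only):

    #include <linux/mutex.h>

    /*
     * Global lock: after the patch it only guards global data such as
     * loop_index_idr and the loop_lookup()/loop_add() paths.
     */
    static DEFINE_MUTEX(loop_ctl_mutex);

    struct loop_device {
            int             lo_number;
            struct mutex    lo_mutex;       /* guards this device's fields */
            /* ... remaining fields elided ... */
    };

    /*
     * A per-device operation now serializes only against users of the
     * same device, not against every loop device in the system.
     */
    static int example_op(struct loop_device *lo)
    {
            int err = mutex_lock_killable(&lo->lo_mutex);

            if (err)
                    return err;
            /* ... operate on lo's fields ... */
            mutex_unlock(&lo->lo_mutex);
            return 0;
    }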

v2 https://lore.kernel.org/lkml/20200723211748.13139-1-pasha.tatashin@soleen.com
v1 https://lore.kernel.org/lkml/20200717205322.127694-1-pasha.tatashin@soleen.com

Pavel Tatashin (1):
  loop: scale loop device by introducing per device lock

 drivers/block/loop.c | 92 +++++++++++++++++++++++++-------------------
 drivers/block/loop.h |  1 +
 2 files changed, 54 insertions(+), 39 deletions(-)

-- 
2.25.1


* [PATCH v3 1/1] loop: scale loop device by introducing per device lock
  2021-01-25 20:11 [PATCH v3 0/1] scale loop device lock Pavel Tatashin
@ 2021-01-25 20:11 ` Pavel Tatashin
  2021-01-26  9:24   ` Petr Vorel
  2021-01-26  9:53   ` Chaitanya Kulkarni
  0 siblings, 2 replies; 6+ messages in thread
From: Pavel Tatashin @ 2021-01-25 20:11 UTC
  To: pasha.tatashin, tyhicks, axboe, linux-block, linux-kernel,
	sashal, jmorris, lukas.bulwahn, hch, pvorel, ming.lei, mzxreary,
	mcgrof, zhengbin13, maco, colin.king, evgreen

Currently, loop device has only one global lock:
loop_ctl_mutex.

This becomes hot in scenarios where many loop devices are used.

Scale it by introducing per-device lock: lo_mutex that protects the
fields in struct loop_device. Keep loop_ctl_mutex to protect global
data such as loop_index_idr, loop_lookup, loop_add.

Lock ordering: loop_ctl_mutex > lo_mutex.
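
For example, in the lo_open() path (sketched from the hunk below),
loop_ctl_mutex is taken first and dropped as soon as the per-device
lock is held:

    err = mutex_lock_killable(&loop_ctl_mutex);     /* outer, global */
    if (err)
            return err;
    lo = bdev->bd_disk->private_data;
    err = mutex_lock_killable(&lo->lo_mutex);       /* inner, per device */
    mutex_unlock(&loop_ctl_mutex);                  /* drop outer early */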

Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com>
---
 drivers/block/loop.c | 92 +++++++++++++++++++++++++-------------------
 drivers/block/loop.h |  1 +
 2 files changed, 54 insertions(+), 39 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index e5ff328f0917..e520aaf94f93 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -704,7 +704,7 @@ static int loop_change_fd(struct loop_device *lo, struct block_device *bdev,
 	int		error;
 	bool		partscan;
 
-	error = mutex_lock_killable(&loop_ctl_mutex);
+	error = mutex_lock_killable(&lo->lo_mutex);
 	if (error)
 		return error;
 	error = -ENXIO;
@@ -743,9 +743,9 @@ static int loop_change_fd(struct loop_device *lo, struct block_device *bdev,
 	loop_update_dio(lo);
 	blk_mq_unfreeze_queue(lo->lo_queue);
 	partscan = lo->lo_flags & LO_FLAGS_PARTSCAN;
-	mutex_unlock(&loop_ctl_mutex);
+	mutex_unlock(&lo->lo_mutex);
 	/*
-	 * We must drop file reference outside of loop_ctl_mutex as dropping
+	 * We must drop file reference outside of lo_mutex as dropping
 	 * the file ref can take bd_mutex which creates circular locking
 	 * dependency.
 	 */
@@ -755,7 +755,7 @@ static int loop_change_fd(struct loop_device *lo, struct block_device *bdev,
 	return 0;
 
 out_err:
-	mutex_unlock(&loop_ctl_mutex);
+	mutex_unlock(&lo->lo_mutex);
 	if (file)
 		fput(file);
 	return error;
@@ -1092,7 +1092,7 @@ static int loop_configure(struct loop_device *lo, fmode_t mode,
 			goto out_putf;
 	}
 
-	error = mutex_lock_killable(&loop_ctl_mutex);
+	error = mutex_lock_killable(&lo->lo_mutex);
 	if (error)
 		goto out_bdev;
 
@@ -1171,7 +1171,7 @@ static int loop_configure(struct loop_device *lo, fmode_t mode,
 	 * put /dev/loopXX inode. Later in __loop_clr_fd() we bdput(bdev).
 	 */
 	bdgrab(bdev);
-	mutex_unlock(&loop_ctl_mutex);
+	mutex_unlock(&lo->lo_mutex);
 	if (partscan)
 		loop_reread_partitions(lo, bdev);
 	if (!(mode & FMODE_EXCL))
@@ -1179,7 +1179,7 @@ static int loop_configure(struct loop_device *lo, fmode_t mode,
 	return 0;
 
 out_unlock:
-	mutex_unlock(&loop_ctl_mutex);
+	mutex_unlock(&lo->lo_mutex);
 out_bdev:
 	if (!(mode & FMODE_EXCL))
 		bd_abort_claiming(bdev, loop_configure);
@@ -1200,7 +1200,7 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
 	bool partscan = false;
 	int lo_number;
 
-	mutex_lock(&loop_ctl_mutex);
+	mutex_lock(&lo->lo_mutex);
 	if (WARN_ON_ONCE(lo->lo_state != Lo_rundown)) {
 		err = -ENXIO;
 		goto out_unlock;
@@ -1253,7 +1253,7 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
 	lo_number = lo->lo_number;
 	loop_unprepare_queue(lo);
 out_unlock:
-	mutex_unlock(&loop_ctl_mutex);
+	mutex_unlock(&lo->lo_mutex);
 	if (partscan) {
 		/*
 		 * bd_mutex has been held already in release path, so don't
@@ -1284,18 +1284,18 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
 	 * protects us from all the other places trying to change the 'lo'
 	 * device.
 	 */
-	mutex_lock(&loop_ctl_mutex);
+	mutex_lock(&lo->lo_mutex);
 	lo->lo_flags = 0;
 	if (!part_shift)
 		lo->lo_disk->flags |= GENHD_FL_NO_PART_SCAN;
 	lo->lo_state = Lo_unbound;
-	mutex_unlock(&loop_ctl_mutex);
+	mutex_unlock(&lo->lo_mutex);
 
 	/*
-	 * Need not hold loop_ctl_mutex to fput backing file.
-	 * Calling fput holding loop_ctl_mutex triggers a circular
+	 * Need not hold lo_mutex to fput backing file.
+	 * Calling fput holding lo_mutex triggers a circular
 	 * lock dependency possibility warning as fput can take
-	 * bd_mutex which is usually taken before loop_ctl_mutex.
+	 * bd_mutex which is usually taken before lo_mutex.
 	 */
 	if (filp)
 		fput(filp);
@@ -1306,11 +1306,11 @@ static int loop_clr_fd(struct loop_device *lo)
 {
 	int err;
 
-	err = mutex_lock_killable(&loop_ctl_mutex);
+	err = mutex_lock_killable(&lo->lo_mutex);
 	if (err)
 		return err;
 	if (lo->lo_state != Lo_bound) {
-		mutex_unlock(&loop_ctl_mutex);
+		mutex_unlock(&lo->lo_mutex);
 		return -ENXIO;
 	}
 	/*
@@ -1325,11 +1325,11 @@ static int loop_clr_fd(struct loop_device *lo)
 	 */
 	if (atomic_read(&lo->lo_refcnt) > 1) {
 		lo->lo_flags |= LO_FLAGS_AUTOCLEAR;
-		mutex_unlock(&loop_ctl_mutex);
+		mutex_unlock(&lo->lo_mutex);
 		return 0;
 	}
 	lo->lo_state = Lo_rundown;
-	mutex_unlock(&loop_ctl_mutex);
+	mutex_unlock(&lo->lo_mutex);
 
 	return __loop_clr_fd(lo, false);
 }
@@ -1344,7 +1344,7 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
 	bool partscan = false;
 	bool size_changed = false;
 
-	err = mutex_lock_killable(&loop_ctl_mutex);
+	err = mutex_lock_killable(&lo->lo_mutex);
 	if (err)
 		return err;
 	if (lo->lo_encrypt_key_size &&
@@ -1411,7 +1411,7 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
 		partscan = true;
 	}
 out_unlock:
-	mutex_unlock(&loop_ctl_mutex);
+	mutex_unlock(&lo->lo_mutex);
 	if (partscan)
 		loop_reread_partitions(lo, bdev);
 
@@ -1425,11 +1425,11 @@ loop_get_status(struct loop_device *lo, struct loop_info64 *info)
 	struct kstat stat;
 	int ret;
 
-	ret = mutex_lock_killable(&loop_ctl_mutex);
+	ret = mutex_lock_killable(&lo->lo_mutex);
 	if (ret)
 		return ret;
 	if (lo->lo_state != Lo_bound) {
-		mutex_unlock(&loop_ctl_mutex);
+		mutex_unlock(&lo->lo_mutex);
 		return -ENXIO;
 	}
 
@@ -1448,10 +1448,10 @@ loop_get_status(struct loop_device *lo, struct loop_info64 *info)
 		       lo->lo_encrypt_key_size);
 	}
 
-	/* Drop loop_ctl_mutex while we call into the filesystem. */
+	/* Drop lo_mutex while we call into the filesystem. */
 	path = lo->lo_backing_file->f_path;
 	path_get(&path);
-	mutex_unlock(&loop_ctl_mutex);
+	mutex_unlock(&lo->lo_mutex);
 	ret = vfs_getattr(&path, &stat, STATX_INO, AT_STATX_SYNC_AS_STAT);
 	if (!ret) {
 		info->lo_device = huge_encode_dev(stat.dev);
@@ -1637,7 +1637,7 @@ static int lo_simple_ioctl(struct loop_device *lo, unsigned int cmd,
 {
 	int err;
 
-	err = mutex_lock_killable(&loop_ctl_mutex);
+	err = mutex_lock_killable(&lo->lo_mutex);
 	if (err)
 		return err;
 	switch (cmd) {
@@ -1653,7 +1653,7 @@ static int lo_simple_ioctl(struct loop_device *lo, unsigned int cmd,
 	default:
 		err = lo->ioctl ? lo->ioctl(lo, cmd, arg) : -EINVAL;
 	}
-	mutex_unlock(&loop_ctl_mutex);
+	mutex_unlock(&lo->lo_mutex);
 	return err;
 }
 
@@ -1879,27 +1879,33 @@ static int lo_open(struct block_device *bdev, fmode_t mode)
 	struct loop_device *lo;
 	int err;
 
+	/*
+	 * take loop_ctl_mutex to protect lo pointer from race with
+	 * loop_control_ioctl(LOOP_CTL_REMOVE), however, to reduce
+	 * contention release it prior to updating lo->lo_refcnt.
+	 */
 	err = mutex_lock_killable(&loop_ctl_mutex);
 	if (err)
 		return err;
 	lo = bdev->bd_disk->private_data;
 	if (!lo) {
-		err = -ENXIO;
-		goto out;
+		mutex_unlock(&loop_ctl_mutex);
+		return -ENXIO;
 	}
-
-	atomic_inc(&lo->lo_refcnt);
-out:
+	err = mutex_lock_killable(&lo->lo_mutex);
 	mutex_unlock(&loop_ctl_mutex);
-	return err;
+	if (err)
+		return err;
+	atomic_inc(&lo->lo_refcnt);
+	mutex_unlock(&lo->lo_mutex);
+	return 0;
 }
 
 static void lo_release(struct gendisk *disk, fmode_t mode)
 {
-	struct loop_device *lo;
+	struct loop_device *lo = disk->private_data;
 
-	mutex_lock(&loop_ctl_mutex);
-	lo = disk->private_data;
+	mutex_lock(&lo->lo_mutex);
 	if (atomic_dec_return(&lo->lo_refcnt))
 		goto out_unlock;
 
@@ -1907,7 +1913,7 @@ static void lo_release(struct gendisk *disk, fmode_t mode)
 		if (lo->lo_state != Lo_bound)
 			goto out_unlock;
 		lo->lo_state = Lo_rundown;
-		mutex_unlock(&loop_ctl_mutex);
+		mutex_unlock(&lo->lo_mutex);
 		/*
 		 * In autoclear mode, stop the loop thread
 		 * and remove configuration after last close.
@@ -1924,7 +1930,7 @@ static void lo_release(struct gendisk *disk, fmode_t mode)
 	}
 
 out_unlock:
-	mutex_unlock(&loop_ctl_mutex);
+	mutex_unlock(&lo->lo_mutex);
 }
 
 static const struct block_device_operations lo_fops = {
@@ -1963,10 +1969,10 @@ static int unregister_transfer_cb(int id, void *ptr, void *data)
 	struct loop_device *lo = ptr;
 	struct loop_func_table *xfer = data;
 
-	mutex_lock(&loop_ctl_mutex);
+	mutex_lock(&lo->lo_mutex);
 	if (lo->lo_encryption == xfer)
 		loop_release_xfer(lo);
-	mutex_unlock(&loop_ctl_mutex);
+	mutex_unlock(&lo->lo_mutex);
 	return 0;
 }
 
@@ -2152,6 +2158,7 @@ static int loop_add(struct loop_device **l, int i)
 		disk->flags |= GENHD_FL_NO_PART_SCAN;
 	disk->flags |= GENHD_FL_EXT_DEVT;
 	atomic_set(&lo->lo_refcnt, 0);
+	mutex_init(&lo->lo_mutex);
 	lo->lo_number		= i;
 	spin_lock_init(&lo->lo_lock);
 	disk->major		= LOOP_MAJOR;
@@ -2182,6 +2189,7 @@ static void loop_remove(struct loop_device *lo)
 	blk_cleanup_queue(lo->lo_queue);
 	blk_mq_free_tag_set(&lo->tag_set);
 	put_disk(lo->lo_disk);
+	mutex_destroy(&lo->lo_mutex);
 	kfree(lo);
 }
 
@@ -2261,15 +2269,21 @@ static long loop_control_ioctl(struct file *file, unsigned int cmd,
 		ret = loop_lookup(&lo, parm);
 		if (ret < 0)
 			break;
+		ret = mutex_lock_killable(&lo->lo_mutex);
+		if (ret)
+			break;
 		if (lo->lo_state != Lo_unbound) {
 			ret = -EBUSY;
+			mutex_unlock(&lo->lo_mutex);
 			break;
 		}
 		if (atomic_read(&lo->lo_refcnt) > 0) {
 			ret = -EBUSY;
+			mutex_unlock(&lo->lo_mutex);
 			break;
 		}
 		lo->lo_disk->private_data = NULL;
+		mutex_unlock(&lo->lo_mutex);
 		idr_remove(&loop_index_idr, lo->lo_number);
 		loop_remove(lo);
 		break;
diff --git a/drivers/block/loop.h b/drivers/block/loop.h
index af75a5ee4094..a3c04f310672 100644
--- a/drivers/block/loop.h
+++ b/drivers/block/loop.h
@@ -62,6 +62,7 @@ struct loop_device {
 	struct request_queue	*lo_queue;
 	struct blk_mq_tag_set	tag_set;
 	struct gendisk		*lo_disk;
+	struct mutex		lo_mutex;
 };
 
 struct loop_cmd {
-- 
2.25.1


* Re: [PATCH v3 1/1] loop: scale loop device by introducing per device lock
  2021-01-25 20:11 ` [PATCH v3 1/1] loop: scale loop device by introducing per " Pavel Tatashin
@ 2021-01-26  9:24   ` Petr Vorel
  2021-01-26 14:22     ` Pavel Tatashin
  2021-01-26  9:53   ` Chaitanya Kulkarni
  1 sibling, 1 reply; 6+ messages in thread
From: Petr Vorel @ 2021-01-26  9:24 UTC
  To: Pavel Tatashin
  Cc: tyhicks, axboe, linux-block, linux-kernel, sashal, jmorris,
	lukas.bulwahn, hch, ming.lei, mzxreary, mcgrof, zhengbin13, maco,
	colin.king, evgreen

Hi,

> Currently, loop device has only one global lock:
> loop_ctl_mutex.

> This becomes hot in scenarios where many loop devices are used.

> Scale it by introducing per-device lock: lo_mutex that protects the
> fields in struct loop_device. Keep loop_ctl_mutex to protect global
> data such as loop_index_idr, loop_lookup, loop_add.

> Lock ordering: loop_ctl_mutex > lo_mutex.

Reviewed-by: Petr Vorel <pvorel@suse.cz>

Kind regards,
Petr

* Re: [PATCH v3 1/1] loop: scale loop device by introducing per device lock
  2021-01-25 20:11 ` [PATCH v3 1/1] loop: scale loop device by introducing per " Pavel Tatashin
  2021-01-26  9:24   ` Petr Vorel
@ 2021-01-26  9:53   ` Chaitanya Kulkarni
  2021-01-26 14:29     ` Pavel Tatashin
  1 sibling, 1 reply; 6+ messages in thread
From: Chaitanya Kulkarni @ 2021-01-26  9:53 UTC
  To: Pavel Tatashin, tyhicks, axboe, linux-block, linux-kernel,
	sashal, jmorris, lukas.bulwahn, hch, pvorel, ming.lei, mzxreary,
	mcgrof, zhengbin13, maco, colin.king, evgreen

On 1/25/21 12:15 PM, Pavel Tatashin wrote:
> Currently, loop device has only one global lock:
> loop_ctl_mutex.
Above line can be :-
Currently, loop device has only one global lock: loop_ctl_mutex.

Also please provide a complete description of what members it protects,
i.e. how big the scope of the current locking is; that helps the
reviewers & maintainer.
> This becomes hot in scenarios where many loop devices are used.
>
> Scale it by introducing per-device lock: lo_mutex that protects the
> fields in struct loop_device. Keep loop_ctl_mutex to protect global
> data such as loop_index_idr, loop_lookup, loop_add.
When it comes to scaling, lockstat data is more descriptive and useful,
along with the total time of execution, which gives contention numbers
with an increasing number of threads/devices/users on a logarithmic
scale; at least that is how I've solved some of the file-system scaling
issues in the past.
>
> Lock ordering: loop_ctl_mutex > lo_mutex.
The above statement needs an in-detail commit log description. Usually
'>'-style statements are not good practice for something as important as
lock priority, which was not present in the original code.
> Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
> Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com>
> ---
>  drivers/block/loop.c | 92 +++++++++++++++++++++++++-------------------
>
>
>  
>  	/*
> -	 * Need not hold loop_ctl_mutex to fput backing file.
> -	 * Calling fput holding loop_ctl_mutex triggers a circular
> +	 * Need not hold lo_mutex to fput backing file.
> +	 * Calling fput holding lo_mutex triggers a circular
>  	 * lock dependency possibility warning as fput can take
> -	 * bd_mutex which is usually taken before loop_ctl_mutex.
> +	 * bd_mutex which is usually taken before lo_mutex.
>  	 */
This is not in your patch, but since you are touching this comment can you
please consider this; it saves an entire line and the wasted space:-

       /*
        * Need not hold lo_mutex to fput backing file. Calling fput holding
        * lo_mutex triggers a circular lock dependency possibility warning as
        * fput can take bd_mutex which is usually taken before lo_mutex.
        */

> @@ -1879,27 +1879,33 @@ static int lo_open(struct block_device *bdev, fmode_t mode)
>  	struct loop_device *lo;
>  	int err;
>  
> +	/*
> +	 * take loop_ctl_mutex to protect lo pointer from race with
> +	 * loop_control_ioctl(LOOP_CTL_REMOVE), however, to reduce
> +	 * contention release it prior to updating lo->lo_refcnt.
> +	 */

The above comment could be :-

        /*
         * Take loop_ctl_mutex to protect lo pointer from race with
         * loop_control_ioctl(LOOP_CTL_REMOVE), however, to reduce contention
         * release it prior to updating lo->lo_refcnt.
         */
>  	err = mutex_lock_killable(&loop_ctl_mutex);
>  	if (err)

* Re: [PATCH v3 1/1] loop: scale loop device by introducing per device lock
  2021-01-26  9:24   ` Petr Vorel
@ 2021-01-26 14:22     ` Pavel Tatashin
  0 siblings, 0 replies; 6+ messages in thread
From: Pavel Tatashin @ 2021-01-26 14:22 UTC
  To: Petr Vorel
  Cc: Tyler Hicks, Jens Axboe, linux-block, LKML, Sasha Levin,
	James Morris, lukas.bulwahn, hch, ming.lei, mzxreary, mcgrof,
	zhengbin13, maco, Colin King, evgreen

On Tue, Jan 26, 2021 at 4:24 AM Petr Vorel <pvorel@suse.cz> wrote:
>
> Hi,
>
> > Currently, loop device has only one global lock:
> > loop_ctl_mutex.
>
> > This becomes hot in scenarios where many loop devices are used.
>
> > Scale it by introducing per-device lock: lo_mutex that protects the
> > fields in struct loop_device. Keep loop_ctl_mutex to protect global
> > data such as loop_index_idr, loop_lookup, loop_add.
>
> > Lock ordering: loop_ctl_mutex > lo_mutex.
>
> Reviewed-by: Petr Vorel <pvorel@suse.cz>

Thank you for reviewing this patch.

Pasha

* Re: [PATCH v3 1/1] loop: scale loop device by introducing per device lock
  2021-01-26  9:53   ` Chaitanya Kulkarni
@ 2021-01-26 14:29     ` Pavel Tatashin
  0 siblings, 0 replies; 6+ messages in thread
From: Pavel Tatashin @ 2021-01-26 14:29 UTC
  To: Chaitanya Kulkarni
  Cc: tyhicks, axboe, linux-block, linux-kernel, sashal, jmorris,
	lukas.bulwahn, hch, pvorel, ming.lei, mzxreary, mcgrof,
	zhengbin13, maco, colin.king, evgreen

On Tue, Jan 26, 2021 at 4:53 AM Chaitanya Kulkarni
<Chaitanya.Kulkarni@wdc.com> wrote:
>
> On 1/25/21 12:15 PM, Pavel Tatashin wrote:
> > Currently, loop device has only one global lock:
> > loop_ctl_mutex.
> Above line can be :-
> Currently, loop device has only one global lock: loop_ctl_mutex.

OK

>
> Also please provide a complete description of what members it protects,
> i.e. how big the scope of the current locking is; that helps the
> reviewers & maintainer.

Sure

> > This becomes hot in scenarios where many loop devices are used.
> >
> > Scale it by introducing per-device lock: lo_mutex that protects the
> > fields in struct loop_device. Keep loop_ctl_mutex to protect global
> > data such as loop_index_idr, loop_lookup, loop_add.
> When it comes to scaling, lockstat data is more descriptive and useful,
> along with the total time of execution, which gives contention numbers
> with an increasing number of threads/devices/users on a logarithmic
> scale; at least that is how I've solved some of the file-system scaling
> issues in the past.

I found this issue using perf profiling. I've previously used lockstat;
it is indeed a good tool for working with lock contention.

> >
> > Lock ordering: loop_ctl_mutex > lo_mutex.
> The above statement needs an in-detail commit log description. Usually
> '>'-style statements are not good practice for something as important as
> lock priority, which was not present in the original code.

OK, I will expand this to clearly state that the new lock ordering
requirement is that loop_ctl_mutex must be taken before lo_mutex.
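
i.e. something along these lines (a sketch, as in the lo_open() and
LOOP_CTL_REMOVE paths of this patch):

    /*
     * Lock ordering: when both locks are needed, loop_ctl_mutex
     * (global) must always be acquired before lo->lo_mutex (per
     * device).
     */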

> > Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
> > Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com>
> > ---
> >  drivers/block/loop.c | 92 +++++++++++++++++++++++++-------------------
> >
> >
> >
> >       /*
> > -      * Need not hold loop_ctl_mutex to fput backing file.
> > -      * Calling fput holding loop_ctl_mutex triggers a circular
> > +      * Need not hold lo_mutex to fput backing file.
> > +      * Calling fput holding lo_mutex triggers a circular
> >        * lock dependency possibility warning as fput can take
> > -      * bd_mutex which is usually taken before loop_ctl_mutex.
> > +      * bd_mutex which is usually taken before lo_mutex.
> >        */
> This is not in your patch, but since you are touching this comment can you
> please consider this; it saves an entire line and the wasted space:-

OK

>        /*
>         * Need not hold lo_mutex to fput backing file. Calling fput holding
>         * lo_mutex triggers a circular lock dependency possibility warning as
>         * fput can take bd_mutex which is usually taken before lo_mutex.
>         */
>
> > @@ -1879,27 +1879,33 @@ static int lo_open(struct block_device *bdev, fmode_t mode)
> >       struct loop_device *lo;
> >       int err;
> >
> > +     /*
> > +      * take loop_ctl_mutex to protect lo pointer from race with
> > +      * loop_control_ioctl(LOOP_CTL_REMOVE), however, to reduce
> > +      * contention release it prior to updating lo->lo_refcnt.
> > +      */
>
> The above comment could be :-
>
>         /*
>          * Take loop_ctl_mutex to protect lo pointer from race with
>          * loop_control_ioctl(LOOP_CTL_REMOVE), however, to reduce contention
>          * release it prior to updating lo->lo_refcnt.
>          */

OK

> >       err = mutex_lock_killable(&loop_ctl_mutex);
> >       if (err)

I will send an updated patch soon.

Thank you,
Pasha
