linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] Add a new LOOP_SET_FD_AND_STATUS ioctl.
@ 2020-04-20  8:04 Martijn Coenen
  2020-04-20  8:04 ` [PATCH 1/4] loop: Refactor size calculation Martijn Coenen
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Martijn Coenen @ 2020-04-20  8:04 UTC (permalink / raw)
  To: axboe, hch, ming.lei
  Cc: narayan, zezeozue, kernel-team, linux-block, linux-kernel, maco,
	bvanassche, Chaitanya.Kulkarni, Martijn Coenen

This series introduces a new ioctl that makes it possible to atomically
configure a loop device. Previously, if you wanted to set parameters
such as the offset on a loop device, this required calling LOOP_SET_FD
to set the backing file, and then LOOP_SET_STATUS to set the offset.
However, in between these two calls, the loop device is available and
would accept requests, which is generally not desirable.

There are also performance benefits with combining these two ioctls into
one, which is described in more detail in the last change in the series.

Note that this series depends on
"loop: Call loop_config_discard() only after new config is applied."
[0], which I sent as a separate patch as it fixes an unrelated bug.

[0]: https://lkml.org/lkml/2020/3/31/755

Martijn Coenen (4):
  loop: Refactor size calculation.
  loop: Factor out configuring loop from status.
  loop: Move loop_set_from_status() and friends up.
  loop: Add LOOP_SET_FD_AND_STATUS ioctl.

 drivers/block/loop.c      | 309 ++++++++++++++++++++++----------------
 include/uapi/linux/loop.h |   6 +
 2 files changed, 189 insertions(+), 126 deletions(-)

-- 
2.26.1.301.g55bc3eb7cb9-goog


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/4] loop: Refactor size calculation.
  2020-04-20  8:04 [PATCH 0/4] Add a new LOOP_SET_FD_AND_STATUS ioctl Martijn Coenen
@ 2020-04-20  8:04 ` Martijn Coenen
  2020-04-20 13:22   ` Bart Van Assche
  2020-04-20  8:04 ` [PATCH 2/4] loop: Factor out configuring loop from status Martijn Coenen
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 17+ messages in thread
From: Martijn Coenen @ 2020-04-20  8:04 UTC (permalink / raw)
  To: axboe, hch, ming.lei
  Cc: narayan, zezeozue, kernel-team, linux-block, linux-kernel, maco,
	bvanassche, Chaitanya.Kulkarni, Martijn Coenen

figure_loop_size() calculates the loop size based on the passed in
parameters, but at the same time it updates the offset and sizelimit
parameters in the loop device configuration. That is a somewhat
unexpected side effect of a function with this name, and it is only only
needed by one of the two callers of this function - loop_set_status().

Move the lo_offset and lo_sizelimit assignment back into loop_set_status(),
and factor out a new function (loop_set_size()) to apply a newly
calculated size that has been validated to fit in sector_t, and notify
user-space.

loop_update_size() is now solely used by loop_set_capacity to compute an
updated size, and loop_set_status() uses a combination of functions to
validate the loop size, and only apply it later.

Signed-off-by: Martijn Coenen <maco@android.com>
---
 drivers/block/loop.c | 88 ++++++++++++++++++++++++++++----------------
 1 file changed, 56 insertions(+), 32 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index f1754262fc94..d934d65dbe92 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -228,23 +228,42 @@ static void __loop_update_dio(struct loop_device *lo, bool dio)
 		blk_mq_unfreeze_queue(lo->lo_queue);
 }
 
+/**
+ * loop_set_size - sets device size and notifies userspace
+ * @lo: struct loop_device to set the size for
+ * @size: new size of the loop device
+ *
+ * Callers must validate that the size passed into this function fits into
+ * a sector_t.
+ */
+static void loop_set_size(struct loop_device *lo, loff_t size)
+{
+	struct block_device *bdev = lo->lo_device;
+
+	set_capacity(lo->lo_disk, size);
+	bd_set_size(bdev, size << 9);
+	/* let user-space know about the new size */
+	kobject_uevent(&disk_to_dev(bdev->bd_disk)->kobj, KOBJ_CHANGE);
+}
+
+/**
+ * loop_update_size - updates device size from the backing file
+ * @lo: struct loop_device to update the size for
+ *
+ * Recomputes the device size from the backing file, and updates the device
+ * with the new size.
+ */
 static int
-figure_loop_size(struct loop_device *lo, loff_t offset, loff_t sizelimit)
+loop_update_size(struct loop_device *lo)
 {
-	loff_t size = get_size(offset, sizelimit, lo->lo_backing_file);
+	loff_t size = get_loop_size(lo, lo->lo_backing_file);
 	sector_t x = (sector_t)size;
-	struct block_device *bdev = lo->lo_device;
 
 	if (unlikely((loff_t)x != size))
 		return -EFBIG;
-	if (lo->lo_offset != offset)
-		lo->lo_offset = offset;
-	if (lo->lo_sizelimit != sizelimit)
-		lo->lo_sizelimit = sizelimit;
-	set_capacity(lo->lo_disk, x);
-	bd_set_size(bdev, (loff_t)get_capacity(bdev->bd_disk) << 9);
-	/* let user-space know about the new size */
-	kobject_uevent(&disk_to_dev(bdev->bd_disk)->kobj, KOBJ_CHANGE);
+
+	loop_set_size(lo, size);
+
 	return 0;
 }
 
@@ -1040,11 +1059,8 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
 
 	loop_update_rotational(lo);
 	loop_update_dio(lo);
-	set_capacity(lo->lo_disk, size);
-	bd_set_size(bdev, size << 9);
 	loop_sysfs_init(lo);
-	/* let user-space know about the new size */
-	kobject_uevent(&disk_to_dev(bdev->bd_disk)->kobj, KOBJ_CHANGE);
+	loop_set_size(lo, size);
 
 	set_blocksize(bdev, S_ISBLK(inode->i_mode) ?
 		      block_size(inode->i_bdev) : PAGE_SIZE);
@@ -1267,6 +1283,8 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
 	kuid_t uid = current_uid();
 	struct block_device *bdev;
 	bool partscan = false;
+	bool size_changed = false;
+	loff_t validated_size;
 
 	err = mutex_lock_killable(&loop_ctl_mutex);
 	if (err)
@@ -1288,6 +1306,14 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
 
 	if (lo->lo_offset != info->lo_offset ||
 	    lo->lo_sizelimit != info->lo_sizelimit) {
+		size_t size = get_size(info->lo_offset, info->lo_sizelimit,
+				       lo->lo_backing_file);
+		if ((loff_t)(sector_t)size != size) {
+			err = -EFBIG;
+			goto out_unlock;
+		}
+		size_changed = true;
+		validated_size = size;
 		sync_blockdev(lo->lo_device);
 		kill_bdev(lo->lo_device);
 	}
@@ -1295,6 +1321,15 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
 	/* I/O need to be drained during transfer transition */
 	blk_mq_freeze_queue(lo->lo_queue);
 
+	if (size_changed && lo->lo_device->bd_inode->i_mapping->nrpages) {
+		/* kill_bdev should have truncated all the pages */
+		err = -EAGAIN;
+		pr_warn("%s: loop%d (%s) has still dirty pages (nrpages=%lu)\n",
+			__func__, lo->lo_number, lo->lo_file_name,
+			lo->lo_device->bd_inode->i_mapping->nrpages);
+		goto out_unfreeze;
+	}
+
 	err = loop_release_xfer(lo);
 	if (err)
 		goto out_unfreeze;
@@ -1318,22 +1353,8 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
 	if (err)
 		goto out_unfreeze;
 
-	if (lo->lo_offset != info->lo_offset ||
-	    lo->lo_sizelimit != info->lo_sizelimit) {
-		/* kill_bdev should have truncated all the pages */
-		if (lo->lo_device->bd_inode->i_mapping->nrpages) {
-			err = -EAGAIN;
-			pr_warn("%s: loop%d (%s) has still dirty pages (nrpages=%lu)\n",
-				__func__, lo->lo_number, lo->lo_file_name,
-				lo->lo_device->bd_inode->i_mapping->nrpages);
-			goto out_unfreeze;
-		}
-		if (figure_loop_size(lo, info->lo_offset, info->lo_sizelimit)) {
-			err = -EFBIG;
-			goto out_unfreeze;
-		}
-	}
-
+	lo->lo_offset = info->lo_offset;
+	lo->lo_sizelimit = info->lo_sizelimit;
 	memcpy(lo->lo_file_name, info->lo_file_name, LO_NAME_SIZE);
 	memcpy(lo->lo_crypt_name, info->lo_crypt_name, LO_NAME_SIZE);
 	lo->lo_file_name[LO_NAME_SIZE-1] = 0;
@@ -1357,6 +1378,9 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
 		lo->lo_key_owner = uid;
 	}
 
+	if (size_changed)
+		loop_set_size(lo, validated_size);
+
 	loop_config_discard(lo);
 
 	/* update dio if lo_offset or transfer is changed */
@@ -1534,7 +1558,7 @@ static int loop_set_capacity(struct loop_device *lo)
 	if (unlikely(lo->lo_state != Lo_bound))
 		return -ENXIO;
 
-	return figure_loop_size(lo, lo->lo_offset, lo->lo_sizelimit);
+	return loop_update_size(lo);
 }
 
 static int loop_set_dio(struct loop_device *lo, unsigned long arg)
-- 
2.26.1.301.g55bc3eb7cb9-goog


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/4] loop: Factor out configuring loop from status.
  2020-04-20  8:04 [PATCH 0/4] Add a new LOOP_SET_FD_AND_STATUS ioctl Martijn Coenen
  2020-04-20  8:04 ` [PATCH 1/4] loop: Refactor size calculation Martijn Coenen
@ 2020-04-20  8:04 ` Martijn Coenen
  2020-04-20 13:34   ` Bart Van Assche
  2020-04-20 13:49   ` Bart Van Assche
  2020-04-20  8:04 ` [PATCH 3/4] loop: Move loop_set_from_status() and friends up Martijn Coenen
  2020-04-20  8:04 ` [PATCH 4/4] loop: Add LOOP_SET_FD_AND_STATUS ioctl Martijn Coenen
  3 siblings, 2 replies; 17+ messages in thread
From: Martijn Coenen @ 2020-04-20  8:04 UTC (permalink / raw)
  To: axboe, hch, ming.lei
  Cc: narayan, zezeozue, kernel-team, linux-block, linux-kernel, maco,
	bvanassche, Chaitanya.Kulkarni, Martijn Coenen

Factor out this code into a separate function, so it can be reused by
other code more easily.

Signed-off-by: Martijn Coenen <maco@android.com>
---
 drivers/block/loop.c | 108 +++++++++++++++++++++++--------------------
 1 file changed, 58 insertions(+), 50 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index d934d65dbe92..e0f9674fe568 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1276,12 +1276,68 @@ static int loop_clr_fd(struct loop_device *lo)
 }
 
 static int
-loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
+loop_set_from_status(struct loop_device *lo, const struct loop_info64 *info)
 {
 	int err;
 	struct loop_func_table *xfer;
 	kuid_t uid = current_uid();
+
+	if ((unsigned int) info->lo_encrypt_key_size > LO_KEY_SIZE)
+		return -EINVAL;
+
+	err = loop_release_xfer(lo);
+	if (err)
+		return err;
+
+	if (info->lo_encrypt_type) {
+		unsigned int type = info->lo_encrypt_type;
+
+		if (type >= MAX_LO_CRYPT)
+			return -EINVAL;
+		xfer = xfer_funcs[type];
+		if (xfer == NULL)
+			return -EINVAL;
+	} else
+		xfer = NULL;
+
+	err = loop_init_xfer(lo, xfer, info);
+	if (err)
+		return err;
+
+	lo->lo_offset = info->lo_offset;
+	lo->lo_sizelimit = info->lo_sizelimit;
+	memcpy(lo->lo_file_name, info->lo_file_name, LO_NAME_SIZE);
+	memcpy(lo->lo_crypt_name, info->lo_crypt_name, LO_NAME_SIZE);
+	lo->lo_file_name[LO_NAME_SIZE-1] = 0;
+	lo->lo_crypt_name[LO_NAME_SIZE-1] = 0;
+
+	if (!xfer)
+		xfer = &none_funcs;
+	lo->transfer = xfer->transfer;
+	lo->ioctl = xfer->ioctl;
+
+	if ((lo->lo_flags & LO_FLAGS_AUTOCLEAR) !=
+	     (info->lo_flags & LO_FLAGS_AUTOCLEAR))
+		lo->lo_flags ^= LO_FLAGS_AUTOCLEAR;
+
+	lo->lo_encrypt_key_size = info->lo_encrypt_key_size;
+	lo->lo_init[0] = info->lo_init[0];
+	lo->lo_init[1] = info->lo_init[1];
+	if (info->lo_encrypt_key_size) {
+		memcpy(lo->lo_encrypt_key, info->lo_encrypt_key,
+		       info->lo_encrypt_key_size);
+		lo->lo_key_owner = uid;
+	}
+
+	return 0;
+}
+
+static int
+loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
+{
+	int err;
 	struct block_device *bdev;
+	kuid_t uid = current_uid();
 	bool partscan = false;
 	bool size_changed = false;
 	loff_t validated_size;
@@ -1299,10 +1355,6 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
 		err = -ENXIO;
 		goto out_unlock;
 	}
-	if ((unsigned int) info->lo_encrypt_key_size > LO_KEY_SIZE) {
-		err = -EINVAL;
-		goto out_unlock;
-	}
 
 	if (lo->lo_offset != info->lo_offset ||
 	    lo->lo_sizelimit != info->lo_sizelimit) {
@@ -1330,54 +1382,10 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
 		goto out_unfreeze;
 	}
 
-	err = loop_release_xfer(lo);
-	if (err)
-		goto out_unfreeze;
-
-	if (info->lo_encrypt_type) {
-		unsigned int type = info->lo_encrypt_type;
-
-		if (type >= MAX_LO_CRYPT) {
-			err = -EINVAL;
-			goto out_unfreeze;
-		}
-		xfer = xfer_funcs[type];
-		if (xfer == NULL) {
-			err = -EINVAL;
-			goto out_unfreeze;
-		}
-	} else
-		xfer = NULL;
-
-	err = loop_init_xfer(lo, xfer, info);
+	err = loop_set_from_status(lo, info);
 	if (err)
 		goto out_unfreeze;
 
-	lo->lo_offset = info->lo_offset;
-	lo->lo_sizelimit = info->lo_sizelimit;
-	memcpy(lo->lo_file_name, info->lo_file_name, LO_NAME_SIZE);
-	memcpy(lo->lo_crypt_name, info->lo_crypt_name, LO_NAME_SIZE);
-	lo->lo_file_name[LO_NAME_SIZE-1] = 0;
-	lo->lo_crypt_name[LO_NAME_SIZE-1] = 0;
-
-	if (!xfer)
-		xfer = &none_funcs;
-	lo->transfer = xfer->transfer;
-	lo->ioctl = xfer->ioctl;
-
-	if ((lo->lo_flags & LO_FLAGS_AUTOCLEAR) !=
-	     (info->lo_flags & LO_FLAGS_AUTOCLEAR))
-		lo->lo_flags ^= LO_FLAGS_AUTOCLEAR;
-
-	lo->lo_encrypt_key_size = info->lo_encrypt_key_size;
-	lo->lo_init[0] = info->lo_init[0];
-	lo->lo_init[1] = info->lo_init[1];
-	if (info->lo_encrypt_key_size) {
-		memcpy(lo->lo_encrypt_key, info->lo_encrypt_key,
-		       info->lo_encrypt_key_size);
-		lo->lo_key_owner = uid;
-	}
-
 	if (size_changed)
 		loop_set_size(lo, validated_size);
 
-- 
2.26.1.301.g55bc3eb7cb9-goog


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/4] loop: Move loop_set_from_status() and friends up.
  2020-04-20  8:04 [PATCH 0/4] Add a new LOOP_SET_FD_AND_STATUS ioctl Martijn Coenen
  2020-04-20  8:04 ` [PATCH 1/4] loop: Refactor size calculation Martijn Coenen
  2020-04-20  8:04 ` [PATCH 2/4] loop: Factor out configuring loop from status Martijn Coenen
@ 2020-04-20  8:04 ` Martijn Coenen
  2020-04-20 13:43   ` Bart Van Assche
  2020-04-20  8:04 ` [PATCH 4/4] loop: Add LOOP_SET_FD_AND_STATUS ioctl Martijn Coenen
  3 siblings, 1 reply; 17+ messages in thread
From: Martijn Coenen @ 2020-04-20  8:04 UTC (permalink / raw)
  To: axboe, hch, ming.lei
  Cc: narayan, zezeozue, kernel-team, linux-block, linux-kernel, maco,
	bvanassche, Chaitanya.Kulkarni, Martijn Coenen

So we can use it without forward declaration. This is a separate commit
to make it easier to verify that this is just a move, without functional
modifications.

Signed-off-by: Martijn Coenen <maco@android.com>
---
 drivers/block/loop.c | 188 +++++++++++++++++++++----------------------
 1 file changed, 94 insertions(+), 94 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index e0f9674fe568..6e656390b285 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -971,6 +971,100 @@ static void loop_update_rotational(struct loop_device *lo)
 		blk_queue_flag_clear(QUEUE_FLAG_NONROT, q);
 }
 
+static int
+loop_release_xfer(struct loop_device *lo)
+{
+	int err = 0;
+	struct loop_func_table *xfer = lo->lo_encryption;
+
+	if (xfer) {
+		if (xfer->release)
+			err = xfer->release(lo);
+		lo->transfer = NULL;
+		lo->lo_encryption = NULL;
+		module_put(xfer->owner);
+	}
+	return err;
+}
+
+static int
+loop_init_xfer(struct loop_device *lo, struct loop_func_table *xfer,
+	       const struct loop_info64 *i)
+{
+	int err = 0;
+
+	if (xfer) {
+		struct module *owner = xfer->owner;
+
+		if (!try_module_get(owner))
+			return -EINVAL;
+		if (xfer->init)
+			err = xfer->init(lo, i);
+		if (err)
+			module_put(owner);
+		else
+			lo->lo_encryption = xfer;
+	}
+	return err;
+}
+
+static int
+loop_set_from_status(struct loop_device *lo, const struct loop_info64 *info)
+{
+	int err;
+	struct loop_func_table *xfer;
+	kuid_t uid = current_uid();
+
+	if ((unsigned int) info->lo_encrypt_key_size > LO_KEY_SIZE)
+		return -EINVAL;
+
+	err = loop_release_xfer(lo);
+	if (err)
+		return err;
+
+	if (info->lo_encrypt_type) {
+		unsigned int type = info->lo_encrypt_type;
+
+		if (type >= MAX_LO_CRYPT)
+			return -EINVAL;
+		xfer = xfer_funcs[type];
+		if (xfer == NULL)
+			return -EINVAL;
+	} else
+		xfer = NULL;
+
+	err = loop_init_xfer(lo, xfer, info);
+	if (err)
+		return err;
+
+	lo->lo_offset = info->lo_offset;
+	lo->lo_sizelimit = info->lo_sizelimit;
+	memcpy(lo->lo_file_name, info->lo_file_name, LO_NAME_SIZE);
+	memcpy(lo->lo_crypt_name, info->lo_crypt_name, LO_NAME_SIZE);
+	lo->lo_file_name[LO_NAME_SIZE-1] = 0;
+	lo->lo_crypt_name[LO_NAME_SIZE-1] = 0;
+
+	if (!xfer)
+		xfer = &none_funcs;
+	lo->transfer = xfer->transfer;
+	lo->ioctl = xfer->ioctl;
+
+	if ((lo->lo_flags & LO_FLAGS_AUTOCLEAR) !=
+	     (info->lo_flags & LO_FLAGS_AUTOCLEAR))
+		lo->lo_flags ^= LO_FLAGS_AUTOCLEAR;
+
+	lo->lo_encrypt_key_size = info->lo_encrypt_key_size;
+	lo->lo_init[0] = info->lo_init[0];
+	lo->lo_init[1] = info->lo_init[1];
+	if (info->lo_encrypt_key_size) {
+		memcpy(lo->lo_encrypt_key, info->lo_encrypt_key,
+		       info->lo_encrypt_key_size);
+		lo->lo_key_owner = uid;
+	}
+
+	return 0;
+}
+
 static int loop_set_fd(struct loop_device *lo, fmode_t mode,
 		       struct block_device *bdev, unsigned int arg)
 {
@@ -1094,43 +1188,6 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
 	return error;
 }
 
-static int
-loop_release_xfer(struct loop_device *lo)
-{
-	int err = 0;
-	struct loop_func_table *xfer = lo->lo_encryption;
-
-	if (xfer) {
-		if (xfer->release)
-			err = xfer->release(lo);
-		lo->transfer = NULL;
-		lo->lo_encryption = NULL;
-		module_put(xfer->owner);
-	}
-	return err;
-}
-
-static int
-loop_init_xfer(struct loop_device *lo, struct loop_func_table *xfer,
-	       const struct loop_info64 *i)
-{
-	int err = 0;
-
-	if (xfer) {
-		struct module *owner = xfer->owner;
-
-		if (!try_module_get(owner))
-			return -EINVAL;
-		if (xfer->init)
-			err = xfer->init(lo, i);
-		if (err)
-			module_put(owner);
-		else
-			lo->lo_encryption = xfer;
-	}
-	return err;
-}
-
 static int __loop_clr_fd(struct loop_device *lo, bool release)
 {
 	struct file *filp = NULL;
@@ -1275,63 +1332,6 @@ static int loop_clr_fd(struct loop_device *lo)
 	return __loop_clr_fd(lo, false);
 }
 
-static int
-loop_set_from_status(struct loop_device *lo, const struct loop_info64 *info)
-{
-	int err;
-	struct loop_func_table *xfer;
-	kuid_t uid = current_uid();
-
-	if ((unsigned int) info->lo_encrypt_key_size > LO_KEY_SIZE)
-		return -EINVAL;
-
-	err = loop_release_xfer(lo);
-	if (err)
-		return err;
-
-	if (info->lo_encrypt_type) {
-		unsigned int type = info->lo_encrypt_type;
-
-		if (type >= MAX_LO_CRYPT)
-			return -EINVAL;
-		xfer = xfer_funcs[type];
-		if (xfer == NULL)
-			return -EINVAL;
-	} else
-		xfer = NULL;
-
-	err = loop_init_xfer(lo, xfer, info);
-	if (err)
-		return err;
-
-	lo->lo_offset = info->lo_offset;
-	lo->lo_sizelimit = info->lo_sizelimit;
-	memcpy(lo->lo_file_name, info->lo_file_name, LO_NAME_SIZE);
-	memcpy(lo->lo_crypt_name, info->lo_crypt_name, LO_NAME_SIZE);
-	lo->lo_file_name[LO_NAME_SIZE-1] = 0;
-	lo->lo_crypt_name[LO_NAME_SIZE-1] = 0;
-
-	if (!xfer)
-		xfer = &none_funcs;
-	lo->transfer = xfer->transfer;
-	lo->ioctl = xfer->ioctl;
-
-	if ((lo->lo_flags & LO_FLAGS_AUTOCLEAR) !=
-	     (info->lo_flags & LO_FLAGS_AUTOCLEAR))
-		lo->lo_flags ^= LO_FLAGS_AUTOCLEAR;
-
-	lo->lo_encrypt_key_size = info->lo_encrypt_key_size;
-	lo->lo_init[0] = info->lo_init[0];
-	lo->lo_init[1] = info->lo_init[1];
-	if (info->lo_encrypt_key_size) {
-		memcpy(lo->lo_encrypt_key, info->lo_encrypt_key,
-		       info->lo_encrypt_key_size);
-		lo->lo_key_owner = uid;
-	}
-
-	return 0;
-}
-
 static int
 loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
 {
-- 
2.26.1.301.g55bc3eb7cb9-goog


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 4/4] loop: Add LOOP_SET_FD_AND_STATUS ioctl.
  2020-04-20  8:04 [PATCH 0/4] Add a new LOOP_SET_FD_AND_STATUS ioctl Martijn Coenen
                   ` (2 preceding siblings ...)
  2020-04-20  8:04 ` [PATCH 3/4] loop: Move loop_set_from_status() and friends up Martijn Coenen
@ 2020-04-20  8:04 ` Martijn Coenen
  2020-04-22  6:19   ` Christoph Hellwig
  2020-04-22 15:10   ` Bart Van Assche
  3 siblings, 2 replies; 17+ messages in thread
From: Martijn Coenen @ 2020-04-20  8:04 UTC (permalink / raw)
  To: axboe, hch, ming.lei
  Cc: narayan, zezeozue, kernel-team, linux-block, linux-kernel, maco,
	bvanassche, Chaitanya.Kulkarni, Martijn Coenen

This allows userspace to completely setup a loop device with a single
ioctl, removing the in-between state where the device can be partially
configured - eg the loop device has a backing file associated with it,
but is reading from the wrong offset.

Besides removing the intermediate state, another big benefit of this
ioctl is that LOOP_SET_STATUS can be slow; the main reason for this
slowness is that LOOP_SET_STATUS(64) calls blk_mq_freeze_queue() to
freeze the associated queue; this requires waiting for RCU
synchronization, which I've measured can take about 15-20ms on this
device on average.

Here's setting up ~70 regular loop devices with an offset on an x86
Android device, using LOOP_SET_FD and LOOP_SET_STATUS:

vsoc_x86:/system/apex # time for i in `seq 30 100`;
do losetup -r -o 4096 /dev/block/loop$i com.android.adbd.apex; done
    0m03.40s real     0m00.02s user     0m00.03s system

Here's configuring ~70 devices in the same way, but using a modified
losetup that uses the new LOOP_SET_FD_AND_STATUS ioctl:

vsoc_x86:/system/apex # time for i in `seq 30 100`;
do losetup -r -o 4096 /dev/block/loop$i com.android.adbd.apex; done
    0m01.94s real     0m00.01s user     0m00.01s system

Signed-off-by: Martijn Coenen <maco@android.com>
---
 drivers/block/loop.c      | 47 ++++++++++++++++++++++++++++++---------
 include/uapi/linux/loop.h |  6 +++++
 2 files changed, 42 insertions(+), 11 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 6e656390b285..e1dbd70d6d6e 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1065,8 +1065,9 @@ loop_set_from_status(struct loop_device *lo, const struct loop_info64 *info)
 	return 0;
 }
 
-static int loop_set_fd(struct loop_device *lo, fmode_t mode,
-		       struct block_device *bdev, unsigned int arg)
+static int loop_set_fd_and_status(struct loop_device *lo, fmode_t mode,
+				  struct block_device *bdev, unsigned int fd,
+				  const struct loop_info64 *info)
 {
 	struct file	*file;
 	struct inode	*inode;
@@ -1081,7 +1082,7 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
 	__module_get(THIS_MODULE);
 
 	error = -EBADF;
-	file = fget(arg);
+	file = fget(fd);
 	if (!file)
 		goto out;
 
@@ -1090,7 +1091,7 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
 	 * here to avoid changing device under exclusive owner.
 	 */
 	if (!(mode & FMODE_EXCL)) {
-		claimed_bdev = bd_start_claiming(bdev, loop_set_fd);
+		claimed_bdev = bd_start_claiming(bdev, loop_set_fd_and_status);
 		if (IS_ERR(claimed_bdev)) {
 			error = PTR_ERR(claimed_bdev);
 			goto out_putf;
@@ -1117,9 +1118,24 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
 		lo_flags |= LO_FLAGS_READ_ONLY;
 
 	error = -EFBIG;
-	size = get_loop_size(lo, file);
+	if (info)
+		size = get_size(info->lo_offset, info->lo_sizelimit,
+				file);
+	else
+		size = get_loop_size(lo, file);
 	if ((loff_t)(sector_t)size != size)
 		goto out_unlock;
+
+	if (info) {
+		error = loop_set_from_status(lo, info);
+		if (error)
+			goto out_unlock;
+	} else {
+		lo->transfer = NULL;
+		lo->ioctl = NULL;
+		lo->lo_sizelimit = 0;
+		lo->lo_offset = 0;
+	}
 	error = loop_prepare_queue(lo);
 	if (error)
 		goto out_unlock;
@@ -1132,9 +1148,6 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
 	lo->lo_device = bdev;
 	lo->lo_flags = lo_flags;
 	lo->lo_backing_file = file;
-	lo->transfer = NULL;
-	lo->ioctl = NULL;
-	lo->lo_sizelimit = 0;
 	lo->old_gfp_mask = mapping_gfp_mask(mapping);
 	mapping_set_gfp_mask(mapping, lo->old_gfp_mask & ~(__GFP_IO|__GFP_FS));
 
@@ -1172,14 +1185,14 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
 	if (partscan)
 		loop_reread_partitions(lo, bdev);
 	if (claimed_bdev)
-		bd_abort_claiming(bdev, claimed_bdev, loop_set_fd);
+		bd_abort_claiming(bdev, claimed_bdev, loop_set_fd_and_status);
 	return 0;
 
 out_unlock:
 	mutex_unlock(&loop_ctl_mutex);
 out_bdev:
 	if (claimed_bdev)
-		bd_abort_claiming(bdev, claimed_bdev, loop_set_fd);
+		bd_abort_claiming(bdev, claimed_bdev, loop_set_fd_and_status);
 out_putf:
 	fput(file);
 out:
@@ -1653,7 +1666,18 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode,
 
 	switch (cmd) {
 	case LOOP_SET_FD:
-		return loop_set_fd(lo, mode, bdev, arg);
+		return loop_set_fd_and_status(lo, mode, bdev, arg, NULL);
+	case LOOP_SET_FD_AND_STATUS: {
+		struct loop_fd_and_status fds;
+
+		if (copy_from_user(&fds,
+				   (struct loop_fd_and_status __user *) arg,
+				   sizeof(fds)))
+			return -EFAULT;
+
+		return loop_set_fd_and_status(lo, mode, bdev, fds.fd,
+					      &fds.info);
+	}
 	case LOOP_CHANGE_FD:
 		return loop_change_fd(lo, bdev, arg);
 	case LOOP_CLR_FD:
@@ -1827,6 +1851,7 @@ static int lo_compat_ioctl(struct block_device *bdev, fmode_t mode,
 	case LOOP_CLR_FD:
 	case LOOP_GET_STATUS64:
 	case LOOP_SET_STATUS64:
+	case LOOP_SET_FD_AND_STATUS:
 		arg = (unsigned long) compat_ptr(arg);
 		/* fall through */
 	case LOOP_SET_FD:
diff --git a/include/uapi/linux/loop.h b/include/uapi/linux/loop.h
index 080a8df134ef..fcc9a693b588 100644
--- a/include/uapi/linux/loop.h
+++ b/include/uapi/linux/loop.h
@@ -60,6 +60,11 @@ struct loop_info64 {
 	__u64		   lo_init[2];
 };
 
+struct loop_fd_and_status {
+	struct loop_info64	info;
+	__u32			fd;
+};
+
 /*
  * Loop filter types
  */
@@ -90,6 +95,7 @@ struct loop_info64 {
 #define LOOP_SET_CAPACITY	0x4C07
 #define LOOP_SET_DIRECT_IO	0x4C08
 #define LOOP_SET_BLOCK_SIZE	0x4C09
+#define LOOP_SET_FD_AND_STATUS	0x4C0A
 
 /* /dev/loop-control interface */
 #define LOOP_CTL_ADD		0x4C80
-- 
2.26.1.301.g55bc3eb7cb9-goog


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] loop: Refactor size calculation.
  2020-04-20  8:04 ` [PATCH 1/4] loop: Refactor size calculation Martijn Coenen
@ 2020-04-20 13:22   ` Bart Van Assche
  2020-04-21 11:48     ` Martijn Coenen
  0 siblings, 1 reply; 17+ messages in thread
From: Bart Van Assche @ 2020-04-20 13:22 UTC (permalink / raw)
  To: Martijn Coenen, axboe, hch, ming.lei
  Cc: narayan, zezeozue, kernel-team, linux-block, linux-kernel, maco,
	Chaitanya.Kulkarni, Jaegeuk Kim

On 4/20/20 1:04 AM, Martijn Coenen wrote:
> +/**
> + * loop_set_size - sets device size and notifies userspace
> + * @lo: struct loop_device to set the size for
> + * @size: new size of the loop device
> + *
> + * Callers must validate that the size passed into this function fits into
> + * a sector_t.
> + */
> +static void loop_set_size(struct loop_device *lo, loff_t size)
> +{
> +	struct block_device *bdev = lo->lo_device;
> +
> +	set_capacity(lo->lo_disk, size);
> +	bd_set_size(bdev, size << 9);
> +	/* let user-space know about the new size */
> +	kobject_uevent(&disk_to_dev(bdev->bd_disk)->kobj, KOBJ_CHANGE);
> +}

How about using the SECTOR_SHIFT constant instead of "9"? I think the 
following has been added recently in include/linux/blkdev.h:

#define SECTOR_SHIFT 9

> @@ -1295,6 +1321,15 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
>   	/* I/O need to be drained during transfer transition */
>   	blk_mq_freeze_queue(lo->lo_queue);
>   
> +	if (size_changed && lo->lo_device->bd_inode->i_mapping->nrpages) {
> +		/* kill_bdev should have truncated all the pages */
> +		err = -EAGAIN;
> +		pr_warn("%s: loop%d (%s) has still dirty pages (nrpages=%lu)\n",
> +			__func__, lo->lo_number, lo->lo_file_name,
> +			lo->lo_device->bd_inode->i_mapping->nrpages);
> +		goto out_unfreeze;
> +	}
> +
>   	err = loop_release_xfer(lo);
>   	if (err)
>   		goto out_unfreeze;

Please Cc Jaegeuk for any changes in this code (see also 
https://lore.kernel.org/linux-block/20190518005304.GA19446@jaegeuk-macbookpro.roam.corp.google.com/).

Please also change the "kill_bdev should have truncated all the pages" 
comment into something like "return -EAGAIN if any pages have been 
dirtied after kill_bdev() returned".

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] loop: Factor out configuring loop from status.
  2020-04-20  8:04 ` [PATCH 2/4] loop: Factor out configuring loop from status Martijn Coenen
@ 2020-04-20 13:34   ` Bart Van Assche
  2020-04-20 13:49   ` Bart Van Assche
  1 sibling, 0 replies; 17+ messages in thread
From: Bart Van Assche @ 2020-04-20 13:34 UTC (permalink / raw)
  To: Martijn Coenen, axboe, hch, ming.lei
  Cc: narayan, zezeozue, kernel-team, linux-block, linux-kernel, maco,
	Chaitanya.Kulkarni

On 4/20/20 1:04 AM, Martijn Coenen wrote:

No trailing dot at the end of a patch subject please.

>   static int
> -loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
> +loop_set_from_status(struct loop_device *lo, const struct loop_info64 *info)
>   {
>   	int err;
>   	struct loop_func_table *xfer;
>   	kuid_t uid = current_uid();
> +
> +	if ((unsigned int) info->lo_encrypt_key_size > LO_KEY_SIZE)
> +		return -EINVAL;
> +
> +	err = loop_release_xfer(lo);
> +	if (err)
> +		return err;
> +
> +	if (info->lo_encrypt_type) {
> +		unsigned int type = info->lo_encrypt_type;
> +
> +		if (type >= MAX_LO_CRYPT)
> +			return -EINVAL;
> +		xfer = xfer_funcs[type];
> +		if (xfer == NULL)
> +			return -EINVAL;
> +	} else
> +		xfer = NULL;
> +
> +	err = loop_init_xfer(lo, xfer, info);
> +	if (err)
> +		return err;
> +
> +	lo->lo_offset = info->lo_offset;
> +	lo->lo_sizelimit = info->lo_sizelimit;
> +	memcpy(lo->lo_file_name, info->lo_file_name, LO_NAME_SIZE);
> +	memcpy(lo->lo_crypt_name, info->lo_crypt_name, LO_NAME_SIZE);
> +	lo->lo_file_name[LO_NAME_SIZE-1] = 0;
> +	lo->lo_crypt_name[LO_NAME_SIZE-1] = 0;
> +
> +	if (!xfer)
> +		xfer = &none_funcs;
> +	lo->transfer = xfer->transfer;
> +	lo->ioctl = xfer->ioctl;
> +
> +	if ((lo->lo_flags & LO_FLAGS_AUTOCLEAR) !=
> +	     (info->lo_flags & LO_FLAGS_AUTOCLEAR))
> +		lo->lo_flags ^= LO_FLAGS_AUTOCLEAR;
> +
> +	lo->lo_encrypt_key_size = info->lo_encrypt_key_size;
> +	lo->lo_init[0] = info->lo_init[0];
> +	lo->lo_init[1] = info->lo_init[1];
> +	if (info->lo_encrypt_key_size) {
> +		memcpy(lo->lo_encrypt_key, info->lo_encrypt_key,
> +		       info->lo_encrypt_key_size);
> +		lo->lo_key_owner = uid;
> +	}
> +
> +	return 0;
> +}

Please add a (one line?) comment above this function that explains the 
purpose of this function. Is the purpose of this function perhaps to 
initialize loop device parameters based on the information received from 
user space (the 'info' argument)?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/4] loop: Move loop_set_from_status() and friends up.
  2020-04-20  8:04 ` [PATCH 3/4] loop: Move loop_set_from_status() and friends up Martijn Coenen
@ 2020-04-20 13:43   ` Bart Van Assche
  0 siblings, 0 replies; 17+ messages in thread
From: Bart Van Assche @ 2020-04-20 13:43 UTC (permalink / raw)
  To: Martijn Coenen, axboe, hch, ming.lei
  Cc: narayan, zezeozue, kernel-team, linux-block, linux-kernel, maco,
	Chaitanya.Kulkarni

On 4/20/20 1:04 AM, Martijn Coenen wrote:
> So we can use it without forward declaration. This is a separate commit
> to make it easier to verify that this is just a move, without functional
> modifications.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] loop: Factor out configuring loop from status.
  2020-04-20  8:04 ` [PATCH 2/4] loop: Factor out configuring loop from status Martijn Coenen
  2020-04-20 13:34   ` Bart Van Assche
@ 2020-04-20 13:49   ` Bart Van Assche
  2020-04-21 11:46     ` Martijn Coenen
  1 sibling, 1 reply; 17+ messages in thread
From: Bart Van Assche @ 2020-04-20 13:49 UTC (permalink / raw)
  To: Martijn Coenen, axboe, hch, ming.lei
  Cc: narayan, zezeozue, kernel-team, linux-block, linux-kernel, maco,
	Chaitanya.Kulkarni

On 4/20/20 1:04 AM, Martijn Coenen wrote:
>   static int
> -loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
> +loop_set_from_status(struct loop_device *lo, const struct loop_info64 *info)
>   {

An additional question: since this function sets the status of 'lo' 
based on the information in 'info', would "loop_set_status" or 
"loop_set_status_from_info" be a better name for this function?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/4] loop: Factor out configuring loop from status.
  2020-04-20 13:49   ` Bart Van Assche
@ 2020-04-21 11:46     ` Martijn Coenen
  0 siblings, 0 replies; 17+ messages in thread
From: Martijn Coenen @ 2020-04-21 11:46 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, Christoph Hellwig, Ming Lei, Narayan Kamath,
	Zimuzo Ezeozue, kernel-team, linux-block, LKML, Martijn Coenen,
	Chaitanya Kulkarni

On Mon, Apr 20, 2020 at 3:49 PM Bart Van Assche <bvanassche@acm.org> wrote:
>
> An additional question: since this function sets the status of 'lo'
> based on the information in 'info', would "loop_set_status" or
> "loop_set_status_from_info" be a better name for this function?

Yeah, I like the latter. I will rename, and add a comment that
explains the purpose more clearly.

Thanks,
Martijn

>
> Thanks,
>
> Bart.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] loop: Refactor size calculation.
  2020-04-20 13:22   ` Bart Van Assche
@ 2020-04-21 11:48     ` Martijn Coenen
  2020-04-21 15:26       ` Bart Van Assche
  0 siblings, 1 reply; 17+ messages in thread
From: Martijn Coenen @ 2020-04-21 11:48 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, Christoph Hellwig, Ming Lei, Narayan Kamath,
	Zimuzo Ezeozue, kernel-team, linux-block, LKML, Martijn Coenen,
	Chaitanya Kulkarni, Jaegeuk Kim

On Mon, Apr 20, 2020 at 3:22 PM Bart Van Assche <bvanassche@acm.org> wrote:
> How about using the SECTOR_SHIFT constant instead of "9"?

Ack, will do.

> Please also change the "kill_bdev should have truncated all the pages"
> comment into something like "return -EAGAIN if any pages have been
> dirtied after kill_bdev() returned".

Sure - would you prefer this to be in a separate change?

Thanks,
Martijn

>
> Thanks,
>
> Bart.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/4] loop: Refactor size calculation.
  2020-04-21 11:48     ` Martijn Coenen
@ 2020-04-21 15:26       ` Bart Van Assche
  0 siblings, 0 replies; 17+ messages in thread
From: Bart Van Assche @ 2020-04-21 15:26 UTC (permalink / raw)
  To: Martijn Coenen
  Cc: Jens Axboe, Christoph Hellwig, Ming Lei, Narayan Kamath,
	Zimuzo Ezeozue, kernel-team, linux-block, LKML, Martijn Coenen,
	Chaitanya Kulkarni, Jaegeuk Kim

On 4/21/20 4:48 AM, Martijn Coenen wrote:
> On Mon, Apr 20, 2020 at 3:22 PM Bart Van Assche <bvanassche@acm.org> wrote:
>> Please also change the "kill_bdev should have truncated all the pages"
>> comment into something like "return -EAGAIN if any pages have been
>> dirtied after kill_bdev() returned".
> 
> Sure - would you prefer this to be in a separate change?

Since the comment "kill_bdev should have truncated all the pages" has to 
be moved, I think that it's fine to integrate the fix for that comment 
in this patch.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] loop: Add LOOP_SET_FD_AND_STATUS ioctl.
  2020-04-20  8:04 ` [PATCH 4/4] loop: Add LOOP_SET_FD_AND_STATUS ioctl Martijn Coenen
@ 2020-04-22  6:19   ` Christoph Hellwig
  2020-04-22  8:06     ` Martijn Coenen
  2020-04-22 15:10   ` Bart Van Assche
  1 sibling, 1 reply; 17+ messages in thread
From: Christoph Hellwig @ 2020-04-22  6:19 UTC (permalink / raw)
  To: Martijn Coenen
  Cc: axboe, hch, ming.lei, narayan, zezeozue, kernel-team,
	linux-block, linux-kernel, maco, bvanassche, Chaitanya.Kulkarni

On Mon, Apr 20, 2020 at 10:04:09AM +0200, Martijn Coenen wrote:
> This allows userspace to completely setup a loop device with a single
> ioctl, removing the in-between state where the device can be partially
> configured - eg the loop device has a backing file associated with it,
> but is reading from the wrong offset.
> 
> Besides removing the intermediate state, another big benefit of this
> ioctl is that LOOP_SET_STATUS can be slow; the main reason for this
> slowness is that LOOP_SET_STATUS(64) calls blk_mq_freeze_queue() to
> freeze the associated queue; this requires waiting for RCU
> synchronization, which I've measured can take about 15-20ms on this
> device on average.
> 
> Here's setting up ~70 regular loop devices with an offset on an x86
> Android device, using LOOP_SET_FD and LOOP_SET_STATUS:
> 
> vsoc_x86:/system/apex # time for i in `seq 30 100`;
> do losetup -r -o 4096 /dev/block/loop$i com.android.adbd.apex; done
>     0m03.40s real     0m00.02s user     0m00.03s system
> 
> Here's configuring ~70 devices in the same way, but using a modified
> losetup that uses the new LOOP_SET_FD_AND_STATUS ioctl:
> 
> vsoc_x86:/system/apex # time for i in `seq 30 100`;
> do losetup -r -o 4096 /dev/block/loop$i com.android.adbd.apex; done
>     0m01.94s real     0m00.01s user     0m00.01s system
> 
> Signed-off-by: Martijn Coenen <maco@android.com>
> ---
>  drivers/block/loop.c      | 47 ++++++++++++++++++++++++++++++---------
>  include/uapi/linux/loop.h |  6 +++++
>  2 files changed, 42 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index 6e656390b285..e1dbd70d6d6e 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -1065,8 +1065,9 @@ loop_set_from_status(struct loop_device *lo, const struct loop_info64 *info)
>  	return 0;
>  }
>  
> -static int loop_set_fd(struct loop_device *lo, fmode_t mode,
> -		       struct block_device *bdev, unsigned int arg)
> +static int loop_set_fd_and_status(struct loop_device *lo, fmode_t mode,
> +				  struct block_device *bdev, unsigned int fd,
> +				  const struct loop_info64 *info)
>  {
>  	struct file	*file;
>  	struct inode	*inode;
> @@ -1081,7 +1082,7 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
>  	__module_get(THIS_MODULE);
>  
>  	error = -EBADF;
> -	file = fget(arg);
> +	file = fget(fd);
>  	if (!file)
>  		goto out;
>  
> @@ -1090,7 +1091,7 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
>  	 * here to avoid changing device under exclusive owner.
>  	 */
>  	if (!(mode & FMODE_EXCL)) {
> -		claimed_bdev = bd_start_claiming(bdev, loop_set_fd);
> +		claimed_bdev = bd_start_claiming(bdev, loop_set_fd_and_status);
>  		if (IS_ERR(claimed_bdev)) {
>  			error = PTR_ERR(claimed_bdev);
>  			goto out_putf;
> @@ -1117,9 +1118,24 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
>  		lo_flags |= LO_FLAGS_READ_ONLY;
>  
>  	error = -EFBIG;
> -	size = get_loop_size(lo, file);
> +	if (info)
> +		size = get_size(info->lo_offset, info->lo_sizelimit,
> +				file);
> +	else
> +		size = get_loop_size(lo, file);
>  	if ((loff_t)(sector_t)size != size)
>  		goto out_unlock;
> +
> +	if (info) {
> +		error = loop_set_from_status(lo, info);
> +		if (error)
> +			goto out_unlock;
> +	} else {
> +		lo->transfer = NULL;
> +		lo->ioctl = NULL;
> +		lo->lo_sizelimit = 0;
> +		lo->lo_offset = 0;
> +	}

Just curious:  Can't we just pass in an on-stack info for the legacy
case and avoid all these conditionals?

>  out:
> @@ -1653,7 +1666,18 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode,
>  
>  	switch (cmd) {
>  	case LOOP_SET_FD:
> -		return loop_set_fd(lo, mode, bdev, arg);
> +		return loop_set_fd_and_status(lo, mode, bdev, arg, NULL);
> +	case LOOP_SET_FD_AND_STATUS: {
> +		struct loop_fd_and_status fds;
> +
> +		if (copy_from_user(&fds,
> +				   (struct loop_fd_and_status __user *) arg,
> +				   sizeof(fds)))
> +			return -EFAULT;
> +
> +		return loop_set_fd_and_status(lo, mode, bdev, fds.fd,
> +					      &fds.info);
> +	}

Can you throw in another prep patch that adds a:

	void __user *argp = (void __user *)arg;

line at the top of lo_compat_ioctl, and switches the LOOP_SET_STATUS
and LOOP_GET_STATUS case to it?

The nhere you can just do:

		if (copy_from_user(&p, argp, sizeof(fds)))
			return -EFAULT;
		return loop_set_fd_and_status(lo, mode, bdev, p.fd, &p.info);

and be done.

> +struct loop_fd_and_status {
> +	struct loop_info64	info;
> +	__u32			fd;

This should grow a

	__u32	__pad;

to avoid different struct sizes on x86-32 vs x86-64.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] loop: Add LOOP_SET_FD_AND_STATUS ioctl.
  2020-04-22  6:19   ` Christoph Hellwig
@ 2020-04-22  8:06     ` Martijn Coenen
  2020-04-22  8:07       ` Christoph Hellwig
  0 siblings, 1 reply; 17+ messages in thread
From: Martijn Coenen @ 2020-04-22  8:06 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Ming Lei, Narayan Kamath, Zimuzo Ezeozue,
	kernel-team, linux-block, LKML, Martijn Coenen, Bart Van Assche,
	Chaitanya Kulkarni

On Wed, Apr 22, 2020 at 8:19 AM Christoph Hellwig <hch@lst.de> wrote:
>
> On Mon, Apr 20, 2020 at 10:04:09AM +0200, Martijn Coenen wrote:
> > This allows userspace to completely setup a loop device with a single
> > ioctl, removing the in-between state where the device can be partially
> > configured - eg the loop device has a backing file associated with it,
> > but is reading from the wrong offset.
> >
> > Besides removing the intermediate state, another big benefit of this
> > ioctl is that LOOP_SET_STATUS can be slow; the main reason for this
> > slowness is that LOOP_SET_STATUS(64) calls blk_mq_freeze_queue() to
> > freeze the associated queue; this requires waiting for RCU
> > synchronization, which I've measured can take about 15-20ms on this
> > device on average.
> >
> > Here's setting up ~70 regular loop devices with an offset on an x86
> > Android device, using LOOP_SET_FD and LOOP_SET_STATUS:
> >
> > vsoc_x86:/system/apex # time for i in `seq 30 100`;
> > do losetup -r -o 4096 /dev/block/loop$i com.android.adbd.apex; done
> >     0m03.40s real     0m00.02s user     0m00.03s system
> >
> > Here's configuring ~70 devices in the same way, but using a modified
> > losetup that uses the new LOOP_SET_FD_AND_STATUS ioctl:
> >
> > vsoc_x86:/system/apex # time for i in `seq 30 100`;
> > do losetup -r -o 4096 /dev/block/loop$i com.android.adbd.apex; done
> >     0m01.94s real     0m00.01s user     0m00.01s system
> >
> > Signed-off-by: Martijn Coenen <maco@android.com>
> > ---
> >  drivers/block/loop.c      | 47 ++++++++++++++++++++++++++++++---------
> >  include/uapi/linux/loop.h |  6 +++++
> >  2 files changed, 42 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> > index 6e656390b285..e1dbd70d6d6e 100644
> > --- a/drivers/block/loop.c
> > +++ b/drivers/block/loop.c
> > @@ -1065,8 +1065,9 @@ loop_set_from_status(struct loop_device *lo, const struct loop_info64 *info)
> >       return 0;
> >  }
> >
> > -static int loop_set_fd(struct loop_device *lo, fmode_t mode,
> > -                    struct block_device *bdev, unsigned int arg)
> > +static int loop_set_fd_and_status(struct loop_device *lo, fmode_t mode,
> > +                               struct block_device *bdev, unsigned int fd,
> > +                               const struct loop_info64 *info)
> >  {
> >       struct file     *file;
> >       struct inode    *inode;
> > @@ -1081,7 +1082,7 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
> >       __module_get(THIS_MODULE);
> >
> >       error = -EBADF;
> > -     file = fget(arg);
> > +     file = fget(fd);
> >       if (!file)
> >               goto out;
> >
> > @@ -1090,7 +1091,7 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
> >        * here to avoid changing device under exclusive owner.
> >        */
> >       if (!(mode & FMODE_EXCL)) {
> > -             claimed_bdev = bd_start_claiming(bdev, loop_set_fd);
> > +             claimed_bdev = bd_start_claiming(bdev, loop_set_fd_and_status);
> >               if (IS_ERR(claimed_bdev)) {
> >                       error = PTR_ERR(claimed_bdev);
> >                       goto out_putf;
> > @@ -1117,9 +1118,24 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
> >               lo_flags |= LO_FLAGS_READ_ONLY;
> >
> >       error = -EFBIG;
> > -     size = get_loop_size(lo, file);
> > +     if (info)
> > +             size = get_size(info->lo_offset, info->lo_sizelimit,
> > +                             file);
> > +     else
> > +             size = get_loop_size(lo, file);
> >       if ((loff_t)(sector_t)size != size)
> >               goto out_unlock;
> > +
> > +     if (info) {
> > +             error = loop_set_from_status(lo, info);
> > +             if (error)
> > +                     goto out_unlock;
> > +     } else {
> > +             lo->transfer = NULL;
> > +             lo->ioctl = NULL;
> > +             lo->lo_sizelimit = 0;
> > +             lo->lo_offset = 0;
> > +     }
>
> Just curious:  Can't we just pass in an on-stack info for the legacy
> case and avoid all these conditionals?

Yeah, that is actually much nicer. I will rework it to that.

> Can you throw in another prep patch that adds a:
>
>         void __user *argp = (void __user *)arg;
>
> line at the top of lo_compat_ioctl, and switches the LOOP_SET_STATUS
> and LOOP_GET_STATUS case to it?

Did you mean in regular lo_ioctl()? eg something like this:

@@ -1671,6 +1671,7 @@ static int lo_ioctl(struct block_device *bdev,
fmode_t mode,
        unsigned int cmd, unsigned long arg)
 {
        struct loop_device *lo = bdev->bd_disk->private_data;
+       void __user *argp = (void __user *) arg;
        int err;

        switch (cmd) {
@@ -1694,21 +1695,19 @@ static int lo_ioctl(struct block_device *bdev,
fmode_t mode,
        case LOOP_SET_STATUS:
                err = -EPERM;
                if ((mode & FMODE_WRITE) || capable(CAP_SYS_ADMIN)) {
-                       err = loop_set_status_old(lo,
-                                       (struct loop_info __user *)arg);
+                       err = loop_set_status_old(lo, argp);
                }
                break;
        case LOOP_GET_STATUS:
-               return loop_get_status_old(lo, (struct loop_info __user *) arg);
+               return loop_get_status_old(lo, argp);
        case LOOP_SET_STATUS64:
                err = -EPERM;
                if ((mode & FMODE_WRITE) || capable(CAP_SYS_ADMIN)) {
-                       err = loop_set_status64(lo,
-                                       (struct loop_info64 __user *) arg);
+                       err = loop_set_status64(lo, argp);
                }
                break;
        case LOOP_GET_STATUS64:
-               return loop_get_status64(lo, (struct loop_info64 __user *) arg);
+               return loop_get_status64(lo, argp);


> > +struct loop_fd_and_status {
> > +     struct loop_info64      info;
> > +     __u32                   fd;
>
> This should grow a
>
>         __u32   __pad;
>
> to avoid different struct sizes on x86-32 vs x86-64.

will do, thanks!

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] loop: Add LOOP_SET_FD_AND_STATUS ioctl.
  2020-04-22  8:06     ` Martijn Coenen
@ 2020-04-22  8:07       ` Christoph Hellwig
  0 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2020-04-22  8:07 UTC (permalink / raw)
  To: Martijn Coenen
  Cc: Christoph Hellwig, Jens Axboe, Ming Lei, Narayan Kamath,
	Zimuzo Ezeozue, kernel-team, linux-block, LKML, Martijn Coenen,
	Bart Van Assche, Chaitanya Kulkarni

On Wed, Apr 22, 2020 at 10:06:17AM +0200, Martijn Coenen wrote:
> > line at the top of lo_compat_ioctl, and switches the LOOP_SET_STATUS
> > and LOOP_GET_STATUS case to it?
> 
> Did you mean in regular lo_ioctl()?

Yes, sorry.

> eg something like this:
> 
> @@ -1671,6 +1671,7 @@ static int lo_ioctl(struct block_device *bdev,
> fmode_t mode,
>         unsigned int cmd, unsigned long arg)
>  {
>         struct loop_device *lo = bdev->bd_disk->private_data;
> +       void __user *argp = (void __user *) arg;
>         int err;
> 
>         switch (cmd) {
> @@ -1694,21 +1695,19 @@ static int lo_ioctl(struct block_device *bdev,
> fmode_t mode,
>         case LOOP_SET_STATUS:
>                 err = -EPERM;
>                 if ((mode & FMODE_WRITE) || capable(CAP_SYS_ADMIN)) {
> -                       err = loop_set_status_old(lo,
> -                                       (struct loop_info __user *)arg);
> +                       err = loop_set_status_old(lo, argp);
>                 }
>                 break;
>         case LOOP_GET_STATUS:
> -               return loop_get_status_old(lo, (struct loop_info __user *) arg);
> +               return loop_get_status_old(lo, argp);
>         case LOOP_SET_STATUS64:
>                 err = -EPERM;
>                 if ((mode & FMODE_WRITE) || capable(CAP_SYS_ADMIN)) {
> -                       err = loop_set_status64(lo,
> -                                       (struct loop_info64 __user *) arg);
> +                       err = loop_set_status64(lo, argp);
>                 }
>                 break;
>         case LOOP_GET_STATUS64:
> -               return loop_get_status64(lo, (struct loop_info64 __user *) arg);
> +               return loop_get_status64(lo, argp);

Exactly!

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] loop: Add LOOP_SET_FD_AND_STATUS ioctl.
  2020-04-20  8:04 ` [PATCH 4/4] loop: Add LOOP_SET_FD_AND_STATUS ioctl Martijn Coenen
  2020-04-22  6:19   ` Christoph Hellwig
@ 2020-04-22 15:10   ` Bart Van Assche
  2020-04-27  7:42     ` Martijn Coenen
  1 sibling, 1 reply; 17+ messages in thread
From: Bart Van Assche @ 2020-04-22 15:10 UTC (permalink / raw)
  To: Martijn Coenen, axboe, hch, ming.lei
  Cc: narayan, zezeozue, kernel-team, linux-block, linux-kernel, maco,
	Chaitanya.Kulkarni

On 4/20/20 1:04 AM, Martijn Coenen wrote:
> diff --git a/include/uapi/linux/loop.h b/include/uapi/linux/loop.h
> index 080a8df134ef..fcc9a693b588 100644
> --- a/include/uapi/linux/loop.h
> +++ b/include/uapi/linux/loop.h
> @@ -60,6 +60,11 @@ struct loop_info64 {
>   	__u64		   lo_init[2];
>   };
>   
> +struct loop_fd_and_status {
> +	struct loop_info64	info;
> +	__u32			fd;
> +};
> +
>   /*
>    * Loop filter types
>    */
> @@ -90,6 +95,7 @@ struct loop_info64 {
>   #define LOOP_SET_CAPACITY	0x4C07
>   #define LOOP_SET_DIRECT_IO	0x4C08
>   #define LOOP_SET_BLOCK_SIZE	0x4C09
> +#define LOOP_SET_FD_AND_STATUS	0x4C0A
>   
>   /* /dev/loop-control interface */
>   #define LOOP_CTL_ADD		0x4C80

Should linux-api be Cc'ed for this patch or the entire patch series? 
 From https://www.kernel.org/doc/man-pages/linux-api-ml.html: "Among 
other things, a primary goal of the list is to help answer the question: 
How do we even know when an interface has been added or changed?".

Thanks,

Bart.



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/4] loop: Add LOOP_SET_FD_AND_STATUS ioctl.
  2020-04-22 15:10   ` Bart Van Assche
@ 2020-04-27  7:42     ` Martijn Coenen
  0 siblings, 0 replies; 17+ messages in thread
From: Martijn Coenen @ 2020-04-27  7:42 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, Christoph Hellwig, Ming Lei, Narayan Kamath,
	Zimuzo Ezeozue, kernel-team, linux-block, LKML, Martijn Coenen,
	Chaitanya Kulkarni

On Wed, Apr 22, 2020 at 5:10 PM Bart Van Assche <bvanassche@acm.org> wrote:
> Should linux-api be Cc'ed for this patch or the entire patch series?
>  From https://www.kernel.org/doc/man-pages/linux-api-ml.html: "Among
> other things, a primary goal of the list is to help answer the question:
> How do we even know when an interface has been added or changed?".

Thanks for the suggestion, will add linux-api in the next series.

Martijn

>
> Thanks,
>
> Bart.
>
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2020-04-27  7:42 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-20  8:04 [PATCH 0/4] Add a new LOOP_SET_FD_AND_STATUS ioctl Martijn Coenen
2020-04-20  8:04 ` [PATCH 1/4] loop: Refactor size calculation Martijn Coenen
2020-04-20 13:22   ` Bart Van Assche
2020-04-21 11:48     ` Martijn Coenen
2020-04-21 15:26       ` Bart Van Assche
2020-04-20  8:04 ` [PATCH 2/4] loop: Factor out configuring loop from status Martijn Coenen
2020-04-20 13:34   ` Bart Van Assche
2020-04-20 13:49   ` Bart Van Assche
2020-04-21 11:46     ` Martijn Coenen
2020-04-20  8:04 ` [PATCH 3/4] loop: Move loop_set_from_status() and friends up Martijn Coenen
2020-04-20 13:43   ` Bart Van Assche
2020-04-20  8:04 ` [PATCH 4/4] loop: Add LOOP_SET_FD_AND_STATUS ioctl Martijn Coenen
2020-04-22  6:19   ` Christoph Hellwig
2020-04-22  8:06     ` Martijn Coenen
2020-04-22  8:07       ` Christoph Hellwig
2020-04-22 15:10   ` Bart Van Assche
2020-04-27  7:42     ` Martijn Coenen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).