linux-kernel.vger.kernel.org archive mirror
* [PATCH v7 0/2] loop: Better discard for block devices
@ 2019-11-14 23:50 Evan Green
  2019-11-14 23:50 ` [PATCH v7 1/2] loop: Report EOPNOTSUPP properly Evan Green
  2019-11-14 23:50 ` [PATCH v7 2/2] loop: Better discard support for block devices Evan Green
  0 siblings, 2 replies; 13+ messages in thread
From: Evan Green @ 2019-11-14 23:50 UTC
  To: Jens Axboe, Martin K Petersen
  Cc: Gwendal Grignou, Christoph Hellwig, Ming Lei, Darrick J . Wong,
	Alexis Savery, Douglas Anderson, Bart Van Assche, Evan Green,
	linux-block, linux-kernel

This series addresses some errors seen when using the loop
device directly backed by a block device. The first change plumbs
out the correct error message, and the second change prevents the
error from occurring in many cases.

The errors look like this:
[   90.880875] print_req_error: I/O error, dev loop5, sector 0

The errors occur when attempting a discard or write zeroes operation on
a loop device backed by a block device that does not support write
zeroes. First, the error is reported as an I/O error when it is actually
EOPNOTSUPP. The first patch plumbs out EOPNOTSUPP so that the error is
reported correctly.
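
As a rough illustration of what the first patch changes, here is a
minimal userspace sketch (not kernel code). The enum model_status and
the functions model_cmd_ret() and model_status_of() are hypothetical
names made up for this example; in the driver the corresponding pieces
are the cmd->ret assignment in loop_handle_cmd() and the
errno_to_blk_status() call in lo_complete_rq().

```c
#include <errno.h>

/* Hypothetical stand-ins for the block layer status codes. */
enum model_status { MODEL_STS_OK, MODEL_STS_NOTSUPP, MODEL_STS_IOERR };

/*
 * Stage 1, modeled on the patched loop_handle_cmd(): keep -EOPNOTSUPP
 * as it is and fold every other failure into -EIO.
 */
static int model_cmd_ret(int ret)
{
	if (ret == -EOPNOTSUPP)
		return ret;
	return ret ? -EIO : 0;
}

/*
 * Stage 2, a stand-in for errno_to_blk_status(), covering only the
 * values that stage 1 can produce.
 */
static enum model_status model_status_of(int cmd_ret)
{
	if (cmd_ret == 0)
		return MODEL_STS_OK;
	if (cmd_ret == -EOPNOTSUPP)
		return MODEL_STS_NOTSUPP;
	return MODEL_STS_IOERR;
}
```

Before the patch, stage 1 turned every failure into -EIO and stage 2
always reported an I/O error, so an unsupported operation could not be
told apart from a real I/O failure.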

The second patch prevents these errors from occurring by mirroring the
zeroing capabilities of the underlying block device into the loop device.
Before this change, discard was always reported as supported, and the
loop device simply turned around and did an fallocate operation on the
backing device. After this change, backing block devices that do support
zeroing continue to work as before and keep all the benefits of doing
so. Backing devices that do not support zeroing fail earlier, without
reaching the loop device at all, which ultimately keeps this error out
of the logs.
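
The decision the second patch makes can be sketched in plain C as
follows. This is a simplified userspace model, not the driver code:
struct model_backing, struct model_limits, and model_config_discard()
are hypothetical names, and only the two sector limits and the discard
flag are modeled.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical, simplified view of the loop device's backing store. */
struct model_backing {
	bool is_block_device;          /* backing inode is a block device */
	bool has_fallocate;            /* backing file implements fallocate */
	unsigned int encrypt_key_size; /* non-zero when encryption is on */
	unsigned int max_write_zeroes; /* backing queue limit, in sectors */
};

/* The limits the loop device ends up advertising in this model. */
struct model_limits {
	unsigned int max_discard_sectors;
	unsigned int max_write_zeroes_sectors;
	bool discard_enabled;
};

static struct model_limits model_config_discard(const struct model_backing *b)
{
	struct model_limits l = { 0, 0, false };

	if (b->is_block_device && !b->encrypt_key_size) {
		/* Mirror the backing device's zeroing limit into both fields. */
		l.max_discard_sectors = b->max_write_zeroes;
		l.max_write_zeroes_sectors = b->max_write_zeroes;
	} else if (!b->has_fallocate || b->encrypt_key_size) {
		/* No fallocate, or encryption enabled: advertise nothing. */
	} else {
		/* Regular file with fallocate: same limits as before. */
		l.max_discard_sectors = UINT_MAX >> 9;
		l.max_write_zeroes_sectors = UINT_MAX >> 9;
	}

	/* Discard is advertised only when zeroing is possible. */
	l.discard_enabled = l.max_write_zeroes_sectors != 0;
	return l;
}
```

With a backing block device that reports zero write-zeroes sectors, the
model leaves discard disabled, which is the case that used to end in
the I/O error shown above.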

I can also confirm that this fixes test block/003 in the blktests, when
running blktests on a loop device backed by a block device.

Changes in v7:
- Use errno_to_blk_status() (Christoph)
- Rebase on top of Darrick's patch
- Tweak opening line of commit description (Darrick)

Changes in v6:
- Updated tags

Changes in v5:
- Don't mirror discard if lo_encrypt_key_size is non-zero (Gwendal)

Changes in v4:
- Mirror blkdev's write_zeroes into loopdev's discard_sectors.

Changes in v3:
- Updated tags
- Updated commit description

Changes in v2:
- Unnested error if statement (Bart)

Evan Green (2):
  loop: Report EOPNOTSUPP properly
  loop: Better discard support for block devices

 drivers/block/loop.c | 47 ++++++++++++++++++++++++++++++++------------
 1 file changed, 34 insertions(+), 13 deletions(-)

-- 
2.21.0



* [PATCH v7 1/2] loop: Report EOPNOTSUPP properly
  2019-11-14 23:50 [PATCH v7 0/2] loop: Better discard for block devices Evan Green
@ 2019-11-14 23:50 ` Evan Green
  2019-12-02 17:05   ` Gwendal Grignou
  2019-12-03  0:48   ` Bart Van Assche
  2019-11-14 23:50 ` [PATCH v7 2/2] loop: Better discard support for block devices Evan Green
  1 sibling, 2 replies; 13+ messages in thread
From: Evan Green @ 2019-11-14 23:50 UTC
  To: Jens Axboe, Martin K Petersen
  Cc: Gwendal Grignou, Christoph Hellwig, Ming Lei, Darrick J . Wong,
	Alexis Savery, Douglas Anderson, Bart Van Assche, Evan Green,
	linux-block, linux-kernel

Properly plumb out EOPNOTSUPP from loop driver operations, which may be
returned when, for instance, a discard operation is attempted but not
supported by the underlying block device. Before this change, every
failure was reported in the log as an I/O error, which is scary and not
helpful when debugging.

Signed-off-by: Evan Green <evgreen@chromium.org>
---

Changes in v7:
- Use errno_to_blk_status() (Christoph)

Changes in v6:
- Updated tags

Changes in v5: None
Changes in v4: None
Changes in v3:
- Updated tags

Changes in v2:
- Unnested error if statement (Bart)

 drivers/block/loop.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index ef6e251857c8..6a9fe1f9fe84 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -461,7 +461,7 @@ static void lo_complete_rq(struct request *rq)
 	if (!cmd->use_aio || cmd->ret < 0 || cmd->ret == blk_rq_bytes(rq) ||
 	    req_op(rq) != REQ_OP_READ) {
 		if (cmd->ret < 0)
-			ret = BLK_STS_IOERR;
+			ret = errno_to_blk_status(cmd->ret);
 		goto end_io;
 	}
 
@@ -1950,7 +1950,10 @@ static void loop_handle_cmd(struct loop_cmd *cmd)
  failed:
 	/* complete non-aio request */
 	if (!cmd->use_aio || ret) {
-		cmd->ret = ret ? -EIO : 0;
+		if (ret == -EOPNOTSUPP)
+			cmd->ret = ret;
+		else
+			cmd->ret = ret ? -EIO : 0;
 		blk_mq_complete_request(rq);
 	}
 }
-- 
2.21.0



* [PATCH v7 2/2] loop: Better discard support for block devices
  2019-11-14 23:50 [PATCH v7 0/2] loop: Better discard for block devices Evan Green
  2019-11-14 23:50 ` [PATCH v7 1/2] loop: Report EOPNOTSUPP properly Evan Green
@ 2019-11-14 23:50 ` Evan Green
  2019-11-20  2:25   ` Darrick J. Wong
  1 sibling, 1 reply; 13+ messages in thread
From: Evan Green @ 2019-11-14 23:50 UTC
  To: Jens Axboe, Martin K Petersen
  Cc: Gwendal Grignou, Christoph Hellwig, Ming Lei, Darrick J . Wong,
	Alexis Savery, Douglas Anderson, Bart Van Assche, Evan Green,
	Chaitanya Kulkarni, linux-block, linux-kernel

If the backing device for a loop device is itself a block device,
then mirror the "write zeroes" capabilities of the underlying
block device into the loop device. Copy this capability into both
max_write_zeroes_sectors and max_discard_sectors of the loop device.

The reason for this is that REQ_OP_DISCARD on a loop device translates
into blkdev_issue_zeroout(), rather than blkdev_issue_discard(). This
presents a consistent interface for loop devices (that discarded data
is zeroed), regardless of the backing device type of the loop device.
There should be no behavior change for loop devices backed by regular
files.

This change fixes blktest block/003, and removes an extraneous
error print in block/013 when testing on a loop device backed
by a block device that does not support discard.

Signed-off-by: Evan Green <evgreen@chromium.org>
Reviewed-by: Gwendal Grignou <gwendal@chromium.org>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
---

Changes in v7:
- Rebase on top of Darrick's patch
- Tweak opening line of commit description (Darrick)

Changes in v6: None
Changes in v5:
- Don't mirror discard if lo_encrypt_key_size is non-zero (Gwendal)

Changes in v4:
- Mirror blkdev's write_zeroes into loopdev's discard_sectors.

Changes in v3:
- Updated commit description

Changes in v2: None

 drivers/block/loop.c | 40 +++++++++++++++++++++++++++++-----------
 1 file changed, 29 insertions(+), 11 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 6a9fe1f9fe84..e8f23e4b78f7 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -427,11 +427,12 @@ static int lo_fallocate(struct loop_device *lo, struct request *rq, loff_t pos,
 	 * information.
 	 */
 	struct file *file = lo->lo_backing_file;
+	struct request_queue *q = lo->lo_queue;
 	int ret;
 
 	mode |= FALLOC_FL_KEEP_SIZE;
 
-	if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) {
+	if (!blk_queue_discard(q)) {
 		ret = -EOPNOTSUPP;
 		goto out;
 	}
@@ -862,6 +863,21 @@ static void loop_config_discard(struct loop_device *lo)
 	struct file *file = lo->lo_backing_file;
 	struct inode *inode = file->f_mapping->host;
 	struct request_queue *q = lo->lo_queue;
+	struct request_queue *backingq;
+
+	/*
+	 * If the backing device is a block device, mirror its zeroing
+	 * capability. REQ_OP_DISCARD translates to a zero-out even when backed
+	 * by block devices to keep consistent behavior with file-backed loop
+	 * devices.
+	 */
+	if (S_ISBLK(inode->i_mode) && !lo->lo_encrypt_key_size) {
+		backingq = bdev_get_queue(inode->i_bdev);
+		blk_queue_max_discard_sectors(q,
+			backingq->limits.max_write_zeroes_sectors);
+
+		blk_queue_max_write_zeroes_sectors(q,
+			backingq->limits.max_write_zeroes_sectors);
 
 	/*
 	 * We use punch hole to reclaim the free space used by the
@@ -869,22 +885,24 @@ static void loop_config_discard(struct loop_device *lo)
 	 * encryption is enabled, because it may give an attacker
 	 * useful information.
 	 */
-	if ((!file->f_op->fallocate) ||
-	    lo->lo_encrypt_key_size) {
+	} else if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) {
 		q->limits.discard_granularity = 0;
 		q->limits.discard_alignment = 0;
 		blk_queue_max_discard_sectors(q, 0);
 		blk_queue_max_write_zeroes_sectors(q, 0);
-		blk_queue_flag_clear(QUEUE_FLAG_DISCARD, q);
-		return;
-	}
 
-	q->limits.discard_granularity = inode->i_sb->s_blocksize;
-	q->limits.discard_alignment = 0;
+	} else {
+		q->limits.discard_granularity = inode->i_sb->s_blocksize;
+		q->limits.discard_alignment = 0;
 
-	blk_queue_max_discard_sectors(q, UINT_MAX >> 9);
-	blk_queue_max_write_zeroes_sectors(q, UINT_MAX >> 9);
-	blk_queue_flag_set(QUEUE_FLAG_DISCARD, q);
+		blk_queue_max_discard_sectors(q, UINT_MAX >> 9);
+		blk_queue_max_write_zeroes_sectors(q, UINT_MAX >> 9);
+	}
+
+	if (q->limits.max_write_zeroes_sectors)
+		blk_queue_flag_set(QUEUE_FLAG_DISCARD, q);
+	else
+		blk_queue_flag_clear(QUEUE_FLAG_DISCARD, q);
 }
 
 static void loop_unprepare_queue(struct loop_device *lo)
-- 
2.21.0



* Re: [PATCH v7 2/2] loop: Better discard support for block devices
  2019-11-14 23:50 ` [PATCH v7 2/2] loop: Better discard support for block devices Evan Green
@ 2019-11-20  2:25   ` Darrick J. Wong
  2019-11-20 18:56     ` Evan Green
  0 siblings, 1 reply; 13+ messages in thread
From: Darrick J. Wong @ 2019-11-20  2:25 UTC
  To: Evan Green
  Cc: Jens Axboe, Martin K Petersen, Gwendal Grignou,
	Christoph Hellwig, Ming Lei, Alexis Savery, Douglas Anderson,
	Bart Van Assche, Chaitanya Kulkarni, linux-block, linux-kernel

On Thu, Nov 14, 2019 at 03:50:08PM -0800, Evan Green wrote:
> If the backing device for a loop device is itself a block device,
> then mirror the "write zeroes" capabilities of the underlying
> block device into the loop device. Copy this capability into both
> max_write_zeroes_sectors and max_discard_sectors of the loop device.
> 
> The reason for this is that REQ_OP_DISCARD on a loop device translates
> into blkdev_issue_zeroout(), rather than blkdev_issue_discard(). This
> presents a consistent interface for loop devices (that discarded data
> is zeroed), regardless of the backing device type of the loop device.
> There should be no behavior change for loop devices backed by regular
> files.
> 
> This change fixes blktest block/003, and removes an extraneous
> error print in block/013 when testing on a loop device backed
> by a block device that does not support discard.
> 
> Signed-off-by: Evan Green <evgreen@chromium.org>
> Reviewed-by: Gwendal Grignou <gwendal@chromium.org>
> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
> ---
> 
> Changes in v7:
> - Rebase on top of Darrick's patch
> - Tweak opening line of commit description (Darrick)
> 
> Changes in v6: None
> Changes in v5:
> - Don't mirror discard if lo_encrypt_key_size is non-zero (Gwendal)
> 
> Changes in v4:
> - Mirror blkdev's write_zeroes into loopdev's discard_sectors.
> 
> Changes in v3:
> - Updated commit description
> 
> Changes in v2: None
> 
>  drivers/block/loop.c | 40 +++++++++++++++++++++++++++++-----------
>  1 file changed, 29 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index 6a9fe1f9fe84..e8f23e4b78f7 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -427,11 +427,12 @@ static int lo_fallocate(struct loop_device *lo, struct request *rq, loff_t pos,
>  	 * information.
>  	 */
>  	struct file *file = lo->lo_backing_file;
> +	struct request_queue *q = lo->lo_queue;
>  	int ret;
>  
>  	mode |= FALLOC_FL_KEEP_SIZE;
>  
> -	if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) {
> +	if (!blk_queue_discard(q)) {
>  		ret = -EOPNOTSUPP;
>  		goto out;
>  	}
> @@ -862,6 +863,21 @@ static void loop_config_discard(struct loop_device *lo)
>  	struct file *file = lo->lo_backing_file;
>  	struct inode *inode = file->f_mapping->host;
>  	struct request_queue *q = lo->lo_queue;
> +	struct request_queue *backingq;
> +
> +	/*
> +	 * If the backing device is a block device, mirror its zeroing
> +	 * capability. REQ_OP_DISCARD translates to a zero-out even when backed
> +	 * by block devices to keep consistent behavior with file-backed loop
> +	 * devices.
> +	 */
> +	if (S_ISBLK(inode->i_mode) && !lo->lo_encrypt_key_size) {
> +		backingq = bdev_get_queue(inode->i_bdev);
> +		blk_queue_max_discard_sectors(q,
> +			backingq->limits.max_write_zeroes_sectors);

max_discard_sectors?

--D

> +
> +		blk_queue_max_write_zeroes_sectors(q,
> +			backingq->limits.max_write_zeroes_sectors);
>  
>  	/*
>  	 * We use punch hole to reclaim the free space used by the
> @@ -869,22 +885,24 @@ static void loop_config_discard(struct loop_device *lo)
>  	 * encryption is enabled, because it may give an attacker
>  	 * useful information.
>  	 */
> -	if ((!file->f_op->fallocate) ||
> -	    lo->lo_encrypt_key_size) {
> +	} else if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) {
>  		q->limits.discard_granularity = 0;
>  		q->limits.discard_alignment = 0;
>  		blk_queue_max_discard_sectors(q, 0);
>  		blk_queue_max_write_zeroes_sectors(q, 0);
> -		blk_queue_flag_clear(QUEUE_FLAG_DISCARD, q);
> -		return;
> -	}
>  
> -	q->limits.discard_granularity = inode->i_sb->s_blocksize;
> -	q->limits.discard_alignment = 0;
> +	} else {
> +		q->limits.discard_granularity = inode->i_sb->s_blocksize;
> +		q->limits.discard_alignment = 0;
>  
> -	blk_queue_max_discard_sectors(q, UINT_MAX >> 9);
> -	blk_queue_max_write_zeroes_sectors(q, UINT_MAX >> 9);
> -	blk_queue_flag_set(QUEUE_FLAG_DISCARD, q);
> +		blk_queue_max_discard_sectors(q, UINT_MAX >> 9);
> +		blk_queue_max_write_zeroes_sectors(q, UINT_MAX >> 9);
> +	}
> +
> +	if (q->limits.max_write_zeroes_sectors)
> +		blk_queue_flag_set(QUEUE_FLAG_DISCARD, q);
> +	else
> +		blk_queue_flag_clear(QUEUE_FLAG_DISCARD, q);
>  }
>  
>  static void loop_unprepare_queue(struct loop_device *lo)
> -- 
> 2.21.0
> 


* Re: [PATCH v7 2/2] loop: Better discard support for block devices
  2019-11-20  2:25   ` Darrick J. Wong
@ 2019-11-20 18:56     ` Evan Green
  2019-11-20 19:13       ` Darrick J. Wong
  0 siblings, 1 reply; 13+ messages in thread
From: Evan Green @ 2019-11-20 18:56 UTC
  To: Darrick J. Wong
  Cc: Jens Axboe, Martin K Petersen, Gwendal Grignou,
	Christoph Hellwig, Ming Lei, Alexis Savery, Douglas Anderson,
	Bart Van Assche, Chaitanya Kulkarni, linux-block, LKML

On Tue, Nov 19, 2019 at 6:25 PM Darrick J. Wong <darrick.wong@oracle.com> wrote:
>
> On Thu, Nov 14, 2019 at 03:50:08PM -0800, Evan Green wrote:
> > If the backing device for a loop device is itself a block device,
> > then mirror the "write zeroes" capabilities of the underlying
> > block device into the loop device. Copy this capability into both
> > max_write_zeroes_sectors and max_discard_sectors of the loop device.
> >
> > The reason for this is that REQ_OP_DISCARD on a loop device translates
> > into blkdev_issue_zeroout(), rather than blkdev_issue_discard(). This
> > presents a consistent interface for loop devices (that discarded data
> > is zeroed), regardless of the backing device type of the loop device.
> > There should be no behavior change for loop devices backed by regular
> > files.
> >
> > This change fixes blktest block/003, and removes an extraneous
> > error print in block/013 when testing on a loop device backed
> > by a block device that does not support discard.
> >
> > Signed-off-by: Evan Green <evgreen@chromium.org>
> > Reviewed-by: Gwendal Grignou <gwendal@chromium.org>
> > Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
> > ---
> >
> > Changes in v7:
> > - Rebase on top of Darrick's patch
> > - Tweak opening line of commit description (Darrick)
> >
> > Changes in v6: None
> > Changes in v5:
> > - Don't mirror discard if lo_encrypt_key_size is non-zero (Gwendal)
> >
> > Changes in v4:
> > - Mirror blkdev's write_zeroes into loopdev's discard_sectors.
> >
> > Changes in v3:
> > - Updated commit description
> >
> > Changes in v2: None
> >
> >  drivers/block/loop.c | 40 +++++++++++++++++++++++++++++-----------
> >  1 file changed, 29 insertions(+), 11 deletions(-)
> >
> > diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> > index 6a9fe1f9fe84..e8f23e4b78f7 100644
> > --- a/drivers/block/loop.c
> > +++ b/drivers/block/loop.c
> > @@ -427,11 +427,12 @@ static int lo_fallocate(struct loop_device *lo, struct request *rq, loff_t pos,
> >        * information.
> >        */
> >       struct file *file = lo->lo_backing_file;
> > +     struct request_queue *q = lo->lo_queue;
> >       int ret;
> >
> >       mode |= FALLOC_FL_KEEP_SIZE;
> >
> > -     if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) {
> > +     if (!blk_queue_discard(q)) {
> >               ret = -EOPNOTSUPP;
> >               goto out;
> >       }
> > @@ -862,6 +863,21 @@ static void loop_config_discard(struct loop_device *lo)
> >       struct file *file = lo->lo_backing_file;
> >       struct inode *inode = file->f_mapping->host;
> >       struct request_queue *q = lo->lo_queue;
> > +     struct request_queue *backingq;
> > +
> > +     /*
> > +      * If the backing device is a block device, mirror its zeroing
> > +      * capability. REQ_OP_DISCARD translates to a zero-out even when backed
> > +      * by block devices to keep consistent behavior with file-backed loop
> > +      * devices.
> > +      */
> > +     if (S_ISBLK(inode->i_mode) && !lo->lo_encrypt_key_size) {
> > +             backingq = bdev_get_queue(inode->i_bdev);
> > +             blk_queue_max_discard_sectors(q,
> > +                     backingq->limits.max_write_zeroes_sectors);
>
> max_discard_sectors?

I didn't plumb max_discard_sectors because in my scenario a discard
never reaches the block device that way.

The loop device uses either FALLOC_FL_ZERO_RANGE or
FALLOC_FL_PUNCH_HOLE. When backed by a block device, that ends up in
blkdev_fallocate(), which always translates both of those into
blkdev_issue_zeroout(), not blkdev_issue_discard(). So it's really the
zeroing capabilities of the block device that matter, even for loop
discard operations. It seems weird, but I think this is the right thing
because it presents a consistent interface to loop device users, whether
the device is backed by a filesystem file or directly by a block device.
That is, a previously discarded range will read back as zeroes.
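
To make that mapping concrete, here is a small userspace model of it. It
only encodes what is described above: both fallocate modes the loop
device uses end up as a zero-out on a block device. The enum model_op
and model_loop_fallocate_op() are hypothetical names for this sketch,
and any other mode is simply reported as "not modeled" rather than
guessed at.

```c
#include <linux/falloc.h>

/* Hypothetical labels for what the backing block device is asked to do. */
enum model_op { MODEL_OP_ZEROOUT, MODEL_OP_NOT_MODELED };

static enum model_op model_loop_fallocate_op(int mode)
{
	/* lo_fallocate() always adds FALLOC_FL_KEEP_SIZE to the mode. */
	mode |= FALLOC_FL_KEEP_SIZE;

	switch (mode) {
	case FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE:	/* write zeroes */
	case FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE:	/* discard */
		return MODEL_OP_ZEROOUT;
	default:
		return MODEL_OP_NOT_MODELED;
	}
}
```

Both loop operations land on the same label, which is why the zeroing
limit is the one that gets mirrored.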

-Evan


* Re: [PATCH v7 2/2] loop: Better discard support for block devices
  2019-11-20 18:56     ` Evan Green
@ 2019-11-20 19:13       ` Darrick J. Wong
  2019-11-20 19:25         ` Evan Green
  0 siblings, 1 reply; 13+ messages in thread
From: Darrick J. Wong @ 2019-11-20 19:13 UTC
  To: Evan Green
  Cc: Jens Axboe, Martin K Petersen, Gwendal Grignou,
	Christoph Hellwig, Ming Lei, Alexis Savery, Douglas Anderson,
	Bart Van Assche, Chaitanya Kulkarni, linux-block, LKML

On Wed, Nov 20, 2019 at 10:56:30AM -0800, Evan Green wrote:
> On Tue, Nov 19, 2019 at 6:25 PM Darrick J. Wong <darrick.wong@oracle.com> wrote:
> >
> > On Thu, Nov 14, 2019 at 03:50:08PM -0800, Evan Green wrote:
> > > If the backing device for a loop device is itself a block device,
> > > then mirror the "write zeroes" capabilities of the underlying
> > > block device into the loop device. Copy this capability into both
> > > max_write_zeroes_sectors and max_discard_sectors of the loop device.
> > >
> > > The reason for this is that REQ_OP_DISCARD on a loop device translates
> > > into blkdev_issue_zeroout(), rather than blkdev_issue_discard(). This
> > > presents a consistent interface for loop devices (that discarded data
> > > is zeroed), regardless of the backing device type of the loop device.
> > > There should be no behavior change for loop devices backed by regular
> > > files.
> > >
> > > This change fixes blktest block/003, and removes an extraneous
> > > error print in block/013 when testing on a loop device backed
> > > by a block device that does not support discard.
> > >
> > > Signed-off-by: Evan Green <evgreen@chromium.org>
> > > Reviewed-by: Gwendal Grignou <gwendal@chromium.org>
> > > Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
> > > ---
> > >
> > > Changes in v7:
> > > - Rebase on top of Darrick's patch
> > > - Tweak opening line of commit description (Darrick)
> > >
> > > Changes in v6: None
> > > Changes in v5:
> > > - Don't mirror discard if lo_encrypt_key_size is non-zero (Gwendal)
> > >
> > > Changes in v4:
> > > - Mirror blkdev's write_zeroes into loopdev's discard_sectors.
> > >
> > > Changes in v3:
> > > - Updated commit description
> > >
> > > Changes in v2: None
> > >
> > >  drivers/block/loop.c | 40 +++++++++++++++++++++++++++++-----------
> > >  1 file changed, 29 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> > > index 6a9fe1f9fe84..e8f23e4b78f7 100644
> > > --- a/drivers/block/loop.c
> > > +++ b/drivers/block/loop.c
> > > @@ -427,11 +427,12 @@ static int lo_fallocate(struct loop_device *lo, struct request *rq, loff_t pos,
> > >        * information.
> > >        */
> > >       struct file *file = lo->lo_backing_file;
> > > +     struct request_queue *q = lo->lo_queue;
> > >       int ret;
> > >
> > >       mode |= FALLOC_FL_KEEP_SIZE;
> > >
> > > -     if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) {
> > > +     if (!blk_queue_discard(q)) {
> > >               ret = -EOPNOTSUPP;
> > >               goto out;
> > >       }
> > > @@ -862,6 +863,21 @@ static void loop_config_discard(struct loop_device *lo)
> > >       struct file *file = lo->lo_backing_file;
> > >       struct inode *inode = file->f_mapping->host;
> > >       struct request_queue *q = lo->lo_queue;
> > > +     struct request_queue *backingq;
> > > +
> > > +     /*
> > > +      * If the backing device is a block device, mirror its zeroing
> > > +      * capability. REQ_OP_DISCARD translates to a zero-out even when backed
> > > +      * by block devices to keep consistent behavior with file-backed loop
> > > +      * devices.
> > > +      */
> > > +     if (S_ISBLK(inode->i_mode) && !lo->lo_encrypt_key_size) {
> > > +             backingq = bdev_get_queue(inode->i_bdev);
> > > +             blk_queue_max_discard_sectors(q,
> > > +                     backingq->limits.max_write_zeroes_sectors);
> >
> > max_discard_sectors?
> 
> I didn't plumb max_discard_sectors because in my scenario a discard
> never reaches the block device that way.
> 
> The loop device uses either FALLOC_FL_ZERO_RANGE or
> FALLOC_FL_PUNCH_HOLE. When backed by a block device, that ends up in
> blkdev_fallocate(), which always translates both of those into
> blkdev_issue_zeroout(), not blkdev_issue_discard(). So it's really the
> zeroing capabilities of the block device that matter, even for loop
> discard operations. It seems weird, but I think this is the right thing
> because it presents a consistent interface to loop device users, whether
> the device is backed by a filesystem file or directly by a block device.
> That is, a previously discarded range will read back as zeroes.

Ah, right.  Could you add this paragraph as a comment explaining why
we're setting max_discard_sectors from max_write_zeroes_sectors?

--D

> -Evan


* Re: [PATCH v7 2/2] loop: Better discard support for block devices
  2019-11-20 19:13       ` Darrick J. Wong
@ 2019-11-20 19:25         ` Evan Green
  2019-11-20 19:45           ` Darrick J. Wong
  0 siblings, 1 reply; 13+ messages in thread
From: Evan Green @ 2019-11-20 19:25 UTC
  To: Darrick J. Wong
  Cc: Jens Axboe, Martin K Petersen, Gwendal Grignou,
	Christoph Hellwig, Ming Lei, Alexis Savery, Douglas Anderson,
	Bart Van Assche, Chaitanya Kulkarni, linux-block, LKML

On Wed, Nov 20, 2019 at 11:13 AM Darrick J. Wong
<darrick.wong@oracle.com> wrote:
>
> On Wed, Nov 20, 2019 at 10:56:30AM -0800, Evan Green wrote:
> > On Tue, Nov 19, 2019 at 6:25 PM Darrick J. Wong <darrick.wong@oracle.com> wrote:
> > >
> > > On Thu, Nov 14, 2019 at 03:50:08PM -0800, Evan Green wrote:
> > > > If the backing device for a loop device is itself a block device,
> > > > then mirror the "write zeroes" capabilities of the underlying
> > > > block device into the loop device. Copy this capability into both
> > > > max_write_zeroes_sectors and max_discard_sectors of the loop device.
> > > >
> > > > The reason for this is that REQ_OP_DISCARD on a loop device translates
> > > > into blkdev_issue_zeroout(), rather than blkdev_issue_discard(). This
> > > > presents a consistent interface for loop devices (that discarded data
> > > > is zeroed), regardless of the backing device type of the loop device.
> > > > There should be no behavior change for loop devices backed by regular
> > > > files.

(marking this spot for below)

> > > >
> > > > This change fixes blktest block/003, and removes an extraneous
> > > > error print in block/013 when testing on a loop device backed
> > > > by a block device that does not support discard.
> > > >
> > > > Signed-off-by: Evan Green <evgreen@chromium.org>
> > > > Reviewed-by: Gwendal Grignou <gwendal@chromium.org>
> > > > Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
> > > > ---
> > > >
> > > > Changes in v7:
> > > > - Rebase on top of Darrick's patch
> > > > - Tweak opening line of commit description (Darrick)
> > > >
> > > > Changes in v6: None
> > > > Changes in v5:
> > > > - Don't mirror discard if lo_encrypt_key_size is non-zero (Gwendal)
> > > >
> > > > Changes in v4:
> > > > - Mirror blkdev's write_zeroes into loopdev's discard_sectors.
> > > >
> > > > Changes in v3:
> > > > - Updated commit description
> > > >
> > > > Changes in v2: None
> > > >
> > > >  drivers/block/loop.c | 40 +++++++++++++++++++++++++++++-----------
> > > >  1 file changed, 29 insertions(+), 11 deletions(-)
> > > >
> > > > diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> > > > index 6a9fe1f9fe84..e8f23e4b78f7 100644
> > > > --- a/drivers/block/loop.c
> > > > +++ b/drivers/block/loop.c
> > > > @@ -427,11 +427,12 @@ static int lo_fallocate(struct loop_device *lo, struct request *rq, loff_t pos,
> > > >        * information.
> > > >        */
> > > >       struct file *file = lo->lo_backing_file;
> > > > +     struct request_queue *q = lo->lo_queue;
> > > >       int ret;
> > > >
> > > >       mode |= FALLOC_FL_KEEP_SIZE;
> > > >
> > > > -     if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) {
> > > > +     if (!blk_queue_discard(q)) {
> > > >               ret = -EOPNOTSUPP;
> > > >               goto out;
> > > >       }
> > > > @@ -862,6 +863,21 @@ static void loop_config_discard(struct loop_device *lo)
> > > >       struct file *file = lo->lo_backing_file;
> > > >       struct inode *inode = file->f_mapping->host;
> > > >       struct request_queue *q = lo->lo_queue;
> > > > +     struct request_queue *backingq;
> > > > +
> > > > +     /*
> > > > +      * If the backing device is a block device, mirror its zeroing
> > > > +      * capability. REQ_OP_DISCARD translates to a zero-out even when backed
> > > > +      * by block devices to keep consistent behavior with file-backed loop
> > > > +      * devices.
> > > > +      */
> > > > +     if (S_ISBLK(inode->i_mode) && !lo->lo_encrypt_key_size) {
> > > > +             backingq = bdev_get_queue(inode->i_bdev);
> > > > +             blk_queue_max_discard_sectors(q,
> > > > +                     backingq->limits.max_write_zeroes_sectors);
> > >
> > > max_discard_sectors?
> >
> > I didn't plumb max_discard_sectors because in my scenario a discard
> > never reaches the block device that way.
> >
> > The loop device uses either FALLOC_FL_ZERO_RANGE or
> > FALLOC_FL_PUNCH_HOLE. When backed by a block device, that ends up in
> > blkdev_fallocate(), which always translates both of those into
> > blkdev_issue_zeroout(), not blkdev_issue_discard(). So it's really the
> > zeroing capabilities of the block device that matter, even for loop
> > discard operations. It seems weird, but I think this is the right thing
> > because it presents a consistent interface to loop device users, whether
> > the device is backed by a filesystem file or directly by a block device.
> > That is, a previously discarded range will read back as zeroes.
>
> Ah, right.  Could you add this paragraph as a comment explaining why
> we're setting max_discard_sectors from max_write_zeroes_sectors?

Sure. I put an explanation in the commit description (see the spot I
marked above), but I agree a comment is probably also worthwhile.

>
> --D
>
> > -Evan


* Re: [PATCH v7 2/2] loop: Better discard support for block devices
  2019-11-20 19:25         ` Evan Green
@ 2019-11-20 19:45           ` Darrick J. Wong
  2019-11-21 21:18             ` Evan Green
  0 siblings, 1 reply; 13+ messages in thread
From: Darrick J. Wong @ 2019-11-20 19:45 UTC
  To: Evan Green
  Cc: Jens Axboe, Martin K Petersen, Gwendal Grignou,
	Christoph Hellwig, Ming Lei, Alexis Savery, Douglas Anderson,
	Bart Van Assche, Chaitanya Kulkarni, linux-block, LKML

On Wed, Nov 20, 2019 at 11:25:48AM -0800, Evan Green wrote:
> On Wed, Nov 20, 2019 at 11:13 AM Darrick J. Wong
> <darrick.wong@oracle.com> wrote:
> >
> > On Wed, Nov 20, 2019 at 10:56:30AM -0800, Evan Green wrote:
> > > On Tue, Nov 19, 2019 at 6:25 PM Darrick J. Wong <darrick.wong@oracle.com> wrote:
> > > >
> > > > On Thu, Nov 14, 2019 at 03:50:08PM -0800, Evan Green wrote:
> > > > > If the backing device for a loop device is itself a block device,
> > > > > then mirror the "write zeroes" capabilities of the underlying
> > > > > block device into the loop device. Copy this capability into both
> > > > > max_write_zeroes_sectors and max_discard_sectors of the loop device.
> > > > >
> > > > > The reason for this is that REQ_OP_DISCARD on a loop device translates
> > > > > into blkdev_issue_zeroout(), rather than blkdev_issue_discard(). This
> > > > > presents a consistent interface for loop devices (that discarded data
> > > > > is zeroed), regardless of the backing device type of the loop device.
> > > > > There should be no behavior change for loop devices backed by regular
> > > > > files.
> 
> (marking this spot for below)
> 
> > > > >
> > > > > This change fixes blktest block/003, and removes an extraneous
> > > > > error print in block/013 when testing on a loop device backed
> > > > > by a block device that does not support discard.
> > > > >
> > > > > Signed-off-by: Evan Green <evgreen@chromium.org>
> > > > > Reviewed-by: Gwendal Grignou <gwendal@chromium.org>
> > > > > Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
> > > > > ---
> > > > >
> > > > > Changes in v7:
> > > > > - Rebase on top of Darrick's patch
> > > > > - Tweak opening line of commit description (Darrick)
> > > > >
> > > > > Changes in v6: None
> > > > > Changes in v5:
> > > > > - Don't mirror discard if lo_encrypt_key_size is non-zero (Gwendal)
> > > > >
> > > > > Changes in v4:
> > > > > - Mirror blkdev's write_zeroes into loopdev's discard_sectors.
> > > > >
> > > > > Changes in v3:
> > > > > - Updated commit description
> > > > >
> > > > > Changes in v2: None
> > > > >
> > > > >  drivers/block/loop.c | 40 +++++++++++++++++++++++++++++-----------
> > > > >  1 file changed, 29 insertions(+), 11 deletions(-)
> > > > >
> > > > > diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> > > > > index 6a9fe1f9fe84..e8f23e4b78f7 100644
> > > > > --- a/drivers/block/loop.c
> > > > > +++ b/drivers/block/loop.c
> > > > > @@ -427,11 +427,12 @@ static int lo_fallocate(struct loop_device *lo, struct request *rq, loff_t pos,
> > > > >        * information.
> > > > >        */
> > > > >       struct file *file = lo->lo_backing_file;
> > > > > +     struct request_queue *q = lo->lo_queue;
> > > > >       int ret;
> > > > >
> > > > >       mode |= FALLOC_FL_KEEP_SIZE;
> > > > >
> > > > > -     if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) {
> > > > > +     if (!blk_queue_discard(q)) {
> > > > >               ret = -EOPNOTSUPP;
> > > > >               goto out;
> > > > >       }
> > > > > @@ -862,6 +863,21 @@ static void loop_config_discard(struct loop_device *lo)
> > > > >       struct file *file = lo->lo_backing_file;
> > > > >       struct inode *inode = file->f_mapping->host;
> > > > >       struct request_queue *q = lo->lo_queue;
> > > > > +     struct request_queue *backingq;
> > > > > +
> > > > > +     /*
> > > > > +      * If the backing device is a block device, mirror its zeroing
> > > > > +      * capability. REQ_OP_DISCARD translates to a zero-out even when backed
> > > > > +      * by block devices to keep consistent behavior with file-backed loop
> > > > > +      * devices.
> > > > > +      */
> > > > > +     if (S_ISBLK(inode->i_mode) && !lo->lo_encrypt_key_size) {
> > > > > +             backingq = bdev_get_queue(inode->i_bdev);
> > > > > +             blk_queue_max_discard_sectors(q,
> > > > > +                     backingq->limits.max_write_zeroes_sectors);
> > > >
> > > > max_discard_sectors?
> > >
> > > I didn't plumb max_discard_sectors because for my scenario it never
> > > ends up hitting the block device that way.
> > >
> > > The loop device uses either FALLOC_FL_ZERO_RANGE or
> > > FALLOC_FL_PUNCH_HOLE. When backed by a block device, that ends up in
> > > blkdev_fallocate(), which always translates both of those into
> > > blkdev_issue_zeroout(), not blkdev_issue_discard(). So it's really the
> > > zeroing capabilities of the block device that matter, even for loop
> > > discard operations. It seems weird, but I think this is the right
> > > thing because it presents a consistent interface to loop device users
> > > whether backed by a file system file or directly by a block device.
> > > That is, a previously discarded range will read back as zeroes.
> >
> > Ah, right.  Could you add this paragraph as a comment explaining why
> > we're setting max_discard_sectors from max_write_zeroes_sectors?
> 
> Sure. I put an explanation in the commit description (see spot I
> marked above), but I agree a comment is probably also worthwhile.

<nod> Sorry about the churn here.

I have a strong preference towards documenting decisions like these
directly in the code because (a) I suck at reading patch prologues, (b)
someone reading the code after this gets committed will see it
immediately and right next to the relevant code, and (c) spelunking
through the git history of a file for commit messages is kind of clunky.

Dunno if that's just my age showing (mmm, pre-bk linux) or what. :/

--D

> >
> > --D
> >
> > > -Evan

* Re: [PATCH v7 2/2] loop: Better discard support for block devices
  2019-11-20 19:45           ` Darrick J. Wong
@ 2019-11-21 21:18             ` Evan Green
  2019-11-21 21:25               ` Darrick J. Wong
  0 siblings, 1 reply; 13+ messages in thread
From: Evan Green @ 2019-11-21 21:18 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Jens Axboe, Martin K Petersen, Gwendal Grignou,
	Christoph Hellwig, Ming Lei, Alexis Savery, Douglas Anderson,
	Bart Van Assche, Chaitanya Kulkarni, linux-block, LKML

On Wed, Nov 20, 2019 at 11:45 AM Darrick J. Wong
<darrick.wong@oracle.com> wrote:
>
> On Wed, Nov 20, 2019 at 11:25:48AM -0800, Evan Green wrote:
> > On Wed, Nov 20, 2019 at 11:13 AM Darrick J. Wong
> > <darrick.wong@oracle.com> wrote:
> > >
> > > On Wed, Nov 20, 2019 at 10:56:30AM -0800, Evan Green wrote:
> > > > On Tue, Nov 19, 2019 at 6:25 PM Darrick J. Wong <darrick.wong@oracle.com> wrote:
> > > > >
> > > > > On Thu, Nov 14, 2019 at 03:50:08PM -0800, Evan Green wrote:
> > > > > > If the backing device for a loop device is itself a block device,
> > > > > > then mirror the "write zeroes" capabilities of the underlying
> > > > > > block device into the loop device. Copy this capability into both
> > > > > > max_write_zeroes_sectors and max_discard_sectors of the loop device.
> > > > > >
> > > > > > The reason for this is that REQ_OP_DISCARD on a loop device translates
> > > > > > into blkdev_issue_zeroout(), rather than blkdev_issue_discard(). This
> > > > > > presents a consistent interface for loop devices (that discarded data
> > > > > > is zeroed), regardless of the backing device type of the loop device.
> > > > > > There should be no behavior change for loop devices backed by regular
> > > > > > files.
> >
> > (marking this spot for below)
> >
> > > > > >
> > > > > > This change fixes blktest block/003, and removes an extraneous
> > > > > > error print in block/013 when testing on a loop device backed
> > > > > > by a block device that does not support discard.
> > > > > >
> > > > > > Signed-off-by: Evan Green <evgreen@chromium.org>
> > > > > > Reviewed-by: Gwendal Grignou <gwendal@chromium.org>
> > > > > > Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
> > > > > > ---
> > > > > >
> > > > > > Changes in v7:
> > > > > > - Rebase on top of Darrick's patch
> > > > > > - Tweak opening line of commit description (Darrick)
> > > > > >
> > > > > > Changes in v6: None
> > > > > > Changes in v5:
> > > > > > - Don't mirror discard if lo_encrypt_key_size is non-zero (Gwendal)
> > > > > >
> > > > > > Changes in v4:
> > > > > > - Mirror blkdev's write_zeroes into loopdev's discard_sectors.
> > > > > >
> > > > > > Changes in v3:
> > > > > > - Updated commit description
> > > > > >
> > > > > > Changes in v2: None
> > > > > >
> > > > > >  drivers/block/loop.c | 40 +++++++++++++++++++++++++++++-----------
> > > > > >  1 file changed, 29 insertions(+), 11 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> > > > > > index 6a9fe1f9fe84..e8f23e4b78f7 100644
> > > > > > --- a/drivers/block/loop.c
> > > > > > +++ b/drivers/block/loop.c
> > > > > > @@ -427,11 +427,12 @@ static int lo_fallocate(struct loop_device *lo, struct request *rq, loff_t pos,
> > > > > >        * information.
> > > > > >        */
> > > > > >       struct file *file = lo->lo_backing_file;
> > > > > > +     struct request_queue *q = lo->lo_queue;
> > > > > >       int ret;
> > > > > >
> > > > > >       mode |= FALLOC_FL_KEEP_SIZE;
> > > > > >
> > > > > > -     if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) {
> > > > > > +     if (!blk_queue_discard(q)) {
> > > > > >               ret = -EOPNOTSUPP;
> > > > > >               goto out;
> > > > > >       }
> > > > > > @@ -862,6 +863,21 @@ static void loop_config_discard(struct loop_device *lo)
> > > > > >       struct file *file = lo->lo_backing_file;
> > > > > >       struct inode *inode = file->f_mapping->host;
> > > > > >       struct request_queue *q = lo->lo_queue;
> > > > > > +     struct request_queue *backingq;
> > > > > > +
> > > > > > +     /*
> > > > > > +      * If the backing device is a block device, mirror its zeroing
> > > > > > +      * capability. REQ_OP_DISCARD translates to a zero-out even when backed
> > > > > > +      * by block devices to keep consistent behavior with file-backed loop
> > > > > > +      * devices.
> > > > > > +      */

Wait, I went to make this change and realized there's already a comment here.

I can tweak the language a bit, but this is pretty much what you wanted, right?

> > > > > > +     if (S_ISBLK(inode->i_mode) && !lo->lo_encrypt_key_size) {
> > > > > > +             backingq = bdev_get_queue(inode->i_bdev);
> > > > > > +             blk_queue_max_discard_sectors(q,
> > > > > > +                     backingq->limits.max_write_zeroes_sectors);
> > > > >
> > > > > max_discard_sectors?
> > > >
> > > > I didn't plumb max_discard_sectors because for my scenario it never
> > > > ends up hitting the block device that way.
> > > >
> > > > The loop device uses either FALLOC_FL_ZERO_RANGE or
> > > > FALLOC_FL_PUNCH_HOLE. When backed by a block device, that ends up in
> > > > blkdev_fallocate(), which always translates both of those into
> > > > blkdev_issue_zeroout(), not blkdev_issue_discard(). So it's really the
> > > > zeroing capabilities of the block device that matter, even for loop
> > > > discard operations. It seems weird, but I think this is the right
> > > > thing because it presents a consistent interface to loop device users
> > > > whether backed by a file system file or directly by a block device.
> > > > That is, a previously discarded range will read back as zeroes.
> > >
> > > Ah, right.  Could you add this paragraph as a comment explaining why
> > > we're setting max_discard_sectors from max_write_zeroes_sectors?
> >
> > Sure. I put an explanation in the commit description (see spot I
> > marked above), but I agree a comment is probably also worthwhile.
>
> <nod> Sorry about the churn here.
>
> I have a strong preference towards documenting decisions like these
> directly in the code because (a) I suck at reading patch prologues, (b)
> someone reading the code after this gets committed will see it
> immediately and right next to the relevant code, and (c) spelunking
> through the git history of a file for commit messages is kind of clunky.
>
> Dunno if that's just my age showing (mmm, pre-bk linux) or what. :/
>
> --D
>
> > >
> > > --D
> > >
> > > > -Evan

* Re: [PATCH v7 2/2] loop: Better discard support for block devices
  2019-11-21 21:18             ` Evan Green
@ 2019-11-21 21:25               ` Darrick J. Wong
  2019-11-21 22:35                 ` Evan Green
  0 siblings, 1 reply; 13+ messages in thread
From: Darrick J. Wong @ 2019-11-21 21:25 UTC (permalink / raw)
  To: Evan Green
  Cc: Jens Axboe, Martin K Petersen, Gwendal Grignou,
	Christoph Hellwig, Ming Lei, Alexis Savery, Douglas Anderson,
	Bart Van Assche, Chaitanya Kulkarni, linux-block, LKML

On Thu, Nov 21, 2019 at 01:18:51PM -0800, Evan Green wrote:
> On Wed, Nov 20, 2019 at 11:45 AM Darrick J. Wong
> <darrick.wong@oracle.com> wrote:
> >
> > On Wed, Nov 20, 2019 at 11:25:48AM -0800, Evan Green wrote:
> > > On Wed, Nov 20, 2019 at 11:13 AM Darrick J. Wong
> > > <darrick.wong@oracle.com> wrote:
> > > >
> > > > On Wed, Nov 20, 2019 at 10:56:30AM -0800, Evan Green wrote:
> > > > > On Tue, Nov 19, 2019 at 6:25 PM Darrick J. Wong <darrick.wong@oracle.com> wrote:
> > > > > >
> > > > > > On Thu, Nov 14, 2019 at 03:50:08PM -0800, Evan Green wrote:
> > > > > > > If the backing device for a loop device is itself a block device,
> > > > > > > then mirror the "write zeroes" capabilities of the underlying
> > > > > > > block device into the loop device. Copy this capability into both
> > > > > > > max_write_zeroes_sectors and max_discard_sectors of the loop device.
> > > > > > >
> > > > > > > The reason for this is that REQ_OP_DISCARD on a loop device translates
> > > > > > > into blkdev_issue_zeroout(), rather than blkdev_issue_discard(). This
> > > > > > > presents a consistent interface for loop devices (that discarded data
> > > > > > > is zeroed), regardless of the backing device type of the loop device.
> > > > > > > There should be no behavior change for loop devices backed by regular
> > > > > > > files.
> > >
> > > (marking this spot for below)
> > >
> > > > > > >
> > > > > > > This change fixes blktest block/003, and removes an extraneous
> > > > > > > error print in block/013 when testing on a loop device backed
> > > > > > > by a block device that does not support discard.
> > > > > > >
> > > > > > > Signed-off-by: Evan Green <evgreen@chromium.org>
> > > > > > > Reviewed-by: Gwendal Grignou <gwendal@chromium.org>
> > > > > > > Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
> > > > > > > ---
> > > > > > >
> > > > > > > Changes in v7:
> > > > > > > - Rebase on top of Darrick's patch
> > > > > > > - Tweak opening line of commit description (Darrick)
> > > > > > >
> > > > > > > Changes in v6: None
> > > > > > > Changes in v5:
> > > > > > > - Don't mirror discard if lo_encrypt_key_size is non-zero (Gwendal)
> > > > > > >
> > > > > > > Changes in v4:
> > > > > > > - Mirror blkdev's write_zeroes into loopdev's discard_sectors.
> > > > > > >
> > > > > > > Changes in v3:
> > > > > > > - Updated commit description
> > > > > > >
> > > > > > > Changes in v2: None
> > > > > > >
> > > > > > >  drivers/block/loop.c | 40 +++++++++++++++++++++++++++++-----------
> > > > > > >  1 file changed, 29 insertions(+), 11 deletions(-)
> > > > > > >
> > > > > > > diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> > > > > > > index 6a9fe1f9fe84..e8f23e4b78f7 100644
> > > > > > > --- a/drivers/block/loop.c
> > > > > > > +++ b/drivers/block/loop.c
> > > > > > > @@ -427,11 +427,12 @@ static int lo_fallocate(struct loop_device *lo, struct request *rq, loff_t pos,
> > > > > > >        * information.
> > > > > > >        */
> > > > > > >       struct file *file = lo->lo_backing_file;
> > > > > > > +     struct request_queue *q = lo->lo_queue;
> > > > > > >       int ret;
> > > > > > >
> > > > > > >       mode |= FALLOC_FL_KEEP_SIZE;
> > > > > > >
> > > > > > > -     if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) {
> > > > > > > +     if (!blk_queue_discard(q)) {
> > > > > > >               ret = -EOPNOTSUPP;
> > > > > > >               goto out;
> > > > > > >       }
> > > > > > > @@ -862,6 +863,21 @@ static void loop_config_discard(struct loop_device *lo)
> > > > > > >       struct file *file = lo->lo_backing_file;
> > > > > > >       struct inode *inode = file->f_mapping->host;
> > > > > > >       struct request_queue *q = lo->lo_queue;
> > > > > > > +     struct request_queue *backingq;
> > > > > > > +
> > > > > > > +     /*
> > > > > > > +      * If the backing device is a block device, mirror its zeroing
> > > > > > > +      * capability. REQ_OP_DISCARD translates to a zero-out even when backed
> > > > > > > +      * by block devices to keep consistent behavior with file-backed loop
> > > > > > > +      * devices.
> > > > > > > +      */
> 
> Wait, I went to make this change and realized there's already a comment here.
> 
> I can tweak the language a bit, but this is pretty much what you wanted, right?

Yep.

--D

> > > > > > > +     if (S_ISBLK(inode->i_mode) && !lo->lo_encrypt_key_size) {
> > > > > > > +             backingq = bdev_get_queue(inode->i_bdev);
> > > > > > > +             blk_queue_max_discard_sectors(q,
> > > > > > > +                     backingq->limits.max_write_zeroes_sectors);
> > > > > >
> > > > > > max_discard_sectors?
> > > > >
> > > > > I didn't plumb max_discard_sectors because for my scenario it never
> > > > > ends up hitting the block device that way.
> > > > >
> > > > > The loop device uses either FALLOC_FL_ZERO_RANGE or
> > > > > FALLOC_FL_PUNCH_HOLE. When backed by a block device, that ends up in
> > > > > blkdev_fallocate(), which always translates both of those into
> > > > > blkdev_issue_zeroout(), not blkdev_issue_discard(). So it's really the
> > > > > zeroing capabilities of the block device that matter, even for loop
> > > > > discard operations. It seems weird, but I think this is the right
> > > > > thing because it presents a consistent interface to loop device users
> > > > > whether backed by a file system file or directly by a block device.
> > > > > That is, a previously discarded range will read back as zeroes.
> > > >
> > > > Ah, right.  Could you add this paragraph as a comment explaining why
> > > > we're setting max_discard_sectors from max_write_zeroes_sectors?
> > >
> > > Sure. I put an explanation in the commit description (see spot I
> > > marked above), but I agree a comment is probably also worthwhile.
> >
> > <nod> Sorry about the churn here.
> >
> > I have a strong preference towards documenting decisions like these
> > directly in the code because (a) I suck at reading patch prologues, (b)
> > someone reading the code after this gets committed will see it
> > immediately and right next to the relevant code, and (c) spelunking
> > through the git history of a file for commit messages is kind of clunky.
> >
> > Dunno if that's just my age showing (mmm, pre-bk linux) or what. :/
> >
> > --D
> >
> > > >
> > > > --D
> > > >
> > > > > -Evan

* Re: [PATCH v7 2/2] loop: Better discard support for block devices
  2019-11-21 21:25               ` Darrick J. Wong
@ 2019-11-21 22:35                 ` Evan Green
  0 siblings, 0 replies; 13+ messages in thread
From: Evan Green @ 2019-11-21 22:35 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Jens Axboe, Martin K Petersen, Gwendal Grignou,
	Christoph Hellwig, Ming Lei, Alexis Savery, Douglas Anderson,
	Bart Van Assche, Chaitanya Kulkarni, linux-block, LKML

On Thu, Nov 21, 2019 at 1:25 PM Darrick J. Wong <darrick.wong@oracle.com> wrote:
>
> On Thu, Nov 21, 2019 at 01:18:51PM -0800, Evan Green wrote:
> > On Wed, Nov 20, 2019 at 11:45 AM Darrick J. Wong
> > <darrick.wong@oracle.com> wrote:
> > >
> > > On Wed, Nov 20, 2019 at 11:25:48AM -0800, Evan Green wrote:
> > > > On Wed, Nov 20, 2019 at 11:13 AM Darrick J. Wong
> > > > <darrick.wong@oracle.com> wrote:
> > > > >
> > > > > On Wed, Nov 20, 2019 at 10:56:30AM -0800, Evan Green wrote:
> > > > > > On Tue, Nov 19, 2019 at 6:25 PM Darrick J. Wong <darrick.wong@oracle.com> wrote:
> > > > > > >
> > > > > > > On Thu, Nov 14, 2019 at 03:50:08PM -0800, Evan Green wrote:
> > > > > > > > If the backing device for a loop device is itself a block device,
> > > > > > > > then mirror the "write zeroes" capabilities of the underlying
> > > > > > > > block device into the loop device. Copy this capability into both
> > > > > > > > max_write_zeroes_sectors and max_discard_sectors of the loop device.
> > > > > > > >
> > > > > > > > The reason for this is that REQ_OP_DISCARD on a loop device translates
> > > > > > > > into blkdev_issue_zeroout(), rather than blkdev_issue_discard(). This
> > > > > > > > presents a consistent interface for loop devices (that discarded data
> > > > > > > > is zeroed), regardless of the backing device type of the loop device.
> > > > > > > > There should be no behavior change for loop devices backed by regular
> > > > > > > > files.
> > > >
> > > > (marking this spot for below)
> > > >
> > > > > > > >
> > > > > > > > This change fixes blktest block/003, and removes an extraneous
> > > > > > > > error print in block/013 when testing on a loop device backed
> > > > > > > > by a block device that does not support discard.
> > > > > > > >
> > > > > > > > Signed-off-by: Evan Green <evgreen@chromium.org>
> > > > > > > > Reviewed-by: Gwendal Grignou <gwendal@chromium.org>
> > > > > > > > Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
> > > > > > > > ---
> > > > > > > >
> > > > > > > > Changes in v7:
> > > > > > > > - Rebase on top of Darrick's patch
> > > > > > > > - Tweak opening line of commit description (Darrick)
> > > > > > > >
> > > > > > > > Changes in v6: None
> > > > > > > > Changes in v5:
> > > > > > > > - Don't mirror discard if lo_encrypt_key_size is non-zero (Gwendal)
> > > > > > > >
> > > > > > > > Changes in v4:
> > > > > > > > - Mirror blkdev's write_zeroes into loopdev's discard_sectors.
> > > > > > > >
> > > > > > > > Changes in v3:
> > > > > > > > - Updated commit description
> > > > > > > >
> > > > > > > > Changes in v2: None
> > > > > > > >
> > > > > > > >  drivers/block/loop.c | 40 +++++++++++++++++++++++++++++-----------
> > > > > > > >  1 file changed, 29 insertions(+), 11 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> > > > > > > > index 6a9fe1f9fe84..e8f23e4b78f7 100644
> > > > > > > > --- a/drivers/block/loop.c
> > > > > > > > +++ b/drivers/block/loop.c
> > > > > > > > @@ -427,11 +427,12 @@ static int lo_fallocate(struct loop_device *lo, struct request *rq, loff_t pos,
> > > > > > > >        * information.
> > > > > > > >        */
> > > > > > > >       struct file *file = lo->lo_backing_file;
> > > > > > > > +     struct request_queue *q = lo->lo_queue;
> > > > > > > >       int ret;
> > > > > > > >
> > > > > > > >       mode |= FALLOC_FL_KEEP_SIZE;
> > > > > > > >
> > > > > > > > -     if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) {
> > > > > > > > +     if (!blk_queue_discard(q)) {
> > > > > > > >               ret = -EOPNOTSUPP;
> > > > > > > >               goto out;
> > > > > > > >       }
> > > > > > > > @@ -862,6 +863,21 @@ static void loop_config_discard(struct loop_device *lo)
> > > > > > > >       struct file *file = lo->lo_backing_file;
> > > > > > > >       struct inode *inode = file->f_mapping->host;
> > > > > > > >       struct request_queue *q = lo->lo_queue;
> > > > > > > > +     struct request_queue *backingq;
> > > > > > > > +
> > > > > > > > +     /*
> > > > > > > > +      * If the backing device is a block device, mirror its zeroing
> > > > > > > > +      * capability. REQ_OP_DISCARD translates to a zero-out even when backed
> > > > > > > > +      * by block devices to keep consistent behavior with file-backed loop
> > > > > > > > +      * devices.
> > > > > > > > +      */
> >
> > Wait, I went to make this change and realized there's already a comment here.
> >
> > I can tweak the language a bit, but this is pretty much what you wanted, right?
>
> Yep.
>

Jens, any opinions? I'm happy to spin one more time to clarify the
comment as follows if desired, or leave it as-is too!

        /*
         * If the backing device is a block device, mirror its zeroing
         * capability. Set the discard sectors to the block device's zeroing
         * capabilities because loop discards result in blkdev_issue_zeroout(),
         * not blkdev_issue_discard(). This maintains consistent behavior with
         * file-backed loop devices: discarded regions read back as zero.
         */
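
For illustration only, the rule that comment describes can be modeled
in plain userspace C. This is a rough sketch, not kernel code: the
model_* structs, functions, and MODEL_FILE_MAX_SECTORS are hypothetical
names invented here, and the file-backed and encrypted branches are my
assumption about how the rest of loop_config_discard() behaves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two queue limits discussed here. */
struct model_limits {
	uint32_t max_discard_sectors;
	uint32_t max_write_zeroes_sectors;
};

/* Hypothetical model of a loop device and its backing store. */
struct model_loop {
	bool backing_is_blkdev;        /* models S_ISBLK(inode->i_mode) */
	bool backing_has_fallocate;    /* models file->f_op->fallocate != NULL */
	unsigned int encrypt_key_size; /* models lo->lo_encrypt_key_size */
	struct model_limits backing;   /* limits of the backing block device */
	struct model_limits limits;    /* limits the loop device advertises */
};

/* Assumed cap for file-backed loop devices; the value is illustrative. */
#define MODEL_FILE_MAX_SECTORS (UINT32_MAX >> 9)

static void model_config_discard(struct model_loop *lo)
{
	uint32_t max = 0; /* default: discard and write zeroes unsupported */

	assert(lo != NULL);

	if (lo->encrypt_key_size) {
		/* Encrypted loop devices never advertise discard. */
		max = 0;
	} else if (lo->backing_is_blkdev) {
		/*
		 * Loop discards end up as zero-outs on the backing device,
		 * so mirror its zeroing limit and ignore its discard limit.
		 */
		max = lo->backing.max_write_zeroes_sectors;
	} else if (lo->backing_has_fallocate) {
		/* Assumption: regular files that can fallocate get the cap. */
		max = MODEL_FILE_MAX_SECTORS;
	}

	lo->limits.max_discard_sectors = max;
	lo->limits.max_write_zeroes_sectors = max;
}

static bool model_discard_supported(const struct model_loop *lo)
{
	return lo->limits.max_discard_sectors != 0;
}

/* Usage: a backing device that can discard but not zero gets no discard. */
static void model_discard_selfcheck(void)
{
	struct model_loop lo = {0};

	lo.backing_is_blkdev = true;
	lo.backing.max_discard_sectors = 8192;
	lo.backing.max_write_zeroes_sectors = 0;
	model_config_discard(&lo);
	assert(!model_discard_supported(&lo));
}
```

The point of the model is the second branch: the backing device's own
max_discard_sectors is never consulted, which is why a backing device
without write zeroes support makes the loop device stop advertising
discard instead of failing the request later.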

* Re: [PATCH v7 1/2] loop: Report EOPNOTSUPP properly
  2019-11-14 23:50 ` [PATCH v7 1/2] loop: Report EOPNOTSUPP properly Evan Green
@ 2019-12-02 17:05   ` Gwendal Grignou
  2019-12-03  0:48   ` Bart Van Assche
  1 sibling, 0 replies; 13+ messages in thread
From: Gwendal Grignou @ 2019-12-02 17:05 UTC (permalink / raw)
  To: Evan Green
  Cc: Jens Axboe, Martin K Petersen, Christoph Hellwig, Ming Lei,
	Darrick J . Wong, Alexis Savery, Douglas Anderson,
	Bart Van Assche, linux-block, linux-kernel

Reviewed-by: Gwendal Grignou <gwendal@chromium.org>

On Thu, Nov 14, 2019 at 3:50 PM Evan Green <evgreen@chromium.org> wrote:
>
> Properly plumb out EOPNOTSUPP from loop driver operations, which may
> get returned when for instance a discard operation is attempted but not
> supported by the underlying block device. Before this change, everything
> was reported in the log as an I/O error, which is scary and not
> helpful in debugging.
>
> Signed-off-by: Evan Green <evgreen@chromium.org>
> ---
>
> Changes in v7:
> - Use errno_to_blk_status() (Christoph)
>
> Changes in v6:
> - Updated tags
>
> Changes in v5: None
> Changes in v4: None
> Changes in v3:
> - Updated tags
>
> Changes in v2:
> - Unnested error if statement (Bart)
>
>  drivers/block/loop.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index ef6e251857c8..6a9fe1f9fe84 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -461,7 +461,7 @@ static void lo_complete_rq(struct request *rq)
>         if (!cmd->use_aio || cmd->ret < 0 || cmd->ret == blk_rq_bytes(rq) ||
>             req_op(rq) != REQ_OP_READ) {
>                 if (cmd->ret < 0)
> -                       ret = BLK_STS_IOERR;
> +                       ret = errno_to_blk_status(cmd->ret);
>                 goto end_io;
>         }
>
> @@ -1950,7 +1950,10 @@ static void loop_handle_cmd(struct loop_cmd *cmd)
>   failed:
>         /* complete non-aio request */
>         if (!cmd->use_aio || ret) {
> -               cmd->ret = ret ? -EIO : 0;
> +               if (ret == -EOPNOTSUPP)
> +                       cmd->ret = ret;
> +               else
> +                       cmd->ret = ret ? -EIO : 0;
>                 blk_mq_complete_request(rq);
>         }
>  }
> --
> 2.21.0
>
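
As a rough illustration of the flow in the quoted patch, here is a
userspace model of the two steps. It is a sketch under stated
assumptions, not kernel code: the model_* names and the three-value
status enum are hypothetical, and model_errno_to_status() covers only
the codes relevant here, whereas the kernel's errno_to_blk_status()
handles more.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for BLK_STS_OK, BLK_STS_NOTSUPP, BLK_STS_IOERR. */
enum model_status {
	MODEL_STS_OK,
	MODEL_STS_NOTSUPP,
	MODEL_STS_IOERR,
};

/*
 * Models the "failed:" path of loop_handle_cmd() after the patch:
 * -EOPNOTSUPP is kept as-is, every other failure collapses to -EIO.
 */
static int model_handle_cmd_ret(int ret)
{
	if (ret == -EOPNOTSUPP)
		return ret;
	return ret ? -EIO : 0;
}

/* Simplified model of the errno to block status translation. */
static enum model_status model_errno_to_status(int err)
{
	switch (err) {
	case 0:
		return MODEL_STS_OK;
	case -EOPNOTSUPP:
		return MODEL_STS_NOTSUPP;
	default:
		return MODEL_STS_IOERR;
	}
}

/* Models lo_complete_rq(): only negative results are translated. */
static enum model_status model_complete(int ret)
{
	int cmd_ret = model_handle_cmd_ret(ret);

	return cmd_ret < 0 ? model_errno_to_status(cmd_ret) : MODEL_STS_OK;
}

/* Usage: an unsupported discard no longer shows up as an I/O error. */
static void model_status_selfcheck(void)
{
	assert(model_complete(-EOPNOTSUPP) == MODEL_STS_NOTSUPP);
	assert(model_complete(-ENOMEM) == MODEL_STS_IOERR);
	assert(model_complete(0) == MODEL_STS_OK);
}
```

Before the patch, the equivalent of model_complete() would have
returned the I/O error status for every negative result, which is what
produced the misleading log line.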

* Re: [PATCH v7 1/2] loop: Report EOPNOTSUPP properly
  2019-11-14 23:50 ` [PATCH v7 1/2] loop: Report EOPNOTSUPP properly Evan Green
  2019-12-02 17:05   ` Gwendal Grignou
@ 2019-12-03  0:48   ` Bart Van Assche
  1 sibling, 0 replies; 13+ messages in thread
From: Bart Van Assche @ 2019-12-03  0:48 UTC (permalink / raw)
  To: Evan Green, Jens Axboe, Martin K Petersen
  Cc: Gwendal Grignou, Christoph Hellwig, Ming Lei, Darrick J . Wong,
	Alexis Savery, Douglas Anderson, linux-block, linux-kernel

On 11/14/19 3:50 PM, Evan Green wrote:
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index ef6e251857c8..6a9fe1f9fe84 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -461,7 +461,7 @@ static void lo_complete_rq(struct request *rq)
>   	if (!cmd->use_aio || cmd->ret < 0 || cmd->ret == blk_rq_bytes(rq) ||
>   	    req_op(rq) != REQ_OP_READ) {
>   		if (cmd->ret < 0)
> -			ret = BLK_STS_IOERR;
> +			ret = errno_to_blk_status(cmd->ret);
>   		goto end_io;
>   	}
>   
> @@ -1950,7 +1950,10 @@ static void loop_handle_cmd(struct loop_cmd *cmd)
>    failed:
>   	/* complete non-aio request */
>   	if (!cmd->use_aio || ret) {
> -		cmd->ret = ret ? -EIO : 0;
> +		if (ret == -EOPNOTSUPP)
> +			cmd->ret = ret;
> +		else
> +			cmd->ret = ret ? -EIO : 0;
>   		blk_mq_complete_request(rq);
>   	}
>   }

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

end of thread, other threads:[~2019-12-03  0:49 UTC | newest]

Thread overview: 13+ messages
2019-11-14 23:50 [PATCH v7 0/2] loop: Better discard for block devices Evan Green
2019-11-14 23:50 ` [PATCH v7 1/2] loop: Report EOPNOTSUPP properly Evan Green
2019-12-02 17:05   ` Gwendal Grignou
2019-12-03  0:48   ` Bart Van Assche
2019-11-14 23:50 ` [PATCH v7 2/2] loop: Better discard support for block devices Evan Green
2019-11-20  2:25   ` Darrick J. Wong
2019-11-20 18:56     ` Evan Green
2019-11-20 19:13       ` Darrick J. Wong
2019-11-20 19:25         ` Evan Green
2019-11-20 19:45           ` Darrick J. Wong
2019-11-21 21:18             ` Evan Green
2019-11-21 21:25               ` Darrick J. Wong
2019-11-21 22:35                 ` Evan Green
