* [PATCH] loop: fix no-unmap write-zeroes request behavior
@ 2019-10-10 17:02 Darrick J. Wong
2019-10-11 7:51 ` Christoph Hellwig
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Darrick J. Wong @ 2019-10-10 17:02 UTC (permalink / raw)
To: Jens Axboe, Christoph Hellwig; +Cc: linux-block, linux-fsdevel, xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
Currently, if the loop device receives a WRITE_ZEROES request, it asks
the underlying filesystem to punch out the range. This behavior is
correct if unmapping is allowed. However, a NOUNMAP request means that
the caller forbids us from freeing the storage backing the range, so
punching out the range is incorrect behavior.
To satisfy a NOUNMAP | WRITE_ZEROES request, loop should ask the
underlying filesystem to FALLOC_FL_ZERO_RANGE, which is (according to
the fallocate documentation) required to ensure that the entire range is
backed by real storage, which suffices for our purposes.
Fixes: 19372e2769179dd ("loop: implement REQ_OP_WRITE_ZEROES")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
drivers/block/loop.c | 32 +++++++++++++++++++++++++++++++-
1 file changed, 31 insertions(+), 1 deletion(-)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index f6f77eaa7217..0dc981e94bf0 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -441,6 +441,35 @@ static int lo_discard(struct loop_device *lo, struct request *rq, loff_t pos)
return ret;
}
+static int lo_zeroout(struct loop_device *lo, struct request *rq, loff_t pos)
+{
+ struct file *file = lo->lo_backing_file;
+ int mode = FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE;
+ int ret;
+
+ /* If we're allowed to unmap the blocks, ask the fs to punch them. */
+ if (!(rq->cmd_flags & REQ_NOUNMAP)) {
+ ret = lo_discard(lo, rq, pos);
+ if (!ret)
+ return 0;
+ }
+
+ /*
+ * Otherwise, ask the fs to zero out the blocks, which will result in
+ * space being allocated to the file.
+ */
+ if (!file->f_op->fallocate) {
+ ret = -EOPNOTSUPP;
+ goto out;
+ }
+
+ ret = file->f_op->fallocate(file, mode, pos, blk_rq_bytes(rq));
+ if (unlikely(ret && ret != -EINVAL && ret != -EOPNOTSUPP))
+ ret = -EIO;
+ out:
+ return ret;
+}
+
static int lo_req_flush(struct loop_device *lo, struct request *rq)
{
struct file *file = lo->lo_backing_file;
@@ -597,8 +626,9 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq)
case REQ_OP_FLUSH:
return lo_req_flush(lo, rq);
case REQ_OP_DISCARD:
- case REQ_OP_WRITE_ZEROES:
return lo_discard(lo, rq, pos);
+ case REQ_OP_WRITE_ZEROES:
+ return lo_zeroout(lo, rq, pos);
case REQ_OP_WRITE:
if (lo->transfer)
return lo_write_transfer(lo, rq, pos);
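The error handling at the end of lo_zeroout() in the patch above can be read in isolation: success, -EINVAL and -EOPNOTSUPP are returned unchanged, and every other failure is collapsed to -EIO. A minimal userspace sketch of just that mapping follows; the helper name map_fallocate_error() is hypothetical and does not appear in the patch.

```c
#include <errno.h>

/*
 * Hypothetical helper (not part of the patch): mirrors the return-value
 * filtering done after the ->fallocate() call in lo_zeroout().
 *
 *   0, -EINVAL, -EOPNOTSUPP  -> returned as-is
 *   any other nonzero value  -> -EIO
 */
static int map_fallocate_error(int ret)
{
	if (ret && ret != -EINVAL && ret != -EOPNOTSUPP)
		return -EIO;
	return ret;
}
```

For example, a backing filesystem that runs out of space (-ENOSPC) would be reported to the block layer as a plain I/O error under this mapping.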
* Re: [PATCH] loop: fix no-unmap write-zeroes request behavior
2019-10-10 17:02 [PATCH] loop: fix no-unmap write-zeroes request behavior Darrick J. Wong
@ 2019-10-11 7:51 ` Christoph Hellwig
2019-10-11 16:05 ` [PATCH v2] " Darrick J. Wong
2019-10-14 15:50 ` [PATCH v3] " Darrick J. Wong
2 siblings, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2019-10-11 7:51 UTC (permalink / raw)
To: Darrick J. Wong
Cc: Jens Axboe, Christoph Hellwig, linux-block, linux-fsdevel, xfs
On Thu, Oct 10, 2019 at 10:02:39AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> Currently, if the loop device receives a WRITE_ZEROES request, it asks
> the underlying filesystem to punch out the range. This behavior is
> correct if unmapping is allowed. However, a NOUNMAP request means that
> the caller forbids us from freeing the storage backing the range, so
> punching out the range is incorrect behavior.
It doesn't really forbid, as most protocols don't have a way to forbid
deallocation. It requests not to.
Otherwise this looks fine, although I would have implemented it slightly
differently:
> case REQ_OP_FLUSH:
> return lo_req_flush(lo, rq);
> case REQ_OP_DISCARD:
> - case REQ_OP_WRITE_ZEROES:
> return lo_discard(lo, rq, pos);
> + case REQ_OP_WRITE_ZEROES:
> + return lo_zeroout(lo, rq, pos);
This could just become:
case REQ_OP_WRITE_ZEROES:
if (rq->cmd_flags & REQ_NOUNMAP)
return lo_zeroout(lo, rq, pos);
/*FALLTHRU*/
case REQ_OP_DISCARD:
return lo_discard(lo, rq, pos);
* [PATCH v2] loop: fix no-unmap write-zeroes request behavior
2019-10-10 17:02 [PATCH] loop: fix no-unmap write-zeroes request behavior Darrick J. Wong
2019-10-11 7:51 ` Christoph Hellwig
@ 2019-10-11 16:05 ` Darrick J. Wong
2019-10-14 7:28 ` Christoph Hellwig
2019-10-14 15:50 ` [PATCH v3] " Darrick J. Wong
2 siblings, 1 reply; 8+ messages in thread
From: Darrick J. Wong @ 2019-10-11 16:05 UTC (permalink / raw)
To: Jens Axboe, Christoph Hellwig; +Cc: linux-block, linux-fsdevel, xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
Currently, if the loop device receives a WRITE_ZEROES request, it asks
the underlying filesystem to punch out the range. This behavior is
correct if unmapping is allowed. However, a NOUNMAP request means that
the caller forbids us from freeing the storage backing the range, so
punching out the range is incorrect behavior.
To satisfy a NOUNMAP | WRITE_ZEROES request, loop should ask the
underlying filesystem to FALLOC_FL_ZERO_RANGE, which is (according to
the fallocate documentation) required to ensure that the entire range is
backed by real storage, which suffices for our purposes.
Fixes: 19372e2769179dd ("loop: implement REQ_OP_WRITE_ZEROES")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
v2: reorganize a little according to hch feedback
---
drivers/block/loop.c | 31 ++++++++++++++++++++++++++++++-
1 file changed, 30 insertions(+), 1 deletion(-)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index f6f77eaa7217..4943d0c5c61c 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -441,6 +441,28 @@ static int lo_discard(struct loop_device *lo, struct request *rq, loff_t pos)
return ret;
}
+static int lo_zeroout(struct loop_device *lo, struct request *rq, loff_t pos)
+{
+ struct file *file = lo->lo_backing_file;
+ int mode = FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE;
+ int ret;
+
+ /*
+ * Ask the fs to zero out the blocks, which is supposed to result in
+ * space being allocated to the file.
+ */
+ if (!file->f_op->fallocate) {
+ ret = -EOPNOTSUPP;
+ goto out;
+ }
+
+ ret = file->f_op->fallocate(file, mode, pos, blk_rq_bytes(rq));
+ if (unlikely(ret && ret != -EINVAL && ret != -EOPNOTSUPP))
+ ret = -EIO;
+ out:
+ return ret;
+}
+
static int lo_req_flush(struct loop_device *lo, struct request *rq)
{
struct file *file = lo->lo_backing_file;
@@ -596,8 +618,15 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq)
switch (req_op(rq)) {
case REQ_OP_FLUSH:
return lo_req_flush(lo, rq);
- case REQ_OP_DISCARD:
case REQ_OP_WRITE_ZEROES:
+ /*
+ * If the caller doesn't want deallocation, call zeroout to
+ * write zeroes the range. Otherwise, punch them out.
+ */
+ if (rq->cmd_flags & REQ_NOUNMAP)
+ return lo_zeroout(lo, rq, pos);
+ /* fall through */
+ case REQ_OP_DISCARD:
return lo_discard(lo, rq, pos);
case REQ_OP_WRITE:
if (lo->transfer)
* Re: [PATCH v2] loop: fix no-unmap write-zeroes request behavior
2019-10-11 16:05 ` [PATCH v2] " Darrick J. Wong
@ 2019-10-14 7:28 ` Christoph Hellwig
0 siblings, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2019-10-14 7:28 UTC (permalink / raw)
To: Darrick J. Wong
Cc: Jens Axboe, Christoph Hellwig, linux-block, linux-fsdevel, xfs
While this looks generally good to me, I have another nitpick to avoid
code duplication. What about just renaming lo_discard to lo_fallocate
and pass the mode (possibly minus the FALLOC_FL_KEEP_SIZE flag) to it?
Then in do_req_filebacked we could further simplify it down to:
case REQ_OP_WRITE_ZEROES:
/*
* If the caller doesn't want deallocation, call zeroout to
* write zeroes the range. Otherwise, punch them out.
*/
return lo_fallocate(lo, rq, pos,
(rq->cmd_flags & REQ_NOUNMAP) ?
FALLOC_FL_ZERO_RANGE : FALLOC_FL_PUNCH_HOLE);
case REQ_OP_DISCARD:
return lo_fallocate(lo, rq, pos, FALLOC_FL_PUNCH_HOLE);
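The mode selection suggested above can be sketched as a small pure function, which makes the three cases easy to check by hand. This is a userspace illustration only: the enum sketch_req_op and the function sketch_fallocate_mode() are hypothetical names, and the sketch assumes (as the review proposes) that the shared helper is the one that adds FALLOC_FL_KEEP_SIZE.

```c
#include <linux/falloc.h>	/* FALLOC_FL_* flag definitions */
#include <stdbool.h>

/* Hypothetical stand-ins for REQ_OP_DISCARD and REQ_OP_WRITE_ZEROES. */
enum sketch_req_op {
	SKETCH_OP_DISCARD,
	SKETCH_OP_WRITE_ZEROES,
};

/*
 * Pick the fallocate mode for a request:
 *   - WRITE_ZEROES with NOUNMAP requested -> zero the range, keep it allocated
 *   - WRITE_ZEROES without NOUNMAP        -> punch the range out
 *   - DISCARD                             -> punch the range out
 * FALLOC_FL_KEEP_SIZE is always added so the backing file size is unchanged.
 */
static int sketch_fallocate_mode(enum sketch_req_op op, bool nounmap)
{
	int mode = FALLOC_FL_PUNCH_HOLE;

	if (op == SKETCH_OP_WRITE_ZEROES && nounmap)
		mode = FALLOC_FL_ZERO_RANGE;

	return mode | FALLOC_FL_KEEP_SIZE;
}
```

Keeping FALLOC_FL_KEEP_SIZE inside the helper means no caller can forget it, which is presumably why the suggestion passes the mode "minus" that flag.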
* [PATCH v3] loop: fix no-unmap write-zeroes request behavior
2019-10-10 17:02 [PATCH] loop: fix no-unmap write-zeroes request behavior Darrick J. Wong
2019-10-11 7:51 ` Christoph Hellwig
2019-10-11 16:05 ` [PATCH v2] " Darrick J. Wong
@ 2019-10-14 15:50 ` Darrick J. Wong
2019-10-14 16:39 ` Eric Sandeen
2019-10-15 7:58 ` Christoph Hellwig
2 siblings, 2 replies; 8+ messages in thread
From: Darrick J. Wong @ 2019-10-14 15:50 UTC (permalink / raw)
To: Jens Axboe, Christoph Hellwig; +Cc: linux-block, linux-fsdevel, xfs
From: Darrick J. Wong <darrick.wong@oracle.com>
Currently, if the loop device receives a WRITE_ZEROES request, it asks
the underlying filesystem to punch out the range. This behavior is
correct if unmapping is allowed. However, a NOUNMAP request means that
the caller doesn't want us to free the storage backing the range, so
punching out the range is incorrect behavior.
To satisfy a NOUNMAP | WRITE_ZEROES request, loop should ask the
underlying filesystem to FALLOC_FL_ZERO_RANGE, which is (according to
the fallocate documentation) required to ensure that the entire range is
backed by real storage, which suffices for our purposes.
Fixes: 19372e2769179dd ("loop: implement REQ_OP_WRITE_ZEROES")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
v3: refactor into a single fallocate function
v2: reorganize a little according to hch feedback
---
drivers/block/loop.c | 26 ++++++++++++++++++--------
1 file changed, 18 insertions(+), 8 deletions(-)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index f6f77eaa7217..ef6e251857c8 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -417,18 +417,20 @@ static int lo_read_transfer(struct loop_device *lo, struct request *rq,
return ret;
}
-static int lo_discard(struct loop_device *lo, struct request *rq, loff_t pos)
+static int lo_fallocate(struct loop_device *lo, struct request *rq, loff_t pos,
+ int mode)
{
/*
- * We use punch hole to reclaim the free space used by the
- * image a.k.a. discard. However we do not support discard if
- * encryption is enabled, because it may give an attacker
- * useful information.
+ * We use fallocate to manipulate the space mappings used by the image
+ * a.k.a. discard/zerorange. However we do not support this if
+ * encryption is enabled, because it may give an attacker useful
+ * information.
*/
struct file *file = lo->lo_backing_file;
- int mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE;
int ret;
+ mode |= FALLOC_FL_KEEP_SIZE;
+
if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) {
ret = -EOPNOTSUPP;
goto out;
@@ -596,9 +598,17 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq)
switch (req_op(rq)) {
case REQ_OP_FLUSH:
return lo_req_flush(lo, rq);
- case REQ_OP_DISCARD:
case REQ_OP_WRITE_ZEROES:
- return lo_discard(lo, rq, pos);
+ /*
+ * If the caller doesn't want deallocation, call zeroout to
+ * write zeroes the range. Otherwise, punch them out.
+ */
+ return lo_fallocate(lo, rq, pos,
+ (rq->cmd_flags & REQ_NOUNMAP) ?
+ FALLOC_FL_ZERO_RANGE :
+ FALLOC_FL_PUNCH_HOLE);
+ case REQ_OP_DISCARD:
+ return lo_fallocate(lo, rq, pos, FALLOC_FL_PUNCH_HOLE);
case REQ_OP_WRITE:
if (lo->transfer)
return lo_write_transfer(lo, rq, pos);
* Re: [PATCH v3] loop: fix no-unmap write-zeroes request behavior
2019-10-14 15:50 ` [PATCH v3] " Darrick J. Wong
@ 2019-10-14 16:39 ` Eric Sandeen
2019-10-14 17:00 ` Darrick J. Wong
2019-10-15 7:58 ` Christoph Hellwig
1 sibling, 1 reply; 8+ messages in thread
From: Eric Sandeen @ 2019-10-14 16:39 UTC (permalink / raw)
To: Darrick J. Wong, Jens Axboe, Christoph Hellwig
Cc: linux-block, linux-fsdevel, xfs
On 10/14/19 10:50 AM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> Currently, if the loop device receives a WRITE_ZEROES request, it asks
> the underlying filesystem to punch out the range. This behavior is
> correct if unmapping is allowed. However, a NOUNMAP request means that
> the caller doesn't want us to free the storage backing the range, so
> punching out the range is incorrect behavior.
>
> To satisfy a NOUNMAP | WRITE_ZEROES request, loop should ask the
> underlying filesystem to FALLOC_FL_ZERO_RANGE, which is (according to
> the fallocate documentation) required to ensure that the entire range is
> backed by real storage, which suffices for our purposes.
>
> Fixes: 19372e2769179dd ("loop: implement REQ_OP_WRITE_ZEROES")
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
> v3: refactor into a single fallocate function
> v2: reorganize a little according to hch feedback
> ---
> drivers/block/loop.c | 26 ++++++++++++++++++--------
> 1 file changed, 18 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index f6f77eaa7217..ef6e251857c8 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -417,18 +417,20 @@ static int lo_read_transfer(struct loop_device *lo, struct request *rq,
> return ret;
> }
>
> -static int lo_discard(struct loop_device *lo, struct request *rq, loff_t pos)
> +static int lo_fallocate(struct loop_device *lo, struct request *rq, loff_t pos,
> + int mode)
> {
> /*
> - * We use punch hole to reclaim the free space used by the
> - * image a.k.a. discard. However we do not support discard if
> - * encryption is enabled, because it may give an attacker
> - * useful information.
> + * We use fallocate to manipulate the space mappings used by the image
> + * a.k.a. discard/zerorange. However we do not support this if
> + * encryption is enabled, because it may give an attacker useful
> + * information.
> */
> struct file *file = lo->lo_backing_file;
> - int mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE;
> int ret;
>
> + mode |= FALLOC_FL_KEEP_SIZE;
> +
> if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) {
> ret = -EOPNOTSUPP;
> goto out;
> @@ -596,9 +598,17 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq)
> switch (req_op(rq)) {
> case REQ_OP_FLUSH:
> return lo_req_flush(lo, rq);
> - case REQ_OP_DISCARD:
> case REQ_OP_WRITE_ZEROES:
> - return lo_discard(lo, rq, pos);
cxz ÿbvVBV
> + case REQ_OP_DISCARD:
> + return lo_fallocate(lo, rq, pos, FALLOC_FL_PUNCH_HOLE);
I get lost in the twisty passages. What happens if the filesystem hosting the
backing file doesn't support fallocate, and REQ_OP_DISCARD / REQ_OP_WRITE_ZEROES
returns EOPNOTSUPP - discard is advisory, is it ok to fail REQ_OP_WRITE_ZEROES?
Does something at another layer fall back to writing zeros?
-Eric
> case REQ_OP_WRITE:
> if (lo->transfer)
> return lo_write_transfer(lo, rq, pos);
>
* Re: [PATCH v3] loop: fix no-unmap write-zeroes request behavior
2019-10-14 16:39 ` Eric Sandeen
@ 2019-10-14 17:00 ` Darrick J. Wong
0 siblings, 0 replies; 8+ messages in thread
From: Darrick J. Wong @ 2019-10-14 17:00 UTC (permalink / raw)
To: Eric Sandeen
Cc: Jens Axboe, Christoph Hellwig, linux-block, linux-fsdevel, xfs
On Mon, Oct 14, 2019 at 11:39:43AM -0500, Eric Sandeen wrote:
> On 10/14/19 10:50 AM, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> >
> > Currently, if the loop device receives a WRITE_ZEROES request, it asks
> > the underlying filesystem to punch out the range. This behavior is
> > correct if unmapping is allowed. However, a NOUNMAP request means that
> > the caller doesn't want us to free the storage backing the range, so
> > punching out the range is incorrect behavior.
> >
> > To satisfy a NOUNMAP | WRITE_ZEROES request, loop should ask the
> > underlying filesystem to FALLOC_FL_ZERO_RANGE, which is (according to
> > the fallocate documentation) required to ensure that the entire range is
> > backed by real storage, which suffices for our purposes.
> >
> > Fixes: 19372e2769179dd ("loop: implement REQ_OP_WRITE_ZEROES")
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> > v3: refactor into a single fallocate function
> > v2: reorganize a little according to hch feedback
> > ---
> > drivers/block/loop.c | 26 ++++++++++++++++++--------
> > 1 file changed, 18 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> > index f6f77eaa7217..ef6e251857c8 100644
> > --- a/drivers/block/loop.c
> > +++ b/drivers/block/loop.c
> > @@ -417,18 +417,20 @@ static int lo_read_transfer(struct loop_device *lo, struct request *rq,
> > return ret;
> > }
> > -static int lo_discard(struct loop_device *lo, struct request *rq, loff_t pos)
> > +static int lo_fallocate(struct loop_device *lo, struct request *rq, loff_t pos,
> > + int mode)
> > {
> > /*
> > - * We use punch hole to reclaim the free space used by the
> > - * image a.k.a. discard. However we do not support discard if
> > - * encryption is enabled, because it may give an attacker
> > - * useful information.
> > + * We use fallocate to manipulate the space mappings used by the image
> > + * a.k.a. discard/zerorange. However we do not support this if
> > + * encryption is enabled, because it may give an attacker useful
> > + * information.
> > */
> > struct file *file = lo->lo_backing_file;
> > - int mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE;
> > int ret;
> > + mode |= FALLOC_FL_KEEP_SIZE;
> > +
> > if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) {
> > ret = -EOPNOTSUPP;
> > goto out;
> > @@ -596,9 +598,17 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq)
> > switch (req_op(rq)) {
> > case REQ_OP_FLUSH:
> > return lo_req_flush(lo, rq);
> > - case REQ_OP_DISCARD:
> > case REQ_OP_WRITE_ZEROES:
> > - return lo_discard(lo, rq, pos);
> cxz ÿbvVBV
Yes.
> > + case REQ_OP_DISCARD:
> > + return lo_fallocate(lo, rq, pos, FALLOC_FL_PUNCH_HOLE);
>
> I get lost in the twisty passages. What happens if the filesystem hosting the
> backing file doesn't support fallocate, and REQ_OP_DISCARD / REQ_OP_WRITE_ZEROES
> returns EOPNOTSUPP - discard is advisory, is it ok to fail REQ_OP_WRITE_ZEROES?
> Does something at another layer fall back to writing zeros?
If the REQ_OP_WRITE_ZEROES request was initiated by blkdev_issue_zeroout
and we send back an error code, blkdev_issue_zeroout will fall back to
writing zeroes if BLKDEV_ZERO_NOFALLBACK wasn't set by its caller.
Note that calling FALLOC_FL_ZERO_RANGE on a block device will generate
a REQ_OP_WRITE_ZEROES | REQ_NOUNMAP request, which means that it will
try fallocate zeroing and fall back to writing zeroes.
--D
>
> -Eric
>
> > case REQ_OP_WRITE:
> > if (lo->transfer)
> > return lo_write_transfer(lo, rq, pos);
> >
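The "try the offload, then fall back to writing zeroes" behavior described in the reply above can be illustrated with a userspace analogue. This is only a sketch of the pattern, not the block-layer code: zero_range_with_fallback() and write_zeroes_manually() are hypothetical names, the nofallback parameter merely plays the role that BLKDEV_ZERO_NOFALLBACK plays for blkdev_issue_zeroout, and the example operates on a regular file descriptor rather than a block device.

```c
#define _GNU_SOURCE		/* for fallocate() and FALLOC_FL_* via <fcntl.h> */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: overwrite [pos, pos + len) with explicit zero bytes. */
static int write_zeroes_manually(int fd, off_t pos, off_t len)
{
	char zeros[4096];

	memset(zeros, 0, sizeof(zeros));
	while (len > 0) {
		size_t chunk = len < (off_t)sizeof(zeros) ?
				(size_t)len : sizeof(zeros);
		ssize_t n = pwrite(fd, zeros, chunk, pos);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return -errno;
		}
		if (n == 0)
			return -EIO;		/* no progress, give up */
		pos += n;
		len -= n;
	}
	return 0;
}

/*
 * Try to zero the range with FALLOC_FL_ZERO_RANGE first.  If that fails
 * (for example because the filesystem lacks support) and the caller did
 * not forbid it, fall back to writing zero-filled buffers.
 * Returns 0 on success or a negative errno value.
 */
static int zero_range_with_fallback(int fd, off_t pos, off_t len,
				    bool nofallback)
{
	assert(pos >= 0 && len > 0);

	if (fallocate(fd, FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE,
		      pos, len) == 0)
		return 0;
	if (nofallback)
		return -errno;
	return write_zeroes_manually(fd, pos, len);
}
```

One difference worth keeping in mind: with FALLOC_FL_KEEP_SIZE the fallocate path never extends the file, whereas the manual-write path would if the range reached past end of file, so this sketch is only meant for ranges inside the existing file.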
* Re: [PATCH v3] loop: fix no-unmap write-zeroes request behavior
2019-10-14 15:50 ` [PATCH v3] " Darrick J. Wong
2019-10-14 16:39 ` Eric Sandeen
@ 2019-10-15 7:58 ` Christoph Hellwig
1 sibling, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2019-10-15 7:58 UTC (permalink / raw)
To: Darrick J. Wong
Cc: Jens Axboe, Christoph Hellwig, linux-block, linux-fsdevel, xfs
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>